Deployment Runbook (MT-RAG)¶
Last verified: 2026-02-15 (Europe/Zurich)
This is the complete, stand-alone deployment runbook for:
- DGX Spark running vLLM (OpenAI-compatible)
- Tailscale networking
- Hostinger VPS running Dokploy + Docker Compose (React frontend + FastAPI + Neo4j)
- Neo4j Community (single DB: `graph-fixed-size`) with GraphRAG + communities pipeline
Note: Streamlit was used as a test/prototype UI and can still be run as an optional fallback. For frontend runtime and local workflow details, see Frontend (React).
Secrets policy:

- Do NOT commit real secrets (API keys, tokens, passwords).
- Keep placeholders in docs and store real values in Dokploy secrets/env.
0) Where to run commands (IMPORTANT)¶
Dokploy has two different terminals:
0.1 VPS host terminal (has docker, can modify /srv/...)¶
Use this for:
- any command that starts with `docker ...`
- deleting/creating files under `/srv/mt_upload` and `/srv/mt_data`
- deleting Docker volumes
Open it by SSH from your laptop:
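The original command block is not preserved in this copy. A sketch, assuming the key path from section 1 and root login:

```powershell
# From the Windows laptop; key path and user are assumptions
ssh -i $env:USERPROFILE\.ssh\vps_hostinger root@VPS_TAILSCALE_IP
```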
0.2 Dokploy "Docker Terminal" (inside a container, NO docker command)¶
Use this for:
- running `python -m graphbuild ...` and `python -m communities ...` (inside the api container)
- running `cypher-shell ...` (inside the neo4j container)

In the Docker Terminal you are already inside a container shell, so `docker ...` will not work.
1) Source-of-truth infrastructure values¶
DGX Spark (vLLM host)¶
- Hostname: `VLLM_HOST`
- Tailscale IP: `VLLM_TAILSCALE_IP`
- Tailnet DNS (TLS): `VLLM_TAILNET_DNS`
- vLLM port: `8000`
- vLLM model id: `VLLM_MODEL_ID`
- vLLM container name: `VLLM_CONTAINER`
Hostinger VPS (Dokploy host)¶
- Public IPv4: `VPS_PUBLIC_IP`
- Hostname: `VPS_HOSTNAME`
- Tailscale IP: `VPS_TAILSCALE_IP`
Windows laptop¶
- Device name: `ag`
- Tailscale IP: `LAPTOP_TAILSCALE_IP`
- Spark SSH key: `%USERPROFILE%\.ssh\tailscale_spark`
- VPS SSH key: `%USERPROFILE%\.ssh\vps_hostinger`
2) Persistent config - DGX Spark¶
2.1 Expose vLLM over tailnet (no tunnel)¶
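The original command block for this step is not preserved here. One common pattern (an assumption, not the verified setup) is to publish the vLLM port on the host so the host's Tailscale IP reaches it directly, with no tunnel or reverse proxy:

```bash
# Sketch: serve vLLM's OpenAI-compatible API on port 8000 on all host
# interfaces; Tailscale on the host then makes it reachable at
# http://VLLM_TAILSCALE_IP:8000 from anywhere on the tailnet.
docker run -d --name VLLM_CONTAINER \
  --gpus all -p 8000:8000 \
  vllm/vllm-openai \
  --model VLLM_MODEL_ID --api-key <SECRET_TOKEN>
```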
2.2 Auto-restart vLLM container¶
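The original block is not preserved; a typical way to make an existing container survive daemon restarts and reboots is `docker update`:

```bash
# Ensure the existing vLLM container restarts automatically
docker update --restart unless-stopped VLLM_CONTAINER

# Verify the restart policy took effect
docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' VLLM_CONTAINER
```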
3) Dokploy environment variables + mounts¶
3.1 vLLM connectivity (required)¶
Set these in Dokploy env/secrets for frontend and API services (and Streamlit if fallback is enabled):
- `VLLM_URL=http://VLLM_TAILSCALE_IP:8000/v1/chat/completions`
- `VLLM_MODEL=VLLM_MODEL_ID`
- `VLLM_API_KEY=<SECRET_TOKEN>` (example: `token-local-dev`)
3.2 Data mount (required)¶
The app expects a mounted data root:
CT_DATA_ROOT=/data
Recommended host paths on the VPS:
- Upload staging: `/srv/mt_upload`
- Data root: `/srv/mt_data`
Compose mount (via Dokploy env + compose var):
- Set in Dokploy env: `CT_DATA_HOST=/srv/mt_data`
- The Compose volume line mounts `${CT_DATA_HOST}:/data`
Tip: make the mount read-only if you want extra safety: `/srv/mt_data:/data:ro`
3.3 Neo4j GraphRAG (single DB)¶
Required:
- `NEO4J_URI=bolt://neo4j:7687` (inside the Compose network)
- `NEO4J_USER=neo4j`
- `NEO4J_PASSWORD=<SECRET_PASSWORD>` (used by the API and optional Streamlit containers)
- `NEO4J_DATABASE=graph-fixed-size`
Optional (only if you need the dataset->DB fallback):

- `NEO4J_DATABASE_FIXED_SIZE=graph-fixed-size`
Do NOT use multi-db on the VPS (Community):
- Do not set `NEO4J_DATABASE_SEMANTIC`
- Do not refer to `graph-semantic`
Graph expansion default:
- `COMM_INGEST_TAG=comm_fixed_C1_g1_2` (update this when you change the ingest tag)
3.4 RAG API auth (optional but recommended)¶
RAG_API_KEY=<SECRET_API_KEY>
3.5 Legacy Streamlit auto-pipeline¶
AUTO_PIPELINE_FROM_CHAT=1
3.6 Public domain routing (optional)¶
If you expose the React frontend under a domain (for example crains.souveraen.cloud):
- DNS `A` record points to the VPS IP: `VPS_PUBLIC_IP`
- If you use CAA records, include: `CAA 0 issue "letsencrypt.org"`
CAA 0 issue "letsencrypt.org" - In Dokploy -> mt-rag -> Domains, map the host to the frontend service on container port 3000 and enable Let's Encrypt.
4) Critical Neo4j note (why dump/load is NOT the default)¶
Local store format check result for `graph-fixed-size`: `store = block-block-1.1`
Neo4j Community cannot load a database that uses the block store format. Therefore:
- Default deployment approach: rebuild the graph on the VPS from mounted GOLD using `graphbuild ingest`
- Do NOT plan on `neo4j-admin database dump/load` from local to VPS Community (it will fail)
5) Minimal "it works" tests¶
5.1 Windows -> Spark vLLM over Tailscale¶
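The original block is not preserved; a minimal connectivity check from PowerShell (a sketch, assuming the bearer token matches `VLLM_API_KEY`) could be:

```powershell
# List models over the tailnet; confirms reachability and auth
curl.exe http://VLLM_TAILSCALE_IP:8000/v1/models `
  -H "Authorization: Bearer <SECRET_TOKEN>"
```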
PowerShell chat completion test:
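A sketch of such a test (the model id must match `VLLM_MODEL_ID`; the endpoint and payload follow the standard OpenAI chat-completions shape):

```powershell
# Minimal chat completion request against the vLLM endpoint
$body = @{
    model    = "VLLM_MODEL_ID"
    messages = @(@{ role = "user"; content = "Say OK." })
} | ConvertTo-Json -Depth 5

Invoke-RestMethod -Uri "http://VLLM_TAILSCALE_IP:8000/v1/chat/completions" `
  -Method Post -ContentType "application/json" `
  -Headers @{ Authorization = "Bearer <SECRET_TOKEN>" } `
  -Body $body
```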
5.2 VPS -> Spark vLLM over Tailscale (deployment-critical)¶
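The original block is not preserved; a sketch of the check from the VPS host:

```bash
# Confirm the tailnet route to the Spark, then hit the vLLM API
tailscale ping VLLM_TAILSCALE_IP
curl -s http://VLLM_TAILSCALE_IP:8000/v1/models \
  -H "Authorization: Bearer <SECRET_TOKEN>"
```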
6) Dokploy deploy checklist (high level)¶
- Dokploy app created
- Repo linked
- Compose file: `deploy/docker-compose.yml`
- Host dirs created: `/srv/mt_upload`, `/srv/mt_data`
- Dokploy env set: `CT_DATA_ROOT=/data`, `CT_DATA_HOST=/srv/mt_data`, the Neo4j + vLLM vars above, `AUTO_PIPELINE_FROM_CHAT=1`
- Deploy
- Verify:
    - Frontend responds (React in production; Streamlit only if fallback is enabled)
    - API `/healthz` ok
    - API `/query` returns `{answer, sources}`
    - GraphRAG path works (graph expansion uses Neo4j)
7) START HERE after deploy: VPS/Dokploy GraphDB smoke test (Option C)¶
7.1 Verify /data mount (inside api container)¶
Dokploy -> Docker Terminal -> select the api container:
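The original block is not preserved; a sketch of the mount check:

```bash
# Confirm the bind mount exists and is populated
ls -la /data
df -h /data   # for a bind mount this should show the host filesystem, not overlay
```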
7.2 Verify Neo4j is up and DB exists (inside neo4j container)¶
Dokploy -> Docker Terminal -> select the neo4j container:
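The original block is not preserved; a sketch of the check, run inside the neo4j container:

```bash
# List databases; expect graph-fixed-size with currentStatus "online"
cypher-shell -u neo4j -p "${NEO4J_AUTH#*/}" "SHOW DATABASES;"
```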
Use `${NEO4J_AUTH#*/}` because the neo4j container should NOT receive `NEO4J_PASSWORD` as an env var (strict validation in Neo4j 2025.x).
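The `${NEO4J_AUTH#*/}` expansion simply strips everything up to and including the first `/`, leaving the password half of `NEO4J_AUTH` (`neo4j/<password>`):

```shell
# Demonstrate the parameter expansion with a placeholder value
NEO4J_AUTH="neo4j/example-password"
echo "${NEO4J_AUTH#*/}"   # prints: example-password
```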
7.3 Run ingest (rebuild graph from GOLD) (inside api container)¶
Dokploy -> Docker Terminal -> select the api container:
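The original command block is not preserved, and the exact `graphbuild` flags are project-specific. As a sketch (the flag names below are assumptions; the subcommand is named in this runbook):

```bash
# Rebuild the graph from the mounted GOLD artifacts under /data
# Illustrative flags -- adjust to your graphbuild CLI
python -m graphbuild ingest \
  --dataset fixed_size \
  --tag "$COMM_INGEST_TAG"
```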
7.4 Verify counts (inside neo4j container)¶
Dokploy -> Docker Terminal -> select the neo4j container:
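The original block is not preserved; a sketch of basic count checks:

```bash
# Node and relationship totals for the single DB
cypher-shell -u neo4j -p "${NEO4J_AUTH#*/}" -d graph-fixed-size \
  "MATCH (n) RETURN count(n) AS nodes;"
cypher-shell -u neo4j -p "${NEO4J_AUTH#*/}" -d graph-fixed-size \
  "MATCH ()-[r]->() RETURN count(r) AS rels;"
```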
8) Full graph + communities bootstrapping (recommended after ingest)¶
Run these inside the api container (Dokploy Docker Terminal -> api):
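The original command list is not preserved in this copy. As a sketch, the steps named elsewhere in this runbook (communities, summaries, vector index) would be invoked roughly as follows; the subcommand and flag names are assumptions about the `communities` module:

```bash
# Illustrative only -- subcommands/flags must match your communities CLI
python -m communities build     --level C1 --tag "$COMM_INGEST_TAG"
python -m communities summarize --level C1 --tag "$COMM_INGEST_TAG"
python -m communities index     --tag "$COMM_INGEST_TAG"
```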
Verify (inside neo4j container):
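The original block is not preserved; a sketch of a verification query (the `Community` label is an assumption about this schema):

```bash
# Count community nodes created by the pipeline
cypher-shell -u neo4j -p "${NEO4J_AUTH#*/}" -d graph-fixed-size \
  "MATCH (c:Community) RETURN count(c) AS communities;"
```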
9) OPTIONAL: Dump/load (only if you change edition/format later)¶
Dump/load is NOT recommended for the current setup because the local DB uses block store format.
Only consider dump/load if:
- You switch VPS Neo4j to Enterprise, OR
- You migrate/rebuild the DB into aligned store format first
Also: do NOT copy Neo4j Desktop data directories from Windows to Linux.
10) Recommended compose edits (quick summary)¶
In `deploy/docker-compose.yml`:

- Use `neo4j:2025.08.0` (Community)
- Set the default DB name: `NEO4J_initial_dbms_default__database=${NEO4J_DATABASE:-graph-fixed-size}`
- Set the password only via: `NEO4J_AUTH=neo4j/${NEO4J_PASSWORD}`
- Do NOT pass `NEO4J_PASSWORD` into the neo4j container (strict validation in 2025.x)
- The Neo4j healthcheck should use `NEO4J_AUTH`, e.g. `cypher-shell -u neo4j -p "$${NEO4J_AUTH#*/}" "RETURN 1"`
- Remove semantic DB env vars from the API + any legacy Streamlit service
- Set `PYTHONPATH` for API + Streamlit (if enabled) to `/app/src:/app/src/rag`
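The bullet points above could translate into a service definition like this (a sketch; volume names and healthcheck timing are assumptions, and `$$` escapes `$` for Compose):

```yaml
services:
  neo4j:
    image: neo4j:2025.08.0
    environment:
      NEO4J_AUTH: neo4j/${NEO4J_PASSWORD}
      NEO4J_initial_dbms_default__database: ${NEO4J_DATABASE:-graph-fixed-size}
    volumes:
      - neo4j_data:/data
    healthcheck:
      test: ["CMD-SHELL", "cypher-shell -u neo4j -p \"$${NEO4J_AUTH#*/}\" 'RETURN 1'"]
      interval: 15s
      timeout: 10s
      retries: 10

volumes:
  neo4j_data:
```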
10.1 Domain routing + internal service DNS (important)¶
If you route Streamlit (legacy fallback) via Dokploy/Traefik external network (dokploy-network), ensure Streamlit is connected to both:
- the compose default network (for `api`/`neo4j` hostname resolution)
- the external dokploy network (for domain routing)
Example snippet:
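The original snippet is not preserved; a sketch of the dual-network wiring in Compose:

```yaml
services:
  streamlit:
    networks:
      - default          # resolves api/neo4j service hostnames
      - dokploy-network  # lets Traefik route the public domain here

networks:
  dokploy-network:
    external: true
```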
11) Security notes¶
- Prefer tailnet-only access to vLLM.
- Do not expose Neo4j to the public internet by default.
- Store all secrets in Dokploy secrets/env, not in git.
Appendix - Full Reset + Reupload + Rebuild (wipe EVERYTHING)¶
This appendix is the combined full reset procedure. Use it when you want the VPS dataset and Neo4j graph to be totally empty before reupload/rebuild.
A) Full Reset + Reupload + Rebuild Runbook (VPS + Dokploy)¶
This runbook completely wipes:
- all uploaded dataset artifacts on the VPS (`/srv/mt_upload`, `/srv/mt_data`)
- the entire Neo4j database for this Dokploy app (by deleting the Neo4j Docker volumes)
Then it rebuilds everything (ingest + communities + summaries + vector index).
Destructive: this deletes all Neo4j graph data and all uploaded dataset files for this MT deployment.
A.1 Variables you choose each run¶
- `DATASET=fixed_size`
- `DATE=YYYY-MM-DD` (example: `2026-01-24`)
- `TAG=comm_fixed_<DATE>_C1_g1_2` (example: `comm_fixed_2026_01_24_C1_g1_2`)
- `LEVEL=C1`
A.2 Step 1 - Update Dokploy env vars (before you deploy)¶
In Dokploy -> mt-rag -> Environment update:
- `COMM_INGEST_TAG=<TAG>`
Recommended:
- `UI_DEFAULT_DATASET=fixed_size`
- `UI_DEFAULT_DATE=<DATE>`
- `UI_DEFAULT_INGEST_TAG=<TAG>`
Stable / keep as-is:
- `CT_DATA_HOST=/srv/mt_data`
- `CT_DATA_ROOT=/data`
- `NEO4J_DATABASE=graph-fixed-size`
A.3 Step 2 - Stop the app in Dokploy (recommended)¶
Dokploy -> mt-rag -> General -> Stop
A.4 Step 3 - FULL WIPE of uploaded dataset files on the VPS (HOST via SSH)¶
SSH into the VPS:
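The original command is not preserved; a sketch, assuming the key path from section 1 and root login:

```powershell
# From the Windows laptop
ssh -i $env:USERPROFILE\.ssh\vps_hostinger root@VPS_TAILSCALE_IP
```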
On the VPS:
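The wipe commands are not preserved in this copy; a minimal sketch that empties both directories while keeping them in place:

```bash
# DESTRUCTIVE: empties the upload staging area and the data root
# (hidden dotfiles, if any exist, would need a separate pass)
rm -rf /srv/mt_upload/* /srv/mt_data/*
ls -la /srv/mt_upload /srv/mt_data   # both should now be empty
```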
A.5 Step 4 - FULL WIPE of Neo4j graph database (HOST via SSH)¶
A.5.1 Identify the compose project name¶
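The original block is not preserved; a sketch of how to find the project name:

```bash
# List compose projects known to Docker; the Dokploy app appears here
docker compose ls

# Volumes are prefixed with the project name
docker volume ls --format '{{.Name}}' | grep -i neo4j
```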
The compose project prefix usually matches the Dokploy app/compose name, so volumes appear as `<project>_<volume>`.
A.5.2 Bring the stack down (no volume deletion yet)¶
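The original block is not preserved; Compose v2 can stop a running project by name alone:

```bash
# Stop and remove containers/networks but KEEP volumes (no -v flag)
docker compose -p <PROJECT> down
```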
A.5.3 Delete Neo4j volumes¶
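The original block is not preserved; a sketch (the volume names below are examples, use the actual names from `docker volume ls`):

```bash
# DESTRUCTIVE: removes all Neo4j graph data for this app
docker volume rm <PROJECT>_neo4j_data <PROJECT>_neo4j_logs
```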
A.6 Step 5 - Reupload fresh data (Windows -> VPS)¶
A.6.1 Update scripts/sync_mt.ps1¶
Edit:
- `$Date = "<DATE>"`
A.6.2 Run upload/extract script (Windows)¶
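The original invocation is not preserved; a sketch (any script parameters depend on `sync_mt.ps1` itself):

```powershell
# Run from the repo root on the Windows laptop
powershell -ExecutionPolicy Bypass -File .\scripts\sync_mt.ps1
```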
A.6.3 Verify files exist on VPS (optional)¶
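A sketch of the check over SSH (the listed path layout is an assumption):

```powershell
# Confirm the extracted data landed under the data root
ssh -i $env:USERPROFILE\.ssh\vps_hostinger root@VPS_TAILSCALE_IP "ls -la /srv/mt_data"
```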
A.7 Step 6 - Deploy in Dokploy¶
Dokploy -> mt-rag -> General -> Deploy
A.8 Step 7 - Rebuild the graph (inside API container)¶
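The original block is not preserved; as in section 7.3, a sketch with assumed flag names:

```bash
# Rebuild the graph from the freshly uploaded GOLD artifacts
# Illustrative flags -- adjust to your graphbuild CLI
python -m graphbuild ingest --dataset fixed_size --tag "$COMM_INGEST_TAG"
```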
A.9 Step 8 - Re-run processing (communities + summaries + index)¶
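The original block is not preserved; a sketch of the processing steps (subcommand and flag names are assumptions about the `communities` module):

```bash
# Illustrative only -- subcommands/flags must match your communities CLI
python -m communities build     --level C1 --tag "$COMM_INGEST_TAG"
python -m communities summarize --level C1 --tag "$COMM_INGEST_TAG"
python -m communities index     --tag "$COMM_INGEST_TAG"
```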
A.10 Step 9 - Verify counts (inside Neo4j container)¶
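A sketch of the count check, run inside the neo4j container:

```bash
# Basic sanity check after the rebuild
cypher-shell -u neo4j -p "${NEO4J_AUTH#*/}" -d graph-fixed-size \
  "MATCH (n) RETURN count(n) AS nodes;"
```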
A.11 Step 10 - Quick functional test¶
- Open the frontend domain (for example `https://crains.souveraen.cloud`)
- Run a query
Optional API check (inside api container):
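The original block is not preserved; a sketch of a `/query` call (the local port and auth header name are assumptions):

```bash
# Inside the api container: expect an {answer, sources} JSON response
curl -s http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -H "X-API-Key: <SECRET_API_KEY>" \
  -d '{"question": "What is in the graph?"}'
```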
A.12 Notes¶
- Step A.4 deletes all uploaded files in `/srv/mt_upload` and `/srv/mt_data`.
- Step A.5 deletes the Neo4j volumes, so the graph DB is empty.
If you only need to replace one date folder or only rebuild by tag, skip A.4/A.5 and just:
- upload new data (overwriting only the relevant folders)
- re-run ingest with `--delete-tag`
- re-run the communities pipeline