Self-Hosted Deployment: Docker Compose, Kubernetes and High Availability
Chapter 19: Private Deployment — Docker Compose / K8s / High-Availability Architecture
Moving Dify from cloud SaaS to your own infrastructure preserves data sovereignty and enables enterprise-scale stability — this chapter delivers production-ready blueprints you can deploy today.
Chapter Overview
Most teams reach the same inflection point after months on Dify Cloud: data residency requirements, compliance audits demanding raw log access, or throttling under peak load. Private deployment is not just docker compose up — it requires decisions about networking, storage, service orchestration, rolling upgrades, and disaster recovery.
This chapter scales through three tiers:
- Single-node Docker Compose — PoC / small teams (< 50 users)
- Multi-node with hot standby — mid-size teams (50–500 users)
- Kubernetes HA cluster — large enterprises (> 500 users, SLA ≥ 99.9%)
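The tier boundaries above can be written down as a small helper, which is handy in provisioning scripts. This is only a sketch: the function name `recommend_tier` and the tier labels it prints are made up for illustration, and user count is just one input alongside SLA and team skills.

```bash
# Map an expected user count to one of the three tiers listed above.
# Boundaries follow the list: < 50, 50-500, > 500.
recommend_tier() {
  users=$1
  if [ "$users" -lt 50 ]; then
    echo "single-node-compose"
  elif [ "$users" -le 500 ]; then
    echo "multi-node-hot-standby"
  else
    echo "kubernetes-ha"
  fi
}
```

For example, `recommend_tier 120` prints `multi-node-hot-standby`.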
By the end, you will be able to:
- Complete a production-grade private Dify deployment independently
- Choose the right architecture for your scale
- Configure Nginx reverse proxy, SSL termination, and session affinity
- Design failover and zero-downtime upgrade workflows
Level 1: Core Concepts (1–3 Years Experience)
Dify Service Landscape
Before touching a terminal, understand what Dify actually runs:
| Service | Role | Exposed Port |
|---|---|---|
| api | Backend API (Flask) | 5001 |
| worker | Celery async tasks (document indexing) | none |
| web | Frontend Next.js app | 3000 |
| db | PostgreSQL | 5432 |
| redis | Cache + message queue | 6379 |
| weaviate | Vector database (default) | 8080 |
| sandbox | Code execution sandbox | 8194 |
| nginx | Reverse proxy entry point | 80/443 |
Mental model: Think of Dify as a restaurant. web is the front-of-house host, api is the head chef, worker is the dishwasher handling slow tasks, db is the cold storage, redis is the expediting window for fast retrieval, weaviate is the recipe index, and nginx is the front door bouncer.
Single-Node Docker Compose Deployment
Minimum production hardware:
- CPU: 4 cores
- RAM: 8 GB (16 GB recommended)
- Disk: 100 GB SSD+
- OS: Ubuntu 22.04 LTS / Debian 12
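A quick preflight check can confirm a host meets these minimums before installing anything. The sketch below assumes a Linux host with `nproc`, `awk`, and `/proc/meminfo`; the function names are illustrative, and the disk check looks only at the root filesystem, which may not be where your data will live.

```bash
# Succeeds when ACTUAL is at least REQUIRED (both whole numbers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Compare this host against the minimums listed above.
# Returns non-zero if any check fails.
preflight() {
  status=0
  cores=$(nproc)
  # MemTotal is reported in kB; round to the nearest GiB because an
  # "8 GB" machine usually reports slightly less.
  mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1048576 + 0.5)}' /proc/meminfo)
  # Total size of the root filesystem in GiB (POSIX df output, 1K blocks).
  disk_gb=$(df -Pk / | awk 'NR==2 {print int($2 / 1048576 + 0.5)}')

  meets_minimum "$cores" 4 || { echo "CPU: $cores cores, need 4"; status=1; }
  meets_minimum "$mem_gb" 8 || { echo "RAM: ${mem_gb} GiB, need 8"; status=1; }
  meets_minimum "$disk_gb" 100 || { echo "Disk: ${disk_gb} GiB, need 100"; status=1; }
  return "$status"
}
```

Run `preflight && echo "host looks adequate"` on the target machine.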
Step 1: Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker
docker --version # Docker version 24.x
docker compose version # Docker Compose version v2.x
Step 2: Clone and configure
git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
Step 3: Critical environment variables
# .env — must-change items for production
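# Generate a value once with: openssl rand -base64 42
# Paste the literal output; .env files do not expand $(...) command substitution.
SECRET_KEY=<paste-generated-value>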
DB_USERNAME=dify
DB_PASSWORD=StrongPasswordHere
DB_HOST=db
DB_PORT=5432
DB_DATABASE=dify
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=StrongRedisPassword
VECTOR_STORE=weaviate
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=/app/api/storage
CONSOLE_WEB_URL=https://dify.yourcompany.com
APP_WEB_URL=https://dify.yourcompany.com
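Since a forgotten sample value is one of the easiest mistakes to ship, a small check before the first start can help. The sketch below is illustrative: the function names are invented, the 32-character minimum for `SECRET_KEY` is an assumed threshold rather than a Dify requirement, and the sample passwords it rejects are the ones shown in the listing above.

```bash
# Print a random secret suitable for SECRET_KEY (requires openssl).
generate_secret() {
  openssl rand -base64 42
}

# Print the value assigned to KEY in FILE (last assignment wins).
env_value() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Succeed only if FILE has a long, literal SECRET_KEY and no sample passwords.
env_is_hardened() {
  file=$1
  secret=$(env_value SECRET_KEY "$file")
  [ "${#secret}" -ge 32 ] || return 1
  case $secret in
    *'$('*|*your-super-secret*) return 1 ;;  # unexpanded substitution or sample text
  esac
  [ "$(env_value DB_PASSWORD "$file")" != "StrongPasswordHere" ] || return 1
  [ "$(env_value REDIS_PASSWORD "$file")" != "StrongRedisPassword" ] || return 1
}
```

A typical use is `env_is_hardened .env || { echo "fix .env first"; exit 1; }` at the top of a deploy script.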
Step 4: Start services
docker compose up -d
docker compose ps
docker compose logs -f api
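Containers reported as "running" by `docker compose ps` may still be initializing, so scripts that continue straight after `up -d` can fail intermittently. One way to handle this is a generic retry helper; the sketch below is an illustration, and both the name `retry_until` and the URL in the usage note are assumptions.

```bash
# Run a command until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# Usage: retry_until MAX DELAY command [args...]
retry_until() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    # Give up once the attempt budget is spent.
    [ "$attempt" -ge "$max" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For instance, `retry_until 30 2 curl -fsS -o /dev/null http://localhost/` waits up to about a minute for the entry point to answer.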
Step 5: Initialize admin account
After startup, visit http://your-server-ip and complete the setup wizard.
Nginx Reverse Proxy Configuration
# /etc/nginx/sites-available/dify.conf
upstream dify_web {
    server 127.0.0.1:3000;
}
upstream dify_api {
    server 127.0.0.1:5001;
}
server {
    listen 80;
    server_name dify.yourcompany.com;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl http2;
    server_name dify.yourcompany.com;
    ssl_certificate /etc/ssl/certs/dify.crt;
    ssl_certificate_key /etc/ssl/private/dify.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    location / {
        proxy_pass http://dify_web;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
    location /api/ {
        proxy_pass http://dify_api;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
        # SSE streaming support
        proxy_http_version 1.1;
        proxy_set_header Connection '';
        proxy_buffering off;
        proxy_cache off;
    }
    client_max_body_size 100m;
}
Level 2: Mechanism Deep Dive (3–5 Years Experience)
Production-Grade Docker Compose
# docker-compose.prod.yaml
version: '3.8'
services:
  api:
    image: langgenius/dify-api:0.10.0
    restart: always
    environment:
      MODE: api
      LOG_LEVEL: INFO
      SECRET_KEY: ${SECRET_KEY}
      DB_USERNAME: ${DB_USERNAME}
      DB_PASSWORD: ${DB_PASSWORD}
      DB_HOST: db
      DB_DATABASE: ${DB_DATABASE}
      REDIS_HOST: redis
      REDIS_PASSWORD: ${REDIS_PASSWORD}
      CELERY_BROKER_URL: redis://:${REDIS_PASSWORD}@redis:6379/1
      VECTOR_STORE: weaviate
      WEAVIATE_ENDPOINT: http://weaviate:8080
      STORAGE_TYPE: local
    volumes:
      - dify_storage:/app/api/storage
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '2'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5001/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "3"
  worker:
    image: langgenius/dify-api:0.10.0
    restart: always
    environment:
      MODE: worker
      LOG_LEVEL: INFO
      SECRET_KEY: ${SECRET_KEY}
      DB_USERNAME: ${DB_USERNAME}
      DB_PASSWORD: ${DB_PASSWORD}
      DB_HOST: db
      DB_DATABASE: ${DB_DATABASE}
      REDIS_HOST: redis
      REDIS_PASSWORD: ${REDIS_PASSWORD}
      CELERY_BROKER_URL: redis://:${REDIS_PASSWORD}@redis:6379/1
    volumes:
      - dify_storage:/app/api/storage
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '4'
  db:
    image: postgres:15-alpine
    restart: always
    environment:
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_DATABASE}
    command: >
      postgres
      -c shared_buffers=256MB
      -c max_connections=200
      -c work_mem=4MB
      -c effective_cache_size=512MB
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USERNAME}"]
      interval: 10s
      timeout: 5s
      retries: 5
  redis:
    image: redis:7-alpine
    restart: always
    command: >
      redis-server
      --requirepass ${REDIS_PASSWORD}
      --maxmemory 512mb
      --maxmemory-policy allkeys-lru
      --appendonly yes
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
volumes:
  postgres_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/dify/postgres
  redis_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/dify/redis
  dify_storage:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/dify/storage
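Bind-style volumes (`type: none`, `o: bind`) do not create the host directory, so the first `up` fails with a mount error if the `device` paths are missing. A minimal sketch for preparing them follows; the function name is illustrative, and the three directory names are taken from the `device` entries above.

```bash
# Create the host directories referenced by the bind-mounted volumes.
# ROOT defaults to /data/dify, matching the compose file above.
prepare_data_dirs() {
  root=${1:-/data/dify}
  for name in postgres redis storage; do
    mkdir -p "$root/$name" || return 1
  done
}
```

Run `sudo sh -c '. ./prepare.sh; prepare_data_dirs'` (or an equivalent) once per host before `docker compose -f docker-compose.prod.yaml up -d`; the file name `prepare.sh` is only an example.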
Backup Strategy
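#!/bin/bash
# /opt/dify/backup.sh
# Abort on the first failed command so a partial backup is never reported as complete.
set -euo pipefail
BACKUP_DIR="/backup/dify"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
mkdir -p "$BACKUP_DIR"
# PostgreSQL dump. Credentials come from the db container's own environment
# (POSTGRES_USER / POSTGRES_DB), so this script does not need to read .env.
docker exec dify-db-1 sh -c \
  'pg_dump -U "$POSTGRES_USER" -d "$POSTGRES_DB" --format=custom' \
  > "$BACKUP_DIR/postgres_${DATE}.dump"
# Storage files
tar -czf "$BACKUP_DIR/storage_${DATE}.tar.gz" \
  -C /data/dify storage/
# Weaviate data (assumes Weaviate's data directory is bind-mounted at /data/dify/weaviate)
tar -czf "$BACKUP_DIR/weaviate_${DATE}.tar.gz" \
  -C /data/dify weaviate/
# Cleanup old backups
find "$BACKUP_DIR" -name "*.dump" -mtime "+$RETENTION_DAYS" -delete
find "$BACKUP_DIR" -name "*.tar.gz" -mtime "+$RETENTION_DAYS" -delete
echo "[$(date)] Backup completed"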
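A backup job that silently stops running is worse than none, so it can be worth checking that a recent, non-empty dump actually exists. The helper below is a sketch with an invented name; it relies on `find -mmin` and `-maxdepth`, which are available in GNU and BSD find but are not part of POSIX.

```bash
# Succeed if DIR contains a non-empty file matching PATTERN
# that was modified within the last MAX_MIN minutes.
has_fresh_backup() {
  dir=$1
  pattern=$2
  max_min=$3
  found=$(find "$dir" -maxdepth 1 -type f -name "$pattern" \
    -size +0 -mmin "-$max_min" | head -n 1)
  [ -n "$found" ]
}
```

With a nightly schedule, a monitoring job could run `has_fresh_backup /backup/dify 'postgres_*.dump' 1500 || echo "backup is stale"`, where 1500 minutes leaves an hour of slack over 24 hours.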
Zero-Downtime Upgrade Script
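#!/bin/bash
# /opt/dify/upgrade.sh
# Usage: ./upgrade.sh <new-version>
set -euo pipefail
NEW_VERSION="${1:?usage: upgrade.sh <new-version>}"
docker pull "langgenius/dify-api:$NEW_VERSION"
docker pull "langgenius/dify-web:$NEW_VERSION"
./backup.sh
sed -i "s/dify-api:[0-9.]*/dify-api:$NEW_VERSION/g" docker-compose.prod.yaml
sed -i "s/dify-web:[0-9.]*/dify-web:$NEW_VERSION/g" docker-compose.prod.yaml
# Staged restart: api → db migration → worker → web.
# The migration runs in the recreated api container so that it uses the new
# image's migration scripts; running it before the api is recreated would
# execute the old code. With a single replica per service, each recreate
# still causes a brief interruption.
docker compose -f docker-compose.prod.yaml up -d --no-deps api
sleep 30
docker compose -f docker-compose.prod.yaml exec -T api flask db upgrade
docker compose -f docker-compose.prod.yaml up -d --no-deps worker
sleep 30
docker compose -f docker-compose.prod.yaml up -d --no-deps web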
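Because the `sed` pattern in the script only matches digits and dots, a tag such as `latest` or one with a suffix would leave the compose file in an inconsistent state. A small guard can reject such input before anything is pulled. This is a sketch; the function name is illustrative and the strict three-part format is an assumption about the tags you deploy.

```bash
# Succeed only for plain MAJOR.MINOR.PATCH version strings such as 0.10.0.
is_valid_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}
```

At the top of an upgrade script this could be used as `is_valid_version "$1" || { echo "expected a version like 0.10.0"; exit 1; }`.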
Common Pitfalls
Pitfall 1: Weaviate OOM
Symptom: vector search slows down, container restarts repeatedly.
Fix:
weaviate:
  deploy:
    resources:
      limits:
        memory: 4G
Pitfall 2: PostgreSQL connection exhaustion
Error: FATAL: remaining connection slots are reserved for non-replication superuser connections
Fix: Add PgBouncer in transaction pooling mode (see Level 3).
Pitfall 3: SSE streaming buffered by Nginx
Symptom: chat responses appear all at once instead of streaming.
Fix:
location /api/ {
    proxy_buffering off;
    proxy_cache off;
    proxy_set_header Connection '';
    proxy_http_version 1.1;
}
Level 3: Source Code and Architecture (5+ Years)
Kubernetes High-Availability Deployment
Cluster layout for a 1,000-person enterprise:
Control plane (3 nodes, HA etcd):
master-01/02/03: 8C/16G each
Worker nodes:
app-nodes (3x): 16C/32G — api, web, worker
db-nodes (2x): 32C/128G NVMe — PostgreSQL, Redis
vector-nodes (2x): 32C/64G — Weaviate
Dify API Deployment:
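apiVersion: apps/v1
kind: Deployment
metadata:
  name: dify-api
  namespace: dify-prod
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels.
  selector:
    matchLabels:
      app: dify-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: dify-api
    spec:
      affinity:
        podAntiAffinity:
          # Preferred rather than required: a hard rule allows only one pod per
          # node, which leaves the surge pod unschedulable during a rollout and
          # blocks HPA scale-out on the three app nodes.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: dify-api
                topologyKey: kubernetes.io/hostname
      containers:
        - name: api
          image: langgenius/dify-api:0.10.0
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
          readinessProbe:
            httpGet:
              path: /health
              port: 5001
            initialDelaySeconds: 30
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 5001
            initialDelaySeconds: 60
            periodSeconds: 30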
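When pod anti-affinity on `kubernetes.io/hostname` is enforced as a hard requirement, each node can hold at most one `dify-api` pod, so a rollout with `maxUnavailable: 0` needs a spare node for every surge pod. The arithmetic is sketched below with illustrative function names; it assumes every counted node is schedulable and has room for the pod's resource requests.

```bash
# Nodes needed so a rolling update can place its surge pods when at most
# one pod per node is allowed and no old pod may be removed first.
min_nodes_for_rollout() {
  replicas=$1
  max_surge=$2
  echo $((replicas + max_surge))
}

# Succeed if NODES is enough for REPLICAS with MAX_SURGE.
rollout_can_progress() {
  [ "$1" -ge "$(min_nodes_for_rollout "$2" "$3")" ]
}
```

With three app nodes, three replicas, and `maxSurge: 1`, `rollout_can_progress 3 3 1` fails: the new pod stays Pending and the rollout stalls until a fourth node is added or the rule is relaxed.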
HPA Auto-scaling:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dify-api-hpa
  namespace: dify-prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: dify-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
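To reason about how this autoscaler reacts, it helps to apply the scaling rule from the Kubernetes documentation by hand: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the min/max bounds. The function below is a simplified sketch with an invented name; it ignores the controller's tolerance band and the scale-down stabilization window configured above.

```bash
# Estimate the replica count an HPA would request.
# Args: CURRENT_REPLICAS CURRENT_UTIL_PERCENT TARGET_UTIL_PERCENT MIN MAX
hpa_desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling of current * util / target.
  desired=$(( (current * util + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}
```

With the values above, three replicas averaging 140% of their CPU request against a 70% target gives `hpa_desired_replicas 3 140 70 3 10`, which prints 6.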
Ingress with SSL and rate limiting:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dify-ingress
  namespace: dify-prod
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
    nginx.ingress.kubernetes.io/limit-rps: "60"
spec:
  tls:
    - hosts:
        - dify.yourcompany.com
      secretName: dify-tls
  rules:
    - host: dify.yourcompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dify-web-svc
                port:
                  number: 3000
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: dify-api-svc
                port:
                  number: 5001
PgBouncer Connection Pooling
[databases]
dify = host=pg-primary port=5432 dbname=dify
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50
min_pool_size = 10
server_idle_timeout = 600
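In PgBouncer, `default_pool_size` applies per database/user pair, so the number of real server connections grows with the number of pairs and must stay below PostgreSQL's `max_connections`. A rough budget check is sketched below; the function names are illustrative, and the reserved count assumes PostgreSQL's default `superuser_reserved_connections` of 3.

```bash
# Upper bound on server connections PgBouncer may open:
# one pool of POOL_SIZE per database/user pair.
pgbouncer_server_conns() {
  pools=$1
  pool_size=$2
  echo $((pools * pool_size))
}

# Succeed if the pools fit inside PostgreSQL's connection limit.
# Args: POOLS POOL_SIZE PG_MAX_CONNECTIONS RESERVED
pool_fits() {
  [ "$(pgbouncer_server_conns "$1" "$2")" -le $(( $3 - $4 )) ]
}
```

With one pair and `default_pool_size = 50`, `pool_fits 1 50 200 3` succeeds against a server with `max_connections=200`; four pairs would not fit. The application also has to connect to PgBouncer's `listen_port` (6432 here) instead of 5432 for the pool to take effect.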
Level 4: Production Traps and Decisions (Expert Perspective)
Architecture Decision Framework
Case 1: FinTech company (300 employees)
Chose K8s for "high availability." Result:
- Deployment complexity exceeded team capacity (3 hours per deploy)
- etcd corruption caused 6-hour outage — no one knew how to recover
- Rolled back to Docker Compose with manual standby after 3 months
Lesson: Teams under 300 people without a dedicated K8s engineer should use Docker Compose with scheduled backups and a warm-standby machine.
Case 2: Manufacturing company (800 employees)
Chose K8s but ran PostgreSQL and Weaviate inside the cluster on PVCs. Result:
- PVC IOPS insufficient, query p99 exceeded 2 seconds
- Node maintenance caused risky database pod migrations
Lesson: Run stateless services (api/web/worker) in K8s, keep databases on dedicated physical machines or managed services (RDS/Managed Weaviate).
Storage Backend Comparison
| Option | Pros | Cons | Best For |
|---|---|---|---|
| Local filesystem | Zero config | No multi-replica support | Single-node only |
| NFS | Simple sharing | Poor performance, SPOF | Small-scale testing |
| Ceph/GlusterFS | HA, high performance | Operational complexity | Self-managed K8s |
| AWS S3/Alibaba OSS | Zero ops, high availability | Network latency, cost | Cloud deployments |
| MinIO | S3-compatible, self-hosted | Needs maintenance | Data residency requirements |
MinIO configuration:
STORAGE_TYPE=s3
S3_ENDPOINT=http://minio:9000
S3_BUCKET_NAME=dify-storage
S3_ACCESS_KEY=minio-access-key
S3_SECRET_KEY=minio-secret-key
S3_REGION=us-east-1
Safe Upgrade SOP
1. Run new version on identical staging for ≥ 24 hours
2. Full backup: database + vector store + storage files
3. Schedule maintenance window, display maintenance page
4. Execute upgrade script
5. Smoke test within 10 minutes (cover core user flows)
6. Remove maintenance page
7. Keep old images for 7 days for rollback capability
Security Hardening Checklist
# 1. Never expose Docker API on 0.0.0.0
dockerd --host=unix:///var/run/docker.sock
# 2. Network isolation in docker-compose.yaml
networks:
  frontend: {}
  backend:
    internal: true # No external access
# 3. Drop container privileges
security_opt:
  - no-new-privileges:true
cap_drop:
  - ALL
# 4. Scan images for CVEs
docker scout cves langgenius/dify-api:0.10.0
# 5. Enable audit logging
ENABLE_AUDIT_LOG=true
Chapter Summary
Key takeaways:
- Size-appropriate architecture: Docker Compose for < 50 users, hot standby for 50–500, Kubernetes for > 500 (with databases outside K8s).
- Never use defaults in production: SECRET_KEY, database passwords, and Redis passwords must be strong, randomly generated values.
- Real high availability requires PostgreSQL primary/standby (Patroni), Redis Sentinel, and multiple API replicas with anti-affinity rules.
- Shared storage is mandatory in multi-replica deployments: a local filesystem causes data inconsistency across pods.
- Treat every upgrade as a potential data migration: back up first, test on staging, then roll out to production.
- Security is non-negotiable: network isolation, least-privilege containers, regular CVE scanning.
Quick reference commands:
# Check all service status
docker compose ps
# Monitor container resources
docker stats dify-api-1
# Execute shell in container
docker compose exec api bash
# Check PostgreSQL active connections
docker compose exec db psql -U dify -c "SELECT count(*) FROM pg_stat_activity;"
# Force recreate a single service
docker compose up -d --force-recreate --no-deps api
# K8s: watch pod status
kubectl get pods -n dify-prod -w
# K8s: rolling restart
kubectl rollout restart deployment/dify-api -n dify-prod