Chapter 19

Self-Hosted Deployment: Docker Compose, Kubernetes and High Availability

Moving Dify from cloud SaaS to your own infrastructure preserves data sovereignty and puts stability at enterprise scale under your control. This chapter provides deployment blueprints for three scales, from a single node to a Kubernetes cluster.

Chapter Overview

Most teams reach the same inflection point after months on Dify Cloud: data residency requirements, compliance audits demanding raw log access, or throttling under peak load. Private deployment is not just docker compose up — it requires decisions about networking, storage, service orchestration, rolling upgrades, and disaster recovery.

This chapter scales through three tiers:

Single-node Docker Compose — PoC / small teams (< 50 users)
Multi-node with hot standby — mid-size teams (50–500 users)
Kubernetes HA cluster — large enterprises (> 500 users, SLA ≥ 99.9%)
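
As a rough illustration, the tier boundaries above can be encoded in a small helper. This is only a sketch: the function name recommend_tier is hypothetical, and user count is just one input (team skills and SLA matter as well, as Level 4 discusses).

```shell
# Map a user count to the deployment tier suggested by this chapter.
# recommend_tier is a hypothetical helper name, not part of Dify.
recommend_tier() {
  local users=$1
  if [ "$users" -lt 50 ]; then
    echo "docker-compose"      # single node
  elif [ "$users" -le 500 ]; then
    echo "hot-standby"         # multi-node with standby
  else
    echo "kubernetes"          # HA cluster
  fi
}
```

For example, recommend_tier 120 prints hot-standby.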

By the end, you will be able to:

Complete a production-grade private Dify deployment independently
Choose the right architecture for your scale
Configure an Nginx reverse proxy with TLS termination and SSE streaming support
Design failover and zero-downtime upgrade workflows

Level 1: Core Concepts (1–3 Years Experience)

Dify Service Landscape

Before touching a terminal, understand what Dify actually runs:

Service	Role	Exposed Port
`api`	Backend API (Flask)	5001
`worker`	Celery async tasks (document indexing)	none
`web`	Frontend Next.js app	3000
`db`	PostgreSQL	5432
`redis`	Cache + message queue	6379
`weaviate`	Vector database (default)	8080
`sandbox`	Code execution sandbox	8194
`nginx`	Reverse proxy entry point	80/443

Mental model: Think of Dify as a restaurant. web is the front-of-house host, api is the head chef, worker is the dishwasher handling slow tasks, db is the cold storage, redis is the expediting window for fast retrieval, weaviate is the recipe index, and nginx is the front door bouncer.

Single-Node Docker Compose Deployment

Minimum production hardware:

CPU: 4 cores
RAM: 8 GB (16 GB recommended)
Disk: 100 GB SSD+
OS: Ubuntu 22.04 LTS / Debian 12

Step 1: Install Docker

curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker

docker --version        # Docker version 24.x
docker compose version  # Docker Compose version v2.x

Step 2: Clone and configure

git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env

Step 3: Critical environment variables

# .env — must-change items for production

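# Compose does not run $(...) inside .env files. Generate the value in a shell
# first (openssl rand -hex 32) and paste the literal result here.
SECRET_KEY=replace-with-64-hex-characters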

DB_USERNAME=dify
DB_PASSWORD=StrongPasswordHere
DB_HOST=db
DB_PORT=5432
DB_DATABASE=dify

REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=StrongRedisPassword

VECTOR_STORE=weaviate

STORAGE_TYPE=local
STORAGE_LOCAL_PATH=/app/api/storage

CONSOLE_WEB_URL=https://dify.yourcompany.com
APP_WEB_URL=https://dify.yourcompany.com
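
Because .env files are read literally, secrets have to be generated in a shell and then written into the file. The sketch below shows one way to do that. The helper names gen_secret and set_env_var are hypothetical, and it assumes simple KEY=value lines without quoting.

```shell
# Print 64 random hex characters (32 bytes).
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is not installed.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Set KEY=value in an env file: replace the line if the key exists, append otherwise.
set_env_var() {
  local file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file"; then
    # Write through a temp file instead of sed -i, which differs between GNU and BSD.
    awk -v k="$key" -v v="$value" -F= \
      '$1 == k { print k "=" v; next } { print }' "$file" > "$file.tmp" \
      && mv "$file.tmp" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}
```

A typical call would be set_env_var .env SECRET_KEY "$(gen_secret)", repeated for DB_PASSWORD and REDIS_PASSWORD.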

Step 4: Start services

docker compose up -d
docker compose ps
docker compose logs -f api

Step 5: Initialize admin account

After startup, visit http://your-server-ip/install and complete the setup wizard to create the admin account.

Nginx Reverse Proxy Configuration

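# /etc/nginx/sites-available/dify.conf
# Assumes Nginx runs on the host and the web and api containers publish
# ports 3000 and 5001 on 127.0.0.1. Adjust the upstreams if you use the bundled nginx service.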

upstream dify_web {
    server 127.0.0.1:3000;
}

upstream dify_api {
    server 127.0.0.1:5001;
}

server {
    listen 80;
    server_name dify.yourcompany.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name dify.yourcompany.com;

    ssl_certificate     /etc/ssl/certs/dify.crt;
    ssl_certificate_key /etc/ssl/private/dify.key;
    ssl_protocols       TLSv1.2 TLSv1.3;

    location / {
        proxy_pass http://dify_web;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

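    location /api/ {
        proxy_pass http://dify_api;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
        # SSE streaming support: HTTP/1.1 upstream, no buffering
        proxy_http_version 1.1;
        proxy_set_header Connection '';
        proxy_buffering off;
        proxy_cache off;
    }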

    client_max_body_size 100m;
}

Level 2: Mechanism Deep Dive (3–5 Years Experience)

Production-Grade Docker Compose

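# docker-compose.prod.yaml
# Abridged: web, weaviate, sandbox and nginx are omitted here and follow the same pattern.
# No top-level "version" key: Compose v2 ignores it and prints a warning.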

services:
  api:
    image: langgenius/dify-api:0.10.0
    restart: always
    environment:
      MODE: api
      LOG_LEVEL: INFO
      SECRET_KEY: ${SECRET_KEY}
      DB_USERNAME: ${DB_USERNAME}
      DB_PASSWORD: ${DB_PASSWORD}
      DB_HOST: db
      DB_DATABASE: ${DB_DATABASE}
      REDIS_HOST: redis
      REDIS_PASSWORD: ${REDIS_PASSWORD}
      CELERY_BROKER_URL: redis://:${REDIS_PASSWORD}@redis:6379/1
      VECTOR_STORE: weaviate
      WEAVIATE_ENDPOINT: http://weaviate:8080
      STORAGE_TYPE: local
    volumes:
      - dify_storage:/app/api/storage
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '2'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:5001/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "3"

  worker:
    image: langgenius/dify-api:0.10.0
    restart: always
    environment:
      MODE: worker
      LOG_LEVEL: INFO
      SECRET_KEY: ${SECRET_KEY}
      DB_USERNAME: ${DB_USERNAME}
      DB_PASSWORD: ${DB_PASSWORD}
      DB_HOST: db
      DB_DATABASE: ${DB_DATABASE}
      REDIS_HOST: redis
      REDIS_PASSWORD: ${REDIS_PASSWORD}
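      CELERY_BROKER_URL: redis://:${REDIS_PASSWORD}@redis:6379/1
      # The worker indexes documents, so it needs the same vector store and storage settings as api
      VECTOR_STORE: weaviate
      WEAVIATE_ENDPOINT: http://weaviate:8080
      STORAGE_TYPE: local
    volumes:
      - dify_storage:/app/api/storage
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy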
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: '4'

  db:
    image: postgres:15-alpine
    restart: always
    environment:
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_DATABASE}
    command: >
      postgres
      -c shared_buffers=256MB
      -c max_connections=200
      -c work_mem=4MB
      -c effective_cache_size=512MB
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USERNAME}"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    restart: always
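    # Redis also backs the Celery queue here. Under memory pressure, allkeys-lru can
    # evict queued tasks; consider noeviction or a separate broker instance.
    command: >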
      redis-server
      --requirepass ${REDIS_PASSWORD}
      --maxmemory 512mb
      --maxmemory-policy allkeys-lru
      --appendonly yes
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/dify/postgres
  redis_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/dify/redis
  dify_storage:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/dify/storage
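
Volumes declared with type: none and o: bind fail to mount if the device path does not exist, so the host directories have to be created before the first docker compose up. A minimal sketch follows. The helper name prepare_dify_dirs is hypothetical, and the default base path matches the volume definitions above.

```shell
# Create the host directories used by the bind-mounted volumes.
prepare_dify_dirs() {
  local base=${1:-/data/dify} d
  for d in postgres redis storage; do
    mkdir -p "$base/$d" || return 1
  done
}
```

Run it once as a user allowed to write under the base path, for example sudo bash -c after sourcing the function, or simply run the three mkdir -p commands by hand.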

Backup Strategy

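#!/bin/bash
# /opt/dify/backup.sh
set -euo pipefail

# Load DB_USERNAME and DB_DATABASE from the Compose .env file
# (assumes .env sits next to this script and its values need no shell quoting)
set -a
. "$(dirname "$0")/.env"
set +a

BACKUP_DIR="/backup/dify"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

mkdir -p "$BACKUP_DIR"

# PostgreSQL dump (the container name depends on your Compose project name)
docker exec dify-db-1 pg_dump \
  -U "$DB_USERNAME" \
  -d "$DB_DATABASE" \
  --format=custom \
  > "$BACKUP_DIR/postgres_${DATE}.dump"

# Storage files
tar -czf "$BACKUP_DIR/storage_${DATE}.tar.gz" \
  -C /data/dify storage/

# Weaviate data (assumes the weaviate volume is bind-mounted at /data/dify/weaviate)
tar -czf "$BACKUP_DIR/weaviate_${DATE}.tar.gz" \
  -C /data/dify weaviate/

# Cleanup old backups
find "$BACKUP_DIR" -name "*.dump" -mtime +"$RETENTION_DAYS" -delete
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +"$RETENTION_DAYS" -delete

echo "[$(date)] Backup completed"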
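
A backup is only useful if it can be restored, so it is worth rehearsing the reverse path. The sketch below picks the newest dump and feeds it to pg_restore. The function names are hypothetical, the container name dify-db-1 is taken from the backup script, and DB_USERNAME and DB_DATABASE are assumed to be set in the environment.

```shell
# Print the newest PostgreSQL dump in a backup directory (empty if none).
latest_dump() {
  local dir=$1
  # The timestamps in the file names sort lexicographically, so the last one is the newest.
  ls "$dir"/postgres_*.dump 2>/dev/null | sort | tail -n 1
}

# Restore a custom-format dump into the running db container.
# --clean --if-exists drops existing objects before recreating them.
restore_postgres() {
  local dump=$1
  docker exec -i dify-db-1 pg_restore \
    -U "${DB_USERNAME:?}" -d "${DB_DATABASE:?}" \
    --clean --if-exists < "$dump"
}
```

For example: restore_postgres "$(latest_dump /backup/dify)". Stop the api and worker containers first so nothing writes to the database during the restore.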

Zero-Downtime Upgrade Script

#!/bin/bash
# /opt/dify/upgrade.sh
# Usage: ./upgrade.sh <new-version>
set -euo pipefail

NEW_VERSION="${1:?usage: $0 <new-version>}"
COMPOSE="docker compose -f docker-compose.prod.yaml"
cd "$(dirname "$0")"

docker pull "langgenius/dify-api:$NEW_VERSION"
docker pull "langgenius/dify-web:$NEW_VERSION"

./backup.sh

sed -i "s/dify-api:[0-9.]*/dify-api:$NEW_VERSION/g" docker-compose.prod.yaml
sed -i "s/dify-web:[0-9.]*/dify-web:$NEW_VERSION/g" docker-compose.prod.yaml

# Restart order: api → db migration → worker → web
# The migration runs inside the upgraded api container so that it sees the new
# migration files. On a single node, each recreate briefly interrupts that service.
$COMPOSE up -d --no-deps api
sleep 30

$COMPOSE exec -T api flask db upgrade

$COMPOSE up -d --no-deps worker
sleep 30

$COMPOSE up -d --no-deps web
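
A fixed sleep 30 either wastes time or is too short on a slow host. One alternative is to poll the container's health status, sketched below. wait_healthy is a hypothetical helper. It only works for services that define a healthcheck (api, db and redis in the Compose file above), and the container name you pass depends on your Compose project name.

```shell
# Wait until a container reports "healthy", or fail after a timeout in seconds.
wait_healthy() {
  local container=$1 timeout=${2:-120} waited=0 status=""
  while [ "$waited" -lt "$timeout" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    sleep 2
    waited=$((waited + 2))
  done
  echo "timed out waiting for $container (last status: ${status:-unknown})" >&2
  return 1
}
```

In the upgrade script, wait_healthy dify-api-1 180 could replace the sleep after the api step, so that a failed start aborts the upgrade before the migration runs.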

Common Pitfalls

Pitfall 1: Weaviate OOM

Symptom: vector search slows down, container restarts repeatedly.

Fix:

weaviate:
  deploy:
    resources:
      limits:
        memory: 4G
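
Before raising the limit, it helps to confirm that the restarts really are OOM kills. Docker records this in the container state, as the sketch below shows. The helper name was_oom_killed and the container name dify-weaviate-1 are assumptions that follow the naming used elsewhere in this chapter.

```shell
# Succeeds if Docker recorded an out-of-memory kill for the container.
was_oom_killed() {
  [ "$(docker inspect --format '{{.State.OOMKilled}}' "$1" 2>/dev/null)" = "true" ]
}

# Print how many times Docker has restarted the container.
restart_count() {
  docker inspect --format '{{.RestartCount}}' "$1"
}
```

For example: was_oom_killed dify-weaviate-1 && echo "OOM kill after $(restart_count dify-weaviate-1) restarts".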

Pitfall 2: PostgreSQL connection exhaustion

Error: FATAL: remaining connection slots are reserved for non-replication superuser connections

Fix: Add PgBouncer in transaction pooling mode (see Level 3).

Pitfall 3: SSE streaming buffered by Nginx

Symptom: chat responses appear all at once instead of streaming.

Fix:

location /api/ {
    proxy_buffering off;
    proxy_cache off;
    proxy_set_header Connection '';
    proxy_http_version 1.1;
}

Level 3: Source Code and Architecture (5+ Years)

Kubernetes High-Availability Deployment

Cluster layout for a 1,000-person enterprise:

Control plane (3 nodes, HA etcd):
  master-01/02/03: 8C/16G each

Worker nodes:
  app-nodes (3x): 16C/32G — api, web, worker
  db-nodes  (2x): 32C/128G NVMe — PostgreSQL, Redis
  vector-nodes (2x): 32C/64G — Weaviate

Dify API Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dify-api
  namespace: dify-prod
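spec:
  replicas: 3
  selector:
    matchLabels:
      app: dify-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: dify-api
    spec:
      affinity:
        podAntiAffinity:
          # "preferred" rather than "required": with 3 app nodes, a hard rule would block
          # the surge pod during rolling updates and cap the HPA at 3 replicas
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: dify-api
              topologyKey: kubernetes.io/hostname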
      containers:
      - name: api
        image: langgenius/dify-api:0.10.0
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "2000m"
        readinessProbe:
          httpGet:
            path: /health
            port: 5001
          initialDelaySeconds: 30
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 5001
          initialDelaySeconds: 60
          periodSeconds: 30

HPA Auto-scaling:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dify-api-hpa
  namespace: dify-prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: dify-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
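
To reason about how this HPA behaves, recall the documented scaling rule: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to minReplicas and maxReplicas. The sketch below reproduces that arithmetic. The function name is hypothetical, and the controller's tolerance band and stabilization window are deliberately ignored.

```shell
# Compute the replica count the HPA formula would ask for.
# Arguments: current replicas, current CPU utilization (%), target (%), min, max.
hpa_desired_replicas() {
  local current=$1 utilization=$2 target=$3 min=$4 max=$5 desired
  # Integer ceiling of current * utilization / target.
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}
```

With the settings above, hpa_desired_replicas 3 90 70 3 10 prints 4: three pods averaging 90% CPU against a 70% target lead to one additional replica.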

Ingress with SSL and rate limiting:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dify-ingress
  namespace: dify-prod
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
    nginx.ingress.kubernetes.io/limit-rps: "60"
spec:
  tls:
  - hosts:
    - dify.yourcompany.com
    secretName: dify-tls
  rules:
  - host: dify.yourcompany.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: dify-web-svc
            port:
              number: 3000
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: dify-api-svc
            port:
              number: 5001

PgBouncer Connection Pooling

[databases]
dify = host=pg-primary port=5432 dbname=dify

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50
min_pool_size = 10
server_idle_timeout = 600
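
The reason pooling helps is simple arithmetic: every api or worker process keeps its own SQLAlchemy pool, so demand grows with the replica count while PostgreSQL's max_connections stays fixed. The sketch below makes that check explicit. The function names are hypothetical, the per-process pool size of 30 in the example is an assumption (check SQLALCHEMY_POOL_SIZE in your .env), and 3 is PostgreSQL's default superuser_reserved_connections.

```shell
# Worst-case connections opened by N processes with a given pool size each.
pool_demand() {
  echo $(( $1 * $2 ))
}

# Succeeds if the demand fits under max_connections minus the reserved slots.
fits_postgres() {
  local demand=$1 max_connections=$2 reserved=${3:-3}
  [ "$demand" -le $(( max_connections - reserved )) ]
}
```

Ten api pods with a pool of 30 each give pool_demand 10 30 = 300, which does not fit under max_connections=200. Behind PgBouncer, the server side is bounded by default_pool_size (50 here), and fits_postgres 50 200 succeeds.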

Level 4: Production Traps and Decisions (Expert Perspective)

Architecture Decision Framework

Case 1: FinTech company (300 employees)

Chose K8s for "high availability." Result:

Deployment complexity exceeded team capacity (3 hours per deploy)
etcd corruption caused 6-hour outage — no one knew how to recover
Rolled back to Docker Compose with manual standby after 3 months

Lesson: Teams under 300 people without a dedicated K8s engineer should use Docker Compose with scheduled backups and a warm-standby machine.

Case 2: Manufacturing company (800 employees)

Chose K8s but ran PostgreSQL and Weaviate inside the cluster on PVCs. Result:

PVC IOPS insufficient, query p99 exceeded 2 seconds
Node maintenance caused risky database pod migrations

Lesson: Run stateless services (api/web/worker) in K8s, keep databases on dedicated physical machines or managed services (RDS/Managed Weaviate).

Storage Backend Comparison

Option	Pros	Cons	Best For
Local filesystem	Zero config	No multi-replica support	Single-node only
NFS	Simple sharing	Poor performance, SPOF	Small-scale testing
Ceph/GlusterFS	HA, high performance	Operational complexity	Self-managed K8s
AWS S3/Alibaba OSS	Zero ops, high availability	Network latency, cost	Cloud deployments
MinIO	S3-compatible, self-hosted	Needs maintenance	Data residency requirements

MinIO configuration:

STORAGE_TYPE=s3
S3_ENDPOINT=http://minio:9000
S3_BUCKET_NAME=dify-storage
S3_ACCESS_KEY=minio-access-key
S3_SECRET_KEY=minio-secret-key
S3_REGION=us-east-1

Safe Upgrade SOP

1. Run new version on identical staging for ≥ 24 hours
2. Full backup: database + vector store + storage files
3. Schedule maintenance window, display maintenance page
4. Execute upgrade script
5. Smoke test within 10 minutes (cover core user flows)
6. Remove maintenance page
7. Keep old images for 7 days for rollback capability
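
Step 5 is easier to keep within ten minutes if the basic checks are scripted. The following is a minimal sketch rather than a complete test suite. smoke_test is a hypothetical helper, and the paths you pass in are your own choice of core pages and endpoints.

```shell
# Request each path under a base URL and report any that do not return HTTP 200.
smoke_test() {
  local base=$1 path code failed=0
  shift
  for path in "$@"; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$base$path")
    if [ "$code" = "200" ]; then
      echo "ok   $path"
    else
      echo "FAIL $path (HTTP $code)"
      failed=1
    fi
  done
  return "$failed"
}
```

For example: smoke_test https://dify.yourcompany.com / /apps. A non-zero exit status is the signal to keep the maintenance page up and consider a rollback.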

Security Hardening Checklist

# 1. Never expose Docker API on 0.0.0.0
dockerd --host=unix:///var/run/docker.sock

# 2. Network isolation in docker-compose.yaml
networks:
  frontend: {}
  backend:
    internal: true  # No external access

# 3. Drop container privileges
security_opt:
  - no-new-privileges:true
cap_drop:
  - ALL

# 4. Scan images for CVEs
docker scout cves langgenius/dify-api:0.10.0

# 5. Enable audit logging
ENABLE_AUDIT_LOG=true

Chapter Summary

Key takeaways:

Size-appropriate architecture: Docker Compose for < 50 users, hot standby for 50–500, Kubernetes for > 500 (with databases outside K8s).
Never use defaults in production: SECRET_KEY, database passwords, and Redis passwords must be strong, randomly generated values.
Real high availability requires PostgreSQL primary/standby (Patroni), Redis Sentinel, and multiple API replicas with anti-affinity rules.
Shared storage is mandatory in multi-replica deployments — local filesystem causes data inconsistency across pods.
Treat every upgrade as a potential data migration — back up first, test on staging, then roll out to production.
Security is non-negotiable: network isolation, least-privilege containers, regular CVE scanning.

Quick reference commands:

# Check all service status
docker compose ps

# Monitor container resources
docker stats dify-api-1

# Execute shell in container
docker compose exec api bash

# Check PostgreSQL active connections
docker compose exec db psql -U dify -c "SELECT count(*) FROM pg_stat_activity;"

# Force recreate a single service
docker compose up -d --force-recreate --no-deps api

# K8s: watch pod status
kubectl get pods -n dify-prod -w

# K8s: rolling restart
kubectl rollout restart deployment/dify-api -n dify-prod
