Chapter 20: Multi-Platform Gateway Architecture (CLI/Telegram/Discord/Slack)

Hermes Agent is more than a command-line tool: through its gateway architecture, the same Agent core connects to multiple user-facing platforms. Whether it is the CLI preferred by engineers, the Slack workspace used for team collaboration, or Telegram, where automation bots live, Hermes can be configured and deployed in a unified manner. This chapter covers the architectural differences between platforms, compares their token overhead, and offers guidance on choosing a platform.

20.1 Gateway Architecture Overview

Hermes multi-platform support is built on a layered "core + gateway adapter" architecture:

┌──────────────────────────────────────────────────────────────┐
│                    Hermes Agent Core                          │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────────┐ │
│  │LLM Reason│  │Tool Exec │  │  Memory  │  │Skill Orch.  │ │
│  └──────────┘  └──────────┘  └──────────┘  └─────────────┘ │
└──────────────────────────┬───────────────────────────────────┘
                           │ Agent Core API
        ┌──────────────────┼──────────────────┐
        │                  │                  │
   ┌────▼────┐        ┌────▼────┐       ┌────▼────┐
   │CLI Gate │        │Telegram │       │Discord  │
   │         │        │Gateway  │       │Gateway  │
   └─────────┘        └─────────┘       └─────────┘
        │                  │                  │
   ┌────▼────┐        ┌────▼────┐       ┌────▼────┐
   │Terminal │        │TG Users │       │DC Users │
   │  User   │        │         │       │         │
   └─────────┘        └─────────┘       └─────────┘

Each gateway adapter is responsible for:

Listening to platform-specific message events
Converting platform message formats into Hermes's unified message format
Converting Agent responses back to platform format (handling character limits, formatting, etc.)
Managing session state (user ID → session ID mapping)
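
The four responsibilities above can be sketched as a small adapter interface. This is an illustrative sketch only: the names `UnifiedMessage`, `GatewayAdapter`, `to_unified`, `to_platform`, and `session_for` are hypothetical and are not taken from the Hermes codebase, and the Telegram event is reduced to the `from.id` and `text` fields of a message.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
import uuid


@dataclass
class UnifiedMessage:
    """Platform-neutral message handed to the Agent core (hypothetical shape)."""
    session_id: str
    user_id: str
    text: str
    platform: str


class GatewayAdapter(ABC):
    """Hypothetical base class covering the adapter responsibilities."""

    platform = "generic"
    max_message_length = 4096

    def __init__(self):
        self._sessions = {}  # user ID -> session ID

    def session_for(self, user_id: str) -> str:
        # Create a session on first contact and reuse it afterwards.
        if user_id not in self._sessions:
            self._sessions[user_id] = f"{self.platform}-{uuid.uuid4().hex[:8]}"
        return self._sessions[user_id]

    @abstractmethod
    def to_unified(self, raw_event: dict) -> UnifiedMessage:
        """Convert a platform event into the unified message format."""

    def to_platform(self, reply: str) -> list[str]:
        # Respect the platform character limit with a plain hard split.
        n = self.max_message_length
        return [reply[i:i + n] for i in range(0, len(reply), n)] or [""]


class TelegramAdapter(GatewayAdapter):
    platform = "telegram"
    max_message_length = 4096

    def to_unified(self, raw_event: dict) -> UnifiedMessage:
        user_id = str(raw_event["from"]["id"])
        return UnifiedMessage(
            session_id=self.session_for(user_id),
            user_id=user_id,
            text=raw_event.get("text", ""),
            platform=self.platform,
        )


adapter = TelegramAdapter()
message = adapter.to_unified({"from": {"id": 123456789}, "text": "hello"})
print(message)
```

Keeping the session map inside the adapter means the Agent core only ever sees a session ID, never a platform-specific user identifier.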

20.2 CLI Mode Deep Dive

CLI is Hermes's most lightweight operating mode, enabling direct terminal interaction with the Agent.

Launch Options

# Interactive chat mode
hermes chat

# Single-turn mode (suitable for scripting)
hermes run "Analyze the most common error types in ./logs/error.log"

# With custom configuration
hermes chat --config ./hermes_prod.yaml

# With system prompt
hermes chat --system "You are a professional DevOps engineer specializing in Kubernetes operations"

# Load workspace (auto-discover local Skills)
hermes chat --workspace ./my_project/

# Debug mode (show tool call details and token statistics)
hermes chat --debug

# Specify model
hermes chat --model nous-hermes-2-mixtral-8x7b-dpo

Session Management

# List past sessions
hermes session list

# Resume a historical session
hermes session resume session_20240115_103022

# Export session record
hermes session export session_abc123 --format json > session.json

# Delete a session
hermes session delete session_abc123

CLI Mode Token Profile

CLI mode has the most streamlined prompt structure. Typical context consumption:

System prompt:        ~800 tokens  (capability description + tool list)
User message:         ~50-200 tokens
Conversation history: ~0-2000 tokens (depends on configured memory length)
Tool call results:    ~0-1000 tokens (proportional to tool calls made)
─────────────────────────────────────────────────────────────
Typical total range:  850 - 4,000 tokens/turn
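
As a quick check, the typical total can be reproduced by summing the low and high ends of each component. The dictionary below simply restates the figures listed above; the names `CLI_PROFILE` and `total_range` are made up for this example.

```python
# Per-turn ranges copied from the CLI token profile above: (low, high).
CLI_PROFILE = {
    "system_prompt": (800, 800),
    "user_message": (50, 200),
    "conversation_history": (0, 2000),
    "tool_call_results": (0, 1000),
}


def total_range(profile):
    """Sum the low ends and the high ends of every component."""
    low = sum(lo for lo, _ in profile.values())
    high = sum(hi for _, hi in profile.values())
    return low, high


cli_low, cli_high = total_range(CLI_PROFILE)
print(f"Typical total range: {cli_low} - {cli_high} tokens/turn")
```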

20.3 Gateway Mode: Token Overhead Analysis

Gateway modes (Telegram/Discord/Slack) must handle platform metadata, permission management, message routing, and other additional logic — resulting in significantly higher token overhead than CLI mode.

Token Overhead Comparison

Component	CLI Mode	Gateway Mode	Difference
Base system prompt	~800 tokens	~800 tokens	Same
Platform context injection	0 tokens	~200-400 tokens	User identity, channel, permission info
Message routing instructions	0 tokens	~100-200 tokens	Multi-user message dispatch logic
Concurrent session management	0 tokens	~150-300 tokens	Multi-user session state
Format constraints	0 tokens	~50-150 tokens	Platform character limit notices
Total extra overhead	0	500-1,050 tokens	About 0.6-1.3x the ~800-token base system prompt

Measured data (GPT-4 Turbo 128K, 20-turn conversation scenario):

CLI mode average context:      3,200 tokens
Telegram gateway average:      7,800 tokens  (+144%)
Discord gateway average:       8,100 tokens  (+153%)
Slack gateway average:         8,900 tokens  (+178%)
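
The percentages in parentheses are relative increases over the CLI average. The short sketch below recomputes them from the measured figures; `percent_increase` is a helper written for this example.

```python
# Average context sizes from the measurement above, in tokens.
CLI_AVG = 3200
GATEWAY_AVG = {"telegram": 7800, "discord": 8100, "slack": 8900}


def percent_increase(baseline, value):
    """Relative increase of value over baseline, in percent."""
    return (value - baseline) / baseline * 100


increase = {
    name: round(percent_increase(CLI_AVG, avg))
    for name, avg in GATEWAY_AVG.items()
}
for name, pct in increase.items():
    print(f"{name}: +{pct}% ({GATEWAY_AVG[name] / CLI_AVG:.2f}x CLI)")
```

Expressed as a multiple, the gateways average roughly 2.4x to 2.8x the CLI context in this scenario.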

Why Gateway Mode Costs More

# Example of additional system context injected in gateway mode
GATEWAY_SYSTEM_INJECTION = """
## Current Runtime Environment
- Platform: Telegram
- Channel type: group_chat
- Channel ID: -1001234567890
- Channel name: AI Research Team

## Current User Information
- User ID: 123456789
- Username: @alice_researcher
- Display name: Alice
- Role permissions: member (non-admin)
- Language preference: en-US
- Timezone: America/New_York

## Message Format Constraints
- Telegram max single message: 4,096 characters
- Responses exceeding limit will be auto-split
- Markdown V2 format is supported
- Inline images not supported (use URL links instead)

## Multi-User Session Isolation
- Active users: 3 (Alice/Bob/Charlie)
- This message is from Alice (@alice_researcher)
- Other users' conversations are in isolated contexts
"""
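
To make the cost of this injection visible, one option is to build the block from structured fields and estimate its size before sending. The sketch below is an assumption about how such a helper could look, not the Hermes implementation: `render_injection` and `rough_token_estimate` are hypothetical names, and the 4-characters-per-token ratio is only a crude heuristic for English text.

```python
def render_injection(platform, channel, user, constraints):
    """Render gateway context as Markdown-style sections (hypothetical helper)."""
    sections = [
        ("Current Runtime Environment", {"Platform": platform, **channel}),
        ("Current User Information", user),
        ("Message Format Constraints", constraints),
    ]
    lines = []
    for title, fields in sections:
        lines.append(f"## {title}")
        lines.extend(f"- {key}: {value}" for key, value in fields.items())
        lines.append("")
    return "\n".join(lines).strip()


def rough_token_estimate(text):
    # Crude heuristic: about 4 characters per token for English text.
    # Use the target model's real tokenizer for actual accounting.
    return len(text) // 4


block = render_injection(
    platform="Telegram",
    channel={"Channel type": "group_chat", "Channel name": "AI Research Team"},
    user={"Username": "@alice_researcher", "Role permissions": "member (non-admin)"},
    constraints={"Telegram max single message": "4,096 characters"},
)
print(block)
print(rough_token_estimate(block), "tokens (rough estimate)")
```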

20.4 Platform Gateway Deep Dives

20.4.1 Telegram Gateway

Configuration:

gateways:
  telegram:
    enabled: true
    bot_token: "${TELEGRAM_BOT_TOKEN}"
    
    allowed_users: []
    allowed_groups: [-1001234567890]
    admin_users: [123456789]
    
    max_message_length: 4096
    auto_split_long_messages: true
    typing_indicator: true
    
    handle_photos: true
    handle_documents: true
    handle_voice: true
    
    enable_inline_buttons: true
    approval_timeout_seconds: 300

Unique capabilities:

Inline keyboard buttons (InlineKeyboardMarkup) for Agent action approvals
Automatic voice message transcription
Bot commands (/start, /help, /status)
Channel post publishing mode

hermes gateway telegram start
# Background mode
hermes gateway telegram start --daemon
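
The `max_message_length` and `auto_split_long_messages` settings imply that long replies are broken into several messages. How Hermes performs the split is not specified here; the following is a minimal sketch of one reasonable approach, which prefers to cut at a line break and falls back to a hard cut. The function name `split_message` is hypothetical, and the sketch ignores MarkdownV2 entities, which a real splitter would need to avoid cutting in half.

```python
def split_message(text: str, limit: int = 4096) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring line breaks."""
    chunks = []
    remaining = text
    while len(remaining) > limit:
        # Look for the last newline inside the window; fall back to a hard cut.
        cut = remaining.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        chunks.append(remaining[:cut])
        # Drop the newline(s) we split on so the next chunk starts with text.
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        chunks.append(remaining)
    return chunks


reply = "a" * 4000 + "\n" + "b" * 4000
for part in split_message(reply):
    print(len(part))
```

The same function could serve Discord by passing `limit=2000`, matching that gateway's `max_message_length`.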

20.4.2 Discord Gateway

Configuration:

gateways:
  discord:
    enabled: true
    bot_token: "${DISCORD_BOT_TOKEN}"
    application_id: "${DISCORD_APP_ID}"
    
    register_slash_commands: true
    slash_commands:
      - name: "ask"
        description: "Ask Hermes Agent a question"
      - name: "research"
        description: "Run a deep research task"
    
    allowed_channel_ids: [123456789, 987654321]
    
    max_message_length: 2000
    use_embeds: true
    max_embed_fields: 25
    
    create_thread_for_long_tasks: true
    thread_auto_archive_minutes: 60

Unique capabilities:

Slash commands (/ask, /research, etc.)
Rich embed cards (fields, thumbnails, color coding)
Automatic thread creation for long-running task isolation
Discord Webhooks for proactive push notifications

20.4.3 Slack Gateway

Configuration:

gateways:
  slack:
    enabled: true
    bot_token: "${SLACK_BOT_TOKEN}"
    app_token: "${SLACK_APP_TOKEN}"    # Required for Socket Mode
    
    event_subscriptions:
      - app_mention
      - message.im
    
    use_block_kit: true
    enable_workflow_steps: true
    
    default_channel: "#hermes-bot"
    allowed_channels: ["#ai-team", "#dev-tools"]

Unique capabilities:

Block Kit rich message formatting
Workflow Builder integration (no-code automation)
Slack Actions (in-message button interactions)
App Home tab (personal dashboard)
Thread replies for clean conversation organization

20.4.4 WhatsApp Special Restrictions

WhatsApp gateway carries critical restrictions:

Restriction	Details
Dedicated phone number	Must register a separate WhatsApp Business number — personal numbers cannot be reused
Session window	After a user initiates contact, bot has a 24-hour free-reply window
Template messages	Outside the 24h window, only pre-approved template messages can be sent
API pricing	WhatsApp Business API charges per conversation
Format limits	No full Markdown support; only WhatsApp's limited inline formatting (bold, italic, strikethrough, monospace) is available

gateways:
  whatsapp:
    enabled: true
    provider: "360dialog"
    api_key: "${WA_API_KEY}"
    phone_number_id: "${WA_PHONE_ID}"
    business_phone: "+1234567890"
    templates:
      task_complete: "wa_template_task_done_v1"
      error_notify: "wa_template_error_v2"
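
The session-window rule translates into a simple decision before every outbound message: reply freely inside the 24-hour window, otherwise fall back to a pre-approved template. The sketch below illustrates that decision under stated assumptions; `choose_message_kind` and its return values are hypothetical, and a real gateway would also need to track the timestamp of each user's most recent inbound message.

```python
from datetime import datetime, timedelta, timezone

SESSION_WINDOW = timedelta(hours=24)


def choose_message_kind(last_user_message_at, now=None):
    """Return "free_form" inside the 24h window, otherwise "template"."""
    now = now or datetime.now(timezone.utc)
    if last_user_message_at is not None and now - last_user_message_at < SESSION_WINDOW:
        return "free_form"
    return "template"


now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
print(choose_message_kind(now - timedelta(hours=2), now))   # inside the window
print(choose_message_kind(now - timedelta(hours=30), now))  # window expired
print(choose_message_kind(None, now))                       # user never wrote
```

When the result is `"template"`, the gateway would send one of the configured templates, such as `task_complete`, instead of the Agent's free-text reply.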

20.5 Unified Multi-Platform Configuration

Hermes supports layered configuration to maintain consistent Agent behavior across platforms:

# hermes_config.yaml — Full multi-platform example

# Global config (shared by all platforms)
global:
  model: "nous-hermes-2-mixtral-8x7b-dpo"
  temperature: 0.7
  max_tokens: 4096
  system_prompt: |
    You are the YiteAI assistant, focused on helping users solve technical problems.
    Always respond in the user's language.

memory:
  provider: "redis"
  redis_url: "${REDIS_URL}"
  session_ttl_hours: 24
  max_history_turns: 20

tools:
  enabled_categories: [system, network, file, code]
  permission_profile: "readonly_research"

# Platform-specific config (overrides global settings)
gateways:
  telegram:
    enabled: true
    bot_token: "${TELEGRAM_BOT_TOKEN}"
    model_overrides:
      temperature: 0.5    # Telegram: more conservative, stricter answers
    
  discord:
    enabled: true
    bot_token: "${DISCORD_BOT_TOKEN}"
    application_id: "${DISCORD_APP_ID}"
    permission_profile_override: "full_developer"   # Discord: broader tool access
    
  slack:
    enabled: true
    bot_token: "${SLACK_BOT_TOKEN}"
    app_token: "${SLACK_APP_TOKEN}"
    extra_tools: ["jira_create_ticket", "confluence_search"]  # Enterprise integration
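
One way to read the layering is as a merge: start from the `global` block and apply the platform's `model_overrides` on top. The exact resolution logic inside Hermes is not shown in this chapter, so the sketch below is an assumption; `deep_merge` and `effective_model_settings` are hypothetical helpers, and the configuration is given as an already-parsed dictionary to keep the example dependency-free.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where values in override win over values in base."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def effective_model_settings(config: dict, platform: str) -> dict:
    """Global model settings with the platform's model_overrides applied."""
    gateway = config.get("gateways", {}).get(platform, {})
    return deep_merge(config.get("global", {}), gateway.get("model_overrides", {}))


config = {
    "global": {
        "model": "nous-hermes-2-mixtral-8x7b-dpo",
        "temperature": 0.7,
        "max_tokens": 4096,
    },
    "gateways": {
        "telegram": {"enabled": True, "model_overrides": {"temperature": 0.5}},
        "discord": {"enabled": True},
    },
}

print(effective_model_settings(config, "telegram"))
print(effective_model_settings(config, "discord"))
```

Because the merge returns a copy, the global settings stay unchanged no matter how many gateways are resolved from them.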

20.6 Platform Selection Guide

Use Case	Recommended Platform	Rationale
Individual developer daily use	CLI	Lowest token overhead, full features, no added gateway latency
Small tech team collaboration	Slack	Deep engineering toolchain integration, Workflow support
Community / public bot	Discord	Excellent Slash Command UX, strong thread support
Private personal/small team use	Telegram	Good privacy, simple deployment
Customer service scenarios	WhatsApp	Largest user base, but 24h window restriction
CI/CD pipeline integration	CLI (API mode)	Script-friendly, supports non-interactive mode

Production Deployment with Docker Compose

# docker-compose.yml
version: '3.8'
services:
  hermes-telegram:
    image: nousresearch/hermes-agent:4.0
    command: ["hermes", "gateway", "telegram", "start"]
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
      - REDIS_URL=redis://redis:6379
    depends_on: [redis]
  
  hermes-slack:
    image: nousresearch/hermes-agent:4.0
    command: ["hermes", "gateway", "slack", "start"]
    environment:
      - SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN}
      - SLACK_APP_TOKEN=${SLACK_APP_TOKEN}
      - REDIS_URL=redis://redis:6379
    depends_on: [redis]
  
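  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data

# Named volumes must be declared at the top level, or Compose rejects the file
volumes:
  redis_data: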

20.7 Summary

This chapter systematically covered the Hermes Agent multi-platform gateway architecture:

Layered architecture: Agent core and platform gateways are decoupled for independent scaling
Token overhead gap: Gateway mode injects an extra 500-1,050 tokens per turn; in the 20-turn measurement, average gateway context was roughly 2.4-2.8x that of CLI
Four-platform comparison: CLI / Telegram / Discord / Slack each excel in specific scenarios
WhatsApp restrictions: Requires a dedicated number and imposes a 24h conversation window
Unified configuration: Layered YAML supports global config with per-platform overrides

Thoughtful platform selection achieves the optimal balance between user experience and operational cost.

Review Questions

The extra system context injected in gateway mode (user identity, channel info, etc.) consumes valuable context window space. How would you design a "context compression" scheme that reduces token consumption without losing critical information?
When the same user interacts with Hermes simultaneously on both Telegram and Slack, should the two platform sessions share context or remain isolated? How would you design a cross-platform session merging strategy?
Discord Slash Commands must be pre-registered with the Discord API, meaning changes to the tool list require manually triggering a registration update. How would you design an auto-sync mechanism to refresh the Slash Command list when Skills or Tools are updated?
