Chapter 20: Multi-Platform Gateway Architecture (CLI/Telegram/Discord/Slack)
Hermes Agent is not merely a command-line tool: through its gateway architecture, the same Agent core connects to multiple user interface platforms. Whether it's the CLI preferred by engineers, Slack for team collaboration, or Telegram, where automation bots live, Hermes can be configured and deployed in a unified manner. This chapter analyzes the architectural differences between platforms, compares their token overhead, and offers platform selection guidance.
20.1 Gateway Architecture Overview
Hermes multi-platform support is built on a layered "core + gateway adapter" architecture:
+---------------------------------------------------------------+
|                       Hermes Agent Core                       |
|  +------------+ +------------+ +------------+ +------------+  |
|  | LLM Reason | | Tool Exec  | |   Memory   | | Skill Orch.|  |
|  +------------+ +------------+ +------------+ +------------+  |
+-------------------------------+-------------------------------+
                                |  Agent Core API
            +-------------------+-------------------+
            |                   |                   |
       +----+----+         +----+----+         +----+----+
       |CLI Gate |         |Telegram |         |Discord  |
       |         |         |Gateway  |         |Gateway  |
       +---------+         +---------+         +---------+
            |                   |                   |
       +----+----+         +----+----+         +----+----+
       |Terminal |         |TG Users |         |DC Users |
       |  User   |         |         |         |         |
       +---------+         +---------+         +---------+
Each gateway adapter is responsible for:
- Listening to platform-specific message events
- Converting platform message formats into Hermes's unified message format
- Converting Agent responses back to platform format (handling character limits, formatting, etc.)
- Managing session state (user ID to session ID mapping)
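The responsibilities above can be sketched as a small adapter interface. This is a minimal illustration rather than Hermes's actual API: `UnifiedMessage`, `GatewayAdapter`, `EchoAdapter`, the raw event shape, and the session ID format are all hypothetical names introduced here, and event listening is left out.

```python
from __future__ import annotations

import uuid
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class UnifiedMessage:
    """Hypothetical platform-neutral message (not Hermes's real schema)."""
    platform: str
    user_id: str
    session_id: str
    text: str


class GatewayAdapter(ABC):
    """Hypothetical base class covering the adapter duties listed above."""

    platform = "generic"
    max_length = 4096

    def __init__(self) -> None:
        self._sessions: dict[str, str] = {}  # user ID -> session ID

    def session_for(self, user_id: str) -> str:
        # Reuse the session of a known user, otherwise create a new one.
        if user_id not in self._sessions:
            self._sessions[user_id] = f"session_{uuid.uuid4().hex[:8]}"
        return self._sessions[user_id]

    @abstractmethod
    def to_unified(self, raw_event: dict) -> UnifiedMessage:
        """Convert a platform event into the unified format."""

    def to_platform(self, reply: str) -> list[str]:
        # Respect the platform character limit by chunking the reply.
        step = self.max_length
        return [reply[i:i + step] for i in range(0, len(reply), step)] or [""]


class EchoAdapter(GatewayAdapter):
    """Toy adapter whose raw events look like {"from": ..., "body": ...}."""

    platform = "echo"
    max_length = 10

    def to_unified(self, raw_event: dict) -> UnifiedMessage:
        user_id = str(raw_event["from"])
        return UnifiedMessage(self.platform, user_id,
                              self.session_for(user_id), raw_event["body"])
```

Keeping the session map inside the adapter means the Agent core only ever sees session IDs, never platform-specific user identifiers.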
20.2 CLI Mode Deep Dive
CLI is Hermes's most lightweight operating mode, enabling direct terminal interaction with the Agent.
Launch Options
# Interactive chat mode
hermes chat
# Single-turn mode (suitable for scripting)
hermes run "Analyze the most common error types in ./logs/error.log"
# With custom configuration
hermes chat --config ./hermes_prod.yaml
# With system prompt
hermes chat --system "You are a professional DevOps engineer specializing in Kubernetes operations"
# Load workspace (auto-discover local Skills)
hermes chat --workspace ./my_project/
# Debug mode (show tool call details and token statistics)
hermes chat --debug
# Specify model
hermes chat --model nous-hermes-2-mixtral-8x7b-dpo
Session Management
# List past sessions
hermes session list
# Resume a historical session
hermes session resume session_20240115_103022
# Export session record
hermes session export session_abc123 --format json > session.json
# Delete a session
hermes session delete session_abc123
CLI Mode Token Profile
CLI mode has the most streamlined prompt structure. Typical context consumption:
System prompt: ~800 tokens (capability description + tool list)
User message: ~50-200 tokens
Conversation history: ~0-2000 tokens (depends on configured memory length)
Tool call results: ~0-1000 tokens (proportional to tool calls made)
------------------------------------------------------------
Typical total range: 850 - 4,000 tokens/turn
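The total range is simply the sum of the lower and upper bounds of each component. The short check below reproduces that arithmetic; the dictionary keys are only labels for the rows above.

```python
# (low, high) token estimates per component, copied from the profile above.
cli_profile = {
    "system_prompt": (800, 800),
    "user_message": (50, 200),
    "conversation_history": (0, 2000),
    "tool_call_results": (0, 1000),
}

low = sum(lo for lo, _ in cli_profile.values())
high = sum(hi for _, hi in cli_profile.values())
print(f"Typical total range: {low:,} - {high:,} tokens/turn")
# prints: Typical total range: 850 - 4,000 tokens/turn
```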
20.3 Gateway Mode: Token Overhead Analysis
Gateway modes (Telegram/Discord/Slack) must handle platform metadata, permission management, message routing, and other additional logic, which results in significantly higher token overhead than CLI mode.
Token Overhead Comparison
| Component | CLI Mode | Gateway Mode | Difference |
|---|---|---|---|
| Base system prompt | ~800 tokens | ~800 tokens | Same |
| Platform context injection | 0 tokens | ~200-400 tokens | User identity, channel, permission info |
| Message routing instructions | 0 tokens | ~100-200 tokens | Multi-user message dispatch logic |
| Concurrent session management | 0 tokens | ~150-300 tokens | Multi-user session state |
| Format constraints | 0 tokens | ~50-150 tokens | Platform character limit notices |
| Total extra overhead | 0 tokens | 500-1,050 tokens | Adds roughly 60-130% on top of the ~800-token base prompt |
Measured data (GPT-4 Turbo 128K, 20-turn conversation scenario):
CLI mode average context: 3,200 tokens
Telegram gateway average: 7,800 tokens (+144%)
Discord gateway average: 8,100 tokens (+153%)
Slack gateway average: 8,900 tokens (+178%)
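Both sets of figures can be re-derived with a few lines of arithmetic. The snippet below sums the per-turn gateway additions from the comparison table and recomputes the percentage increases from the measured averages; the variable names are illustrative only.

```python
# Per-turn gateway additions (low, high), from the comparison table.
extras = {
    "platform_context": (200, 400),
    "routing_instructions": (100, 200),
    "session_management": (150, 300),
    "format_constraints": (50, 150),
}
extra_low = sum(lo for lo, _ in extras.values())
extra_high = sum(hi for _, hi in extras.values())

# Average context sizes from the 20-turn measurement.
cli_avg = 3200
gateway_avg = {"Telegram": 7800, "Discord": 8100, "Slack": 8900}
increase = {name: round((avg / cli_avg - 1) * 100)
            for name, avg in gateway_avg.items()}

print(extra_low, extra_high)  # 500 1050
print(increase)               # {'Telegram': 144, 'Discord': 153, 'Slack': 178}
```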
Why Gateway Mode Costs More
# Example of additional system context injected in gateway mode
GATEWAY_SYSTEM_INJECTION = """
## Current Runtime Environment
- Platform: Telegram
- Channel type: group_chat
- Channel ID: -1001234567890
- Channel name: AI Research Team
## Current User Information
- User ID: 123456789
- Username: @alice_researcher
- Display name: Alice
- Role permissions: member (non-admin)
- Language preference: en-US
- Timezone: America/New_York
## Message Format Constraints
- Telegram max single message: 4,096 characters
- Responses exceeding limit will be auto-split
- Markdown V2 format is supported
- Inline images not supported (use URL links instead)
## Multi-User Session Isolation
- Active users: 3 (Alice/Bob/Charlie)
- This message is from Alice (@alice_researcher)
- Other users' conversations are in isolated contexts
"""
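A gateway would normally assemble a block like this from structured data rather than hard-coding it. The sketch below shows one way that could look; `PlatformContext`, `render_injection`, and `rough_token_estimate` are hypothetical helpers, only a subset of the fields is rendered, and the 4-characters-per-token figure is a crude rule of thumb rather than a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class PlatformContext:
    """Hypothetical container for fields shown in the injection above."""
    platform: str
    channel_type: str
    channel_name: str
    user_id: int
    username: str
    role: str
    max_chars: int


def render_injection(ctx: PlatformContext) -> str:
    # Build the Markdown-style block that is appended to the system prompt.
    return "\n".join([
        "## Current Runtime Environment",
        f"- Platform: {ctx.platform}",
        f"- Channel type: {ctx.channel_type}",
        f"- Channel name: {ctx.channel_name}",
        "## Current User Information",
        f"- User ID: {ctx.user_id}",
        f"- Username: {ctx.username}",
        f"- Role permissions: {ctx.role}",
        "## Message Format Constraints",
        f"- {ctx.platform} max single message: {ctx.max_chars:,} characters",
    ])


def rough_token_estimate(text: str) -> int:
    # Crude heuristic (about 4 characters per token for English text);
    # use the model's own tokenizer when accurate numbers matter.
    return max(1, len(text) // 4)
```

Rendering from a dataclass also makes it easy to measure how much each field contributes before deciding what to keep in the prompt.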
20.4 Platform Gateway Deep Dives
20.4.1 Telegram Gateway
Configuration:
gateways:
  telegram:
    enabled: true
    bot_token: "${TELEGRAM_BOT_TOKEN}"
    allowed_users: []
    allowed_groups: [-1001234567890]
    admin_users: [123456789]
    max_message_length: 4096
    auto_split_long_messages: true
    typing_indicator: true
    handle_photos: true
    handle_documents: true
    handle_voice: true
    enable_inline_buttons: true
    approval_timeout_seconds: 300
Unique capabilities:
- Inline keyboard buttons (InlineKeyboardMarkup) for Agent action approvals
- Automatic voice message transcription
- Bot commands (/start, /help, /status)
- Channel post publishing mode
# Start the gateway in the foreground
hermes gateway telegram start
# Background mode
hermes gateway telegram start --daemon
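The `auto_split_long_messages` and `max_message_length` settings imply that long replies are cut into several messages. How Hermes does this internally is not shown in this chapter; the function below is a hypothetical sketch of one reasonable strategy, preferring to break at a newline or space before falling back to a hard cut. It counts Python characters, which may differ from how the platform itself counts length.

```python
from __future__ import annotations


def split_message(text: str, limit: int = 4096) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers the last newline, then the last space, inside each window;
    falls back to a hard cut when neither exists.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text
    while len(remaining) > limit:
        window = remaining[:limit]
        cut = window.rfind("\n")
        if cut <= 0:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = limit  # no natural break point: hard cut
        chunks.append(remaining[:cut])
        # Drop the separator so the next chunk does not start with it.
        remaining = remaining[cut:].lstrip("\n ")
    if remaining:
        chunks.append(remaining)
    return chunks
```

The same function can serve the other gateways by passing a different `limit`, for example 2000 for the Discord setting shown in the next section.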
20.4.2 Discord Gateway
Configuration:
gateways:
  discord:
    enabled: true
    bot_token: "${DISCORD_BOT_TOKEN}"
    application_id: "${DISCORD_APP_ID}"
    register_slash_commands: true
    slash_commands:
      - name: "ask"
        description: "Ask Hermes Agent a question"
      - name: "research"
        description: "Run a deep research task"
    allowed_channel_ids: [123456789, 987654321]
    max_message_length: 2000
    use_embeds: true
    max_embed_fields: 25
    create_thread_for_long_tasks: true
    thread_auto_archive_minutes: 60
Unique capabilities:
- Slash commands (/ask, /research, etc.)
- Rich embed cards (fields, thumbnails, color coding)
- Automatic thread creation for long-running task isolation
- Discord Webhooks for proactive push notifications
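Before slash commands can be registered, the `slash_commands` entries from the configuration have to be turned into application-command payloads. The helper below is a hypothetical sketch of that step: the function name is an assumption, the validation only covers the length and lowercase rules, and submitting the payloads to Discord is left out.

```python
from __future__ import annotations


def build_command_payloads(slash_commands: list[dict]) -> list[dict]:
    """Turn config entries into Discord application-command payloads."""
    payloads = []
    for entry in slash_commands:
        name, description = entry["name"], entry["description"]
        # Discord limits: names are 1-32 characters and lowercase,
        # descriptions are 1-100 characters.
        if not (1 <= len(name) <= 32) or name != name.lower():
            raise ValueError(f"invalid command name: {name!r}")
        if not (1 <= len(description) <= 100):
            raise ValueError(f"invalid description for {name!r}")
        # type 1 is CHAT_INPUT, i.e. a slash command.
        payloads.append({"name": name, "description": description, "type": 1})
    return payloads


commands = build_command_payloads([
    {"name": "ask", "description": "Ask Hermes Agent a question"},
    {"name": "research", "description": "Run a deep research task"},
])
```

Validating locally gives a clear error at startup instead of a rejected registration request later.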
20.4.3 Slack Gateway
Configuration:
gateways:
  slack:
    enabled: true
    bot_token: "${SLACK_BOT_TOKEN}"
    app_token: "${SLACK_APP_TOKEN}"  # Required for Socket Mode
    event_subscriptions:
      - app_mention
      - message.im
    use_block_kit: true
    enable_workflow_steps: true
    default_channel: "#hermes-bot"
    allowed_channels: ["#ai-team", "#dev-tools"]
Unique capabilities:
- Block Kit rich message formatting
- Workflow Builder integration (no-code automation)
- Slack Actions (in-message button interactions)
- App Home tab (personal dashboard)
- Thread replies for clean conversation organization
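With `use_block_kit` enabled, a plain-text Agent reply has to be wrapped in Block Kit blocks before it is posted. The sketch below shows one possible conversion, one section per paragraph; `to_blocks` is a hypothetical helper, and the limits used are Block Kit's 3,000-character cap on section text and 150-character cap on header text.

```python
from __future__ import annotations

SECTION_TEXT_LIMIT = 3000  # Block Kit limit for a section's text object
HEADER_TEXT_LIMIT = 150    # Block Kit limit for a header's text object


def to_blocks(reply: str, title: str | None = None) -> list[dict]:
    """Wrap an agent reply in Block Kit header and section blocks."""
    blocks: list[dict] = []
    if title:
        blocks.append({
            "type": "header",
            "text": {"type": "plain_text", "text": title[:HEADER_TEXT_LIMIT]},
        })
    # One section per paragraph; oversized paragraphs are cut into pieces.
    paragraphs = (p.strip() for p in reply.split("\n\n"))
    for paragraph in filter(None, paragraphs):
        for i in range(0, len(paragraph), SECTION_TEXT_LIMIT):
            blocks.append({
                "type": "section",
                "text": {"type": "mrkdwn",
                         "text": paragraph[i:i + SECTION_TEXT_LIMIT]},
            })
    return blocks
```

The resulting list can be passed as the `blocks` argument of a Slack message; a production version would also need to respect Slack's cap on the number of blocks per message.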
20.4.4 WhatsApp Special Restrictions
WhatsApp gateway carries critical restrictions:
| Restriction | Details |
|---|---|
| Dedicated phone number | Must register a separate WhatsApp Business number; personal numbers cannot be reused |
| Session window | After a user initiates contact, bot has a 24-hour free-reply window |
| Template messages | Outside the 24h window, only pre-approved template messages can be sent |
| API pricing | WhatsApp Business API charges per conversation |
| Format limits | No standard Markdown support; only a small set of WhatsApp-specific inline styles (bold, italic, strikethrough, monospace) is available |
gateways:
  whatsapp:
    enabled: true
    provider: "360dialog"
    api_key: "${WA_API_KEY}"
    phone_number_id: "${WA_PHONE_ID}"
    business_phone: "+1234567890"
    templates:
      task_complete: "wa_template_task_done_v1"
      error_notify: "wa_template_error_v2"
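The session-window and template rules translate into a simple decision at send time. The following is a minimal sketch under the assumption that the gateway records when each user last wrote; `plan_outbound` is a hypothetical helper, and the template names are taken from the configuration above.

```python
from datetime import datetime, timedelta, timezone

FREE_REPLY_WINDOW = timedelta(hours=24)

# Event name -> pre-approved template, mirroring the `templates` config.
TEMPLATES = {
    "task_complete": "wa_template_task_done_v1",
    "error_notify": "wa_template_error_v2",
}


def plan_outbound(event: str, last_user_message_at: datetime,
                  now: datetime) -> dict:
    """Decide between a free-form reply and a template message."""
    if now - last_user_message_at < FREE_REPLY_WINDOW:
        return {"mode": "free_form"}
    # Outside the window only a pre-approved template may be sent.
    return {"mode": "template", "template": TEMPLATES[event]}


now = datetime(2024, 1, 16, 12, 0, tzinfo=timezone.utc)
recent = plan_outbound("task_complete", now - timedelta(hours=2), now)
stale = plan_outbound("task_complete", now - timedelta(hours=25), now)
```

Here `recent` selects a free-form reply while `stale` falls back to the `task_complete` template, which is why long-running tasks on WhatsApp need a template for their completion notice.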
20.5 Unified Multi-Platform Configuration
Hermes supports layered configuration to maintain consistent Agent behavior across platforms:
# hermes_config.yaml: full multi-platform example
# Global config (shared by all platforms)
global:
  model: "nous-hermes-2-mixtral-8x7b-dpo"
  temperature: 0.7
  max_tokens: 4096
  system_prompt: |
    You are the YiteAI assistant, focused on helping users solve technical problems.
    Always respond in the user's language.
  memory:
    provider: "redis"
    redis_url: "${REDIS_URL}"
    session_ttl_hours: 24
    max_history_turns: 20
  tools:
    enabled_categories: [system, network, file, code]
    permission_profile: "readonly_research"
# Platform-specific config (overrides global settings)
gateways:
  telegram:
    enabled: true
    bot_token: "${TELEGRAM_BOT_TOKEN}"
    model_overrides:
      temperature: 0.5  # Telegram: more conservative, stricter answers
  discord:
    enabled: true
    bot_token: "${DISCORD_BOT_TOKEN}"
    application_id: "${DISCORD_APP_ID}"
    permission_profile_override: "full_developer"  # Discord: broader tool access
  slack:
    enabled: true
    bot_token: "${SLACK_BOT_TOKEN}"
    app_token: "${SLACK_APP_TOKEN}"
    extra_tools: ["jira_create_ticket", "confluence_search"]  # Enterprise integration
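How the per-platform keys are applied on top of the global section is not spelled out here, so the following is only one plausible reading, written as a sketch: `model_overrides` replaces individual model parameters, `permission_profile_override` replaces the tool permission profile, and `extra_tools` is additive. The function name `resolve_platform_config` is an assumption.

```python
import copy


def resolve_platform_config(global_cfg: dict, gateway_cfg: dict) -> dict:
    """Apply one gateway's overrides on top of a copy of the global config."""
    resolved = copy.deepcopy(global_cfg)  # never mutate the shared config
    # model_overrides replaces individual top-level model parameters.
    resolved.update(gateway_cfg.get("model_overrides", {}))
    tools = resolved.setdefault("tools", {})
    if "permission_profile_override" in gateway_cfg:
        tools["permission_profile"] = gateway_cfg["permission_profile_override"]
    # extra_tools are added for this platform only.
    tools["extra_tools"] = list(gateway_cfg.get("extra_tools", []))
    return resolved


global_cfg = {
    "model": "nous-hermes-2-mixtral-8x7b-dpo",
    "temperature": 0.7,
    "tools": {"permission_profile": "readonly_research"},
}
gateways = {
    "telegram": {"model_overrides": {"temperature": 0.5}},
    "discord": {"permission_profile_override": "full_developer"},
    "slack": {"extra_tools": ["jira_create_ticket", "confluence_search"]},
}
resolved = {name: resolve_platform_config(global_cfg, cfg)
            for name, cfg in gateways.items()}
print(resolved["telegram"]["temperature"])                 # 0.5
print(resolved["discord"]["tools"]["permission_profile"])  # full_developer
```

Working on a deep copy keeps each platform's resolved settings independent, so an override for one gateway cannot leak into another.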
20.6 Platform Selection Guide
| Use Case | Recommended Platform | Rationale |
|---|---|---|
| Individual developer daily use | CLI | Lowest token overhead, full features, no gateway round trip |
| Small tech team collaboration | Slack | Deep engineering toolchain integration, Workflow support |
| Community / public bot | Discord | Excellent Slash Command UX, strong thread support |
| Private personal/small team use | Telegram | Good privacy, simple deployment |
| Customer service scenarios | WhatsApp | Largest user base, but 24h window restriction |
| CI/CD pipeline integration | CLI (API mode) | Script-friendly, supports non-interactive mode |
Production Deployment with Docker Compose
# docker-compose.yml
version: '3.8'
services:
  hermes-telegram:
    image: nousresearch/hermes-agent:4.0
    command: ["hermes", "gateway", "telegram", "start"]
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
      - REDIS_URL=redis://redis:6379
    depends_on: [redis]
  hermes-slack:
    image: nousresearch/hermes-agent:4.0
    command: ["hermes", "gateway", "slack", "start"]
    environment:
      - SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN}
      - SLACK_APP_TOKEN=${SLACK_APP_TOKEN}
      - REDIS_URL=redis://redis:6379
    depends_on: [redis]
  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
# Named volumes must be declared at the top level
volumes:
  redis_data:
20.7 Summary
This chapter systematically covered the Hermes Agent multi-platform gateway architecture:
- Layered architecture: Agent core and platform gateways are decoupled for independent scaling
- Token overhead gap: Gateway mode injects an extra 500-1,050 tokens of system context per turn; in the measured 20-turn scenario, average context was roughly 2.4-2.8x that of CLI mode
- Four-platform comparison: CLI / Telegram / Discord / Slack each excel in specific scenarios
- WhatsApp restrictions: Requires a dedicated number and imposes a 24h conversation window
- Unified configuration: Layered YAML supports global config with per-platform overrides
Thoughtful platform selection achieves the optimal balance between user experience and operational cost.
Review Questions
- The extra system context injected in gateway mode (user identity, channel info, etc.) consumes valuable context window space. How would you design a "context compression" scheme that reduces token consumption without losing critical information?
- When the same user interacts with Hermes simultaneously on both Telegram and Slack, should the two platform sessions share context or remain isolated? How would you design a cross-platform session merging strategy?
- Discord Slash Commands must be pre-registered with the Discord API, meaning changes to the tool list require manually triggering a registration update. How would you design an auto-sync mechanism to refresh the Slash Command list when Skills or Tools are updated?