alltoken-ai

AllToken

by AllToken · GitHub ↗ · v1.0.0 · MIT-0
Install in OpenClaw
/install alltoken
Description
Bootstrap a modular AllToken agent — chat, async image+video, model routing, OpenAI-compatible SDK. Works inside Hermes, OpenClaw, Claude Code, Codex CLI, Op...
README (SKILL.md)

Build a Modular AI Agent with AllToken

This skill helps you create a modular AI agent powered by AllToken — a unified, OpenAI-compatible API with access to leading language, image, and video models behind one endpoint, plus automatic provider fallbacks and cost-effective routing.

Designed to be invoked from Hermes, OpenClaw, Claude Code, Codex CLI, or any other agent runtime that consumes skills.

  • Standalone Agent Core — runs independently, extensible via hooks
  • OpenAI SDK compatible — change two settings and you're done
  • Multi-modal — chat, image (async), video (async) on one key
  • Optional Ink TUI — terminal UI cleanly separated from agent logic

Architecture

┌─────────────────────────────────────────────────────┐
│                Your Application (TS/Py)             │
├─────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │   Ink TUI   │  │  HTTP API   │  │   Hermes /  │  │
│  │             │  │             │  │   OpenClaw  │  │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  │
│         │                │                │         │
│         └────────────────┼────────────────┘         │
│                          ▼                          │
│              ┌───────────────────────┐              │
│              │      Agent Core       │              │
│              │  (hooks & lifecycle)  │              │
│              └───────────┬───────────┘              │
│                          ▼                          │
│              ┌───────────────────────┐              │
│              │   AllToken REST API   │              │
│              │ api.alltoken.ai/v1    │              │
│              └───────────────────────┘              │
└─────────────────────────────────────────────────────┘

Prerequisites

  1. Create an AllToken account at https://alltoken.ai.
  2. Generate an API key in Settings → API Keys (the key is shown only once — copy it).
  3. Top up credits if needed in Settings → Billing.

Security: never commit your API key. Use ALLTOKEN_API_KEY from the environment.

API at a glance

  • Base URL: https://api.alltoken.ai/v1
  • Auth header: Authorization: Bearer $ALLTOKEN_API_KEY
  • Compatibility: OpenAI-compatible — any OpenAI SDK works by overriding base_url/baseURL.
  • Coverage:
    • POST /chat/completions — chat (streaming, tool calls, thinking, web search)
    • GET /models — OpenAI-compatible model list
    • POST /images/generations/async + GET /images/generations/{id} — async image generation
    • POST /videos/generations + GET /videos/generations/{id} — async video generation
    • GET /api-account/models / /{model_path} / /filters — full catalog with pricing and capabilities (public, no auth required)
    • GET /api-account/providers (+ /{id}/stats) — providers, health, throughput (public)
    • GET /api-account/rankings/all — leaderboards, benchmarks, speed rankings (public)
    • GET /api-account/health/{routes,summary} — route health & availability (public)
    • GET /api-account/user/{api-keys,usage,billing,balance}: web-session token only, not callable with your Bearer API key (you'll get 401 auth_error / invalid_token). Manage these in Settings → API Keys / Billing on https://alltoken.ai.

Project Setup

Step 1 — Initialize project

mkdir my-alltoken-agent && cd my-alltoken-agent
npm init -y
npm pkg set type="module"

Step 2 — Install dependencies

npm install openai zod eventemitter3
npm install ink react        # optional: TUI only
npm install -D typescript @types/react tsx

Step 3 — tsconfig.json

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "jsx": "react-jsx",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "outDir": "dist"
  },
  "include": ["src"]
}

Step 4 — Scripts in package.json

{
  "scripts": {
    "start": "tsx src/cli.tsx",
    "start:headless": "tsx src/headless.ts",
    "dev": "tsx watch src/cli.tsx"
  }
}

Step 5 — File layout

src/
├── client.ts       # AllToken client (OpenAI SDK with overridden baseURL)
├── agent.ts        # Standalone agent core with hooks
├── tools.ts        # Function-calling tool definitions
├── media.ts        # Async image + video helpers (poll loop)
├── cli.tsx         # Optional Ink TUI
└── headless.ts     # Headless / scriptable example

Step 1 — AllToken Client

Create src/client.ts. AllToken is OpenAI-compatible; we just override the base URL.

import OpenAI from 'openai';

export function createAllTokenClient(apiKey = process.env.ALLTOKEN_API_KEY): OpenAI {
  if (!apiKey) throw new Error('ALLTOKEN_API_KEY is not set');
  return new OpenAI({
    apiKey,
    baseURL: 'https://api.alltoken.ai/v1',
  });
}

Step 2 — Agent Core with Hooks

Create src/agent.ts — the standalone agent. It streams via OpenAI's SSE protocol and emits typed events for any UI to consume.

import OpenAI from 'openai';
import type {
  ChatCompletionMessageParam,
  ChatCompletionTool,
  ChatCompletionToolMessageParam,
} from 'openai/resources/chat/completions';
import { EventEmitter } from 'eventemitter3';
import { createAllTokenClient } from './client.js';

export interface Message {
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
  tool_call_id?: string;
  name?: string;
}

export interface AgentEvents {
  'message:user': (message: Message) => void;
  'message:assistant': (message: Message) => void;
  'stream:start': () => void;
  'stream:delta': (delta: string, accumulated: string) => void;
  'stream:end': (fullText: string) => void;
  'tool:call': (name: string, args: unknown, callId: string) => void;
  'tool:result': (name: string, result: unknown, callId: string) => void;
  'thinking:start': () => void;
  'thinking:end': () => void;
  'error': (error: Error) => void;
}

export interface ToolHandler {
  definition: ChatCompletionTool;
  execute: (args: any) => Promise<unknown> | unknown;
}

export interface AgentConfig {
  apiKey?: string;
  model?: string;                  // e.g. 'minimax-m2.7', 'gpt-5.4'
  instructions?: string;
  tools?: ToolHandler[];
  maxSteps?: number;               // tool-loop step limit
  temperature?: number;
  enableSearch?: boolean;          // AllToken-specific web-search toggle
}

export class Agent extends EventEmitter<AgentEvents> {
  private client: OpenAI;
  private messages: ChatCompletionMessageParam[] = [];
  private cfg: Required<Omit<AgentConfig, 'apiKey'>>;
  private toolMap: Map<string, ToolHandler>;

  constructor(config: AgentConfig = {}) {
    super();
    this.client = createAllTokenClient(config.apiKey);
    this.cfg = {
      model: config.model ?? 'minimax-m2.7',
      instructions: config.instructions ?? 'You are a helpful assistant.',
      tools: config.tools ?? [],
      maxSteps: config.maxSteps ?? 5,
      temperature: config.temperature ?? 0.7,
      enableSearch: config.enableSearch ?? false,
    };
    this.toolMap = new Map(this.cfg.tools.map((t) => [t.definition.function.name, t]));
    if (this.cfg.instructions) {
      this.messages.push({ role: 'system', content: this.cfg.instructions });
    }
  }

  getMessages(): ChatCompletionMessageParam[] { return [...this.messages]; }
  clearHistory(): void {
    this.messages = this.cfg.instructions
      ? [{ role: 'system', content: this.cfg.instructions }]
      : [];
  }
  setInstructions(text: string): void {
    this.cfg.instructions = text;
    if (this.messages[0]?.role === 'system') this.messages[0] = { role: 'system', content: text };
    else this.messages.unshift({ role: 'system', content: text });
  }
  addTool(t: ToolHandler): void {
    this.cfg.tools.push(t);
    this.toolMap.set(t.definition.function.name, t);
  }

  /** Send a user message, run the tool-loop, stream tokens. Returns the final assistant text. */
  async send(content: string): Promise<string> {
    this.messages.push({ role: 'user', content });
    this.emit('message:user', { role: 'user', content });
    this.emit('thinking:start');

    let finalText = '';

    try {
      for (let step = 0; step < this.cfg.maxSteps; step++) {
        const stream = await this.client.chat.completions.create({
          model: this.cfg.model,
          messages: this.messages,
          temperature: this.cfg.temperature,
          tools: this.cfg.tools.length ? this.cfg.tools.map((t) => t.definition) : undefined,
          stream: true,
          // AllToken extension: opt-in web search (model-dependent)
          ...(this.cfg.enableSearch ? ({ enable_search: true } as any) : {}),
        });

        this.emit('stream:start');
        let text = '';
        const toolCalls: Record<number, { id?: string; name?: string; args: string }> = {};
        let finishReason: string | undefined;

        for await (const chunk of stream) {
          const choice = chunk.choices[0];
          if (!choice) continue;
          const delta: any = choice.delta;

          if (delta?.content) {
            text += delta.content;
            this.emit('stream:delta', delta.content, text);
          }
          if (delta?.tool_calls) {
            for (const tc of delta.tool_calls) {
              const slot = toolCalls[tc.index] ?? (toolCalls[tc.index] = { args: '' });
              if (tc.id) slot.id = tc.id;
              if (tc.function?.name) slot.name = tc.function.name;
              if (tc.function?.arguments) slot.args += tc.function.arguments;
            }
          }
          if (choice.finish_reason) finishReason = choice.finish_reason;
        }

        this.emit('stream:end', text);

        // Persist the assistant turn (with tool_calls if any)
        const calls = Object.values(toolCalls).filter((c) => c.id && c.name);
        if (calls.length) {
          this.messages.push({
            role: 'assistant',
            content: text || null,
            tool_calls: calls.map((c) => ({
              id: c.id!,
              type: 'function',
              function: { name: c.name!, arguments: c.args || '{}' },
            })),
          } as any);

          // Execute tools and append results
          for (const c of calls) {
            const handler = this.toolMap.get(c.name!);
            const parsed = safeJson(c.args);
            this.emit('tool:call', c.name!, parsed, c.id!);
            const result = handler
              ? await handler.execute(parsed)
              : { error: `unknown tool: ${c.name}` };
            this.emit('tool:result', c.name!, result, c.id!);
            const toolMsg: ChatCompletionToolMessageParam = {
              role: 'tool',
              tool_call_id: c.id!,
              content: typeof result === 'string' ? result : JSON.stringify(result),
            };
            this.messages.push(toolMsg);
          }
          continue; // next loop step
        }

        // Terminal: regular completion
        this.messages.push({ role: 'assistant', content: text });
        this.emit('message:assistant', { role: 'assistant', content: text });
        finalText = text;
        break;
      }

      return finalText;
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      this.emit('error', error);
      throw error;
    } finally {
      this.emit('thinking:end');
    }
  }

  /** Non-streaming convenience method. */
  async sendSync(content: string): Promise<string> {
    this.messages.push({ role: 'user', content });
    this.emit('message:user', { role: 'user', content });
    const res = await this.client.chat.completions.create({
      model: this.cfg.model,
      messages: this.messages,
      temperature: this.cfg.temperature,
    });
    const text = res.choices[0]?.message?.content ?? '';
    this.messages.push({ role: 'assistant', content: text });
    this.emit('message:assistant', { role: 'assistant', content: text });
    return text;
  }
}

function safeJson(s: string): unknown {
  try { return JSON.parse(s || '{}'); } catch { return { _raw: s }; }
}

export function createAgent(config: AgentConfig = {}): Agent {
  return new Agent(config);
}
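
The trickiest part of the streaming loop above is that a single tool call arrives in pieces: the id and name usually come on the first delta for a given index, and the JSON arguments are spread over later deltas. The sketch below isolates that accumulation step as a pure function so it can be unit-tested without a network call. The names assembleToolCalls, ToolCallDelta, and AssembledToolCall are hypothetical and not part of the skill; the logic is intended to mirror what Agent.send() does inline.

```typescript
// Hypothetical standalone helper; mirrors the accumulation loop inside Agent.send().
export interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

export interface AssembledToolCall {
  id: string;
  name: string;
  args: string; // raw JSON text, concatenated across chunks
}

type Slot = { id?: string; name?: string; args: string };

export function assembleToolCalls(deltas: ToolCallDelta[]): AssembledToolCall[] {
  const slots = new Map<number, Slot>();
  for (const d of deltas) {
    let slot = slots.get(d.index);
    if (!slot) {
      slot = { args: '' };
      slots.set(d.index, slot);
    }
    if (d.id) slot.id = d.id;
    if (d.function?.name) slot.name = d.function.name;
    if (d.function?.arguments) slot.args += d.function.arguments;
  }
  // Keep only calls that received both an id and a name, ordered by index.
  const complete: AssembledToolCall[] = [];
  const indices = Array.from(slots.keys()).sort((a, b) => a - b);
  for (const i of indices) {
    const s = slots.get(i);
    if (s && s.id && s.name) complete.push({ id: s.id, name: s.name, args: s.args });
  }
  return complete;
}
```

Feeding it the delta.tool_calls entries collected from one stream yields the same list that the agent persists on the assistant turn; incomplete slots are dropped rather than sent back to the model.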

Step 3 — Define Tools

Create src/tools.ts:

import type { ToolHandler } from './agent.js';

export const timeTool: ToolHandler = {
  definition: {
    type: 'function',
    function: {
      name: 'get_current_time',
      description: 'Get the current date and time',
      parameters: {
        type: 'object',
        properties: {
          timezone: { type: 'string', description: 'IANA timezone, e.g. "UTC", "America/New_York"' },
        },
      },
    },
  },
  execute: ({ timezone }: { timezone?: string }) => ({
    time: new Date().toLocaleString('en-US', { timeZone: timezone || 'UTC' }),
    timezone: timezone || 'UTC',
  }),
};

export const calculatorTool: ToolHandler = {
  definition: {
    type: 'function',
    function: {
      name: 'calculate',
      description: 'Evaluate a basic math expression',
      parameters: {
        type: 'object',
        properties: { expression: { type: 'string' } },
        required: ['expression'],
      },
    },
  },
  execute: ({ expression }: { expression: string }) => {
    // Safe arithmetic evaluator — shunting-yard + RPN, no eval/Function.
    const tokens = expression.match(/\d+(?:\.\d+)?|[+\-*/()]/g) ?? [];
    const prec: Record<string, number> = { '+': 1, '-': 1, '*': 2, '/': 2 };
    const out: string[] = [];
    const ops: string[] = [];
    for (const t of tokens) {
      if (/^\d/.test(t)) {
        out.push(t);
      } else if (t === '(') {
        ops.push(t);
      } else if (t === ')') {
        while (ops.length && ops[ops.length - 1] !== '(') out.push(ops.pop()!);
        ops.pop();
      } else {
        while (
          ops.length &&
          ops[ops.length - 1] !== '(' &&
          (prec[ops[ops.length - 1]] ?? 0) >= prec[t]
        ) {
          out.push(ops.pop()!);
        }
        ops.push(t);
      }
    }
    while (ops.length) out.push(ops.pop()!);
    const stack: number[] = [];
    for (const t of out) {
      if (/^\d/.test(t)) {
        stack.push(parseFloat(t));
      } else {
        const b = stack.pop()!;
        const a = stack.pop()!;
        stack.push(t === '+' ? a + b : t === '-' ? a - b : t === '*' ? a * b : a / b);
      }
    }
    return { expression, result: stack[0] };
  },
};

export const defaultTools = [timeTool, calculatorTool];
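
One caveat with the calculator above: its tokenizer regex only matches numbers, operators, and parentheses, so any other characters in the model's input are skipped silently instead of being rejected. A minimal sketch of a pre-check follows, assuming you would call it at the top of execute and return an error object when it fails. The name isSupportedExpression is hypothetical. It does not cover unary minus, which the evaluator above does not handle either.

```typescript
// Hypothetical pre-check for calculatorTool; not part of the original skill.
// Accepts only digits, decimal points, whitespace, parentheses, and + - * /.
const ALLOWED_CHARS = /^[\d\s.+\-*\/()]+$/;

export function isSupportedExpression(expression: string): boolean {
  if (!ALLOWED_CHARS.test(expression)) return false;
  // Parentheses must balance and must never close before they open.
  let depth = 0;
  for (const ch of expression) {
    if (ch === '(') {
      depth++;
    } else if (ch === ')') {
      depth--;
      if (depth < 0) return false;
    }
  }
  return depth === 0;
}
```

Returning something like { error: 'unsupported expression' } from the tool gives the model a chance to correct its input on the next step of the tool loop.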

Step 4 — Image & Video helpers

AllToken's image and video endpoints are asynchronous: create a task, poll until completed, then read the result. Create src/media.ts:

import { createAllTokenClient } from './client.js';

const BASE = 'https://api.alltoken.ai/v1';

async function authedFetch(path: string, init: RequestInit = {}) {
  const apiKey = process.env.ALLTOKEN_API_KEY;
  if (!apiKey) throw new Error('ALLTOKEN_API_KEY is not set');
  const res = await fetch(`${BASE}${path}`, {
    ...init,
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
      ...(init.headers ?? {}),
    },
  });
  if (!res.ok) {
    const body = await res.text();
    throw new Error(`AllToken ${res.status}: ${body}`);
  }
  return res.json();
}

// ── Images ────────────────────────────────────────────────────────────────
// Result is delivered ONCE: persist `b64_json` immediately. Tasks expire in 30 min.

export interface ImageRequest {
  model?: 'gpt-image-2' | string;          // discover via GET /images/models
  prompt: string;
  size?: '1024x1024' | '1536x1024' | '1024x1536' | 'auto';
  quality?: 'low' | 'medium' | 'high' | 'auto';
  output_format?: 'png' | 'jpeg' | 'webp';
  background?: 'auto' | 'opaque';
  moderation?: 'auto' | 'low';
}

export interface ImageResult {
  id: string;
  status: 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';
  data?: Array<{ b64_json: string; revised_prompt?: string }>;
  error?: unknown;
}

export async function generateImage(req: ImageRequest, opts: { pollMs?: number } = {}): Promise<ImageResult> {
  const created = await authedFetch('/images/generations/async', {
    method: 'POST',
    body: JSON.stringify({ model: 'gpt-image-2', ...req }),
    // Recommended: deduplicate retries with an Idempotency-Key
    headers: { 'Idempotency-Key': crypto.randomUUID() },
  });

  const id = created.id as string;
  const intervalMs = opts.pollMs ?? 2000;
  while (true) {
    const status = await authedFetch(`/images/generations/${id}`);
    if (status.status === 'completed' || status.status === 'failed' || status.status === 'cancelled') {
      return status;
    }
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}

// ── Videos ────────────────────────────────────────────────────────────────

export interface VideoRequest {
  model: 'seedance-1.5-pro' | 'seedance-2.0' | string;
  prompt: string;
  duration?: number;                       // seconds; -1 = model decides
  ratio?: '16:9' | '9:16' | '4:3' | '3:4' | '21:9' | '1:1' | 'adaptive';
  resolution?: '480p' | '720p' | '1080p';
  generate_audio?: boolean;
  seed?: number;
  watermark?: boolean;
  callback_url?: string;
  // Image-to-video: pass `content` with image_url + role: 'first_frame'
  content?: Array<{
    type: 'image_url' | 'video_url' | 'audio_url' | 'draft_task';
    image_url?: { url: string };
    video_url?: { url: string };
    audio_url?: { url: string };
    role?: 'first_frame' | 'last_frame' | 'reference_image' | 'reference_video' | 'reference_audio';
  }>;
}

export async function generateVideo(req: VideoRequest, opts: { pollMs?: number } = {}) {
  const created = await authedFetch('/videos/generations', {
    method: 'POST',
    body: JSON.stringify(req),
  });
  const id = created.id as string;
  const intervalMs = opts.pollMs ?? 3000;
  while (true) {
    const status = await authedFetch(`/videos/generations/${id}`);
    if (['completed', 'failed', 'cancelled', 'expired'].includes(status.status)) return status;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}

export async function cancelVideo(id: string) {
  return authedFetch(`/videos/generations/${id}/cancel`, { method: 'POST' });
}

Persist b64_json to disk in one shot — re-polling a delivered image returns 410 image_already_retrieved and the result is gone. The 410 envelope:

{"error":{"code":"image_already_retrieved","message":"Image data was already retrieved; please submit a new generation request","request_id":"...","type":"invalid_request_error"}}

Observed latencies (use these to size your retry budget, not as SLAs):

  • Image gpt-image-2 1024×1024 quality=low: ~15–25 s end-to-end (verified 20.6 s on a real submit).
  • Image quality=high or 1536×1024: 30–60 s per docs.
  • Video seedance-1.5-pro 5 s @ 480 p: 30–120 s typical; 1080 p can take 3–5 min.
  • Recommended poll interval: 2 s for images, 3 s for videos.
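
The poll loops in media.ts run until the task reaches a terminal status, with no upper bound on wall-clock time. One way to turn the latencies above into a retry budget is a generic poller with a deadline. This is a sketch only: pollUntilDone and its option names are hypothetical, and fetchStatus stands in for whatever performs the GET on the task endpoint.

```typescript
// Hypothetical helper; `fetchStatus` stands in for a GET on the task endpoint.
export interface PollOptions {
  intervalMs: number;   // e.g. 2000 for images, 3000 for videos
  timeoutMs: number;    // overall budget, sized from the observed latencies
  terminal?: string[];  // statuses that end the loop
}

export async function pollUntilDone<T extends { status: string }>(
  fetchStatus: () => Promise<T>,
  opts: PollOptions,
): Promise<T> {
  const terminal = opts.terminal ?? ['completed', 'failed', 'cancelled', 'expired'];
  const deadline = Date.now() + opts.timeoutMs;
  while (true) {
    const task = await fetchStatus();
    if (terminal.includes(task.status)) return task;
    // Stop if the next poll would land after the deadline.
    if (Date.now() + opts.intervalMs > deadline) {
      throw new Error(`polling timed out after ${opts.timeoutMs} ms (last status: ${task.status})`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

For a high-quality image, a budget of around two minutes with a 2 s interval would sit comfortably above the 30–60 s range quoted above. Because image results are delivered once, the value returned by the call that first sees completed is the one to persist.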

Submit-response fields (the full shape, not just id):

{"id":"igen_d3b8...","status":"queued","model":"gpt-image-2","created_at":"2026-05-12T13:46:09Z"}

After status==completed, the GET adds: data: [{b64_json}], usage: {input_tokens, output_tokens, total_tokens, input_tokens_details}, size, quality, output_format, completed_at, expires_at. Note: revised_prompt is not present in current responses despite appearing in the docs example — treat it as optional.

Step 5 — Headless usage

Create src/headless.ts:

import { createAgent } from './agent.js';
import { defaultTools } from './tools.js';
import { generateImage } from './media.js';
import { writeFile } from 'node:fs/promises';

async function main() {
  const agent = createAgent({
    model: 'minimax-m2.7',
    instructions: 'You are a helpful assistant with tools.',
    tools: defaultTools,
    enableSearch: false,
  });

  agent.on('thinking:start', () => console.log('\n🤔 Thinking...'));
  agent.on('tool:call', (name, args) => console.log(`🔧 ${name}`, args));
  agent.on('stream:delta', (delta) => process.stdout.write(delta));
  agent.on('stream:end', () => console.log());
  agent.on('error', (e) => console.error('❌', e.message));

  // Chat
  await agent.send('What time is it in Tokyo?');

  // Image (async)
  const img = await generateImage({
    prompt: 'A clean studio product photo of a glass teapot on a walnut table',
    size: '1024x1024',
    quality: 'high',
  });
  if (img.status === 'completed' && img.data?.[0]?.b64_json) {
    await writeFile('teapot.png', Buffer.from(img.data[0].b64_json, 'base64'));
    console.log('\n💾 Saved teapot.png');
  }
}

main().catch(console.error);

Run: ALLTOKEN_API_KEY=sk-... npm run start:headless

Step 6 — Optional Ink TUI

Create src/cli.tsx for a terminal chat UI. Subscribe to stream:delta and tool:call events from the agent and render them. The agent core is UI-agnostic — the same instance can power Hermes, OpenClaw, Discord, or HTTP.

import React, { useState, useEffect, useCallback } from 'react';
import { render, Box, Text, useInput, useApp } from 'ink';
import { createAgent, type Message } from './agent.js';
import { defaultTools } from './tools.js';

const agent = createAgent({
  model: 'minimax-m2.7',
  instructions: 'You are a concise assistant.',
  tools: defaultTools,
});

function App() {
  const { exit } = useApp();
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [streaming, setStreaming] = useState('');
  const [loading, setLoading] = useState(false);

  useInput((ch, key) => {
    if (key.escape) exit();
    if (loading) return;
    if (key.return) {
      const text = input.trim();
      if (!text) return;
      setInput('');
      setMessages((m) => [...m, { role: 'user', content: text }]);
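      // send() rethrows after emitting 'error'; catch here so a failed request re-enables input.
      agent.send(text).catch(() => setLoading(false));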
    } else if (key.backspace || key.delete) setInput((v) => v.slice(0, -1));
    else if (ch && !key.ctrl && !key.meta) setInput((v) => v + ch);
  });

  useEffect(() => {
    const onStart = () => { setLoading(true); setStreaming(''); };
    const onDelta = (_d: string, acc: string) => setStreaming(acc);
    const onAssistant = (m: Message) => {
      setMessages((prev) => [...prev, m]);
      setStreaming('');
      setLoading(false);
    };
    agent.on('thinking:start', onStart);
    agent.on('stream:delta', onDelta);
    agent.on('message:assistant', onAssistant);
    return () => {
      agent.off('thinking:start', onStart);
      agent.off('stream:delta', onDelta);
      agent.off('message:assistant', onAssistant);
    };
  }, []);

  return (
    <Box flexDirection="column" padding={1}>
      <Text bold color="magenta">🤖 AllToken Agent</Text>
      {messages.map((m, i) => (
        <Box key={i} flexDirection="column" marginTop={1}>
          <Text bold color={m.role === 'user' ? 'cyan' : 'green'}>
            {m.role === 'user' ? '▶ You' : '◀ Assistant'}
          </Text>
          <Text wrap="wrap">{m.content}</Text>
        </Box>
      ))}
      {streaming && (
        <Box flexDirection="column" marginTop={1}>
          <Text bold color="green">◀ Assistant</Text>
          <Text wrap="wrap">{streaming}<Text color="gray">▌</Text></Text>
        </Box>
      )}
      <Box borderStyle="single" borderColor="gray" marginTop={1} paddingX={1}>
        <Text color="yellow">{'> '}</Text>
        <Text>{input}</Text>
        <Text color="gray">{loading ? ' ···' : '█'}</Text>
      </Box>
    </Box>
  );
}

render(<App />);

Python equivalent (one-file)

For Python users — including those embedding the agent inside Hermes or OpenClaw Python tools:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["ALLTOKEN_API_KEY"],
    base_url="https://api.alltoken.ai/v1",
)

# Streaming chat
stream = client.chat.completions.create(
    model="minimax-m2.7",
    messages=[{"role": "user", "content": "Explain SSE in one sentence."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()

Async image (poll loop):

import os, time, base64, uuid, requests

BASE = "https://api.alltoken.ai/v1"
H = {"Authorization": f"Bearer {os.environ['ALLTOKEN_API_KEY']}", "Content-Type": "application/json"}

task = requests.post(
    f"{BASE}/images/generations/async",
    headers={**H, "Idempotency-Key": str(uuid.uuid4())},
    json={"model": "gpt-image-2", "prompt": "A cat astronaut, studio light", "size": "1024x1024", "quality": "high"},
).json()

while True:
    res = requests.get(f"{BASE}/images/generations/{task['id']}", headers=H).json()
    if res["status"] in ("completed", "failed", "cancelled"):
        break
    time.sleep(2)

if res["status"] == "completed":
    with open("cat.png", "wb") as f:
        f.write(base64.b64decode(res["data"][0]["b64_json"]))

Using AllToken from inside Hermes / OpenClaw

Both Hermes and OpenClaw load skills from SKILL.md files and can run TypeScript or Python tools at the agent boundary. There are two integration patterns:

Pattern A — AllToken as your agent's model provider

Point your host agent's HTTP client at AllToken. In OpenClaw / Hermes config, set:

provider:
  base_url: https://api.alltoken.ai/v1
  api_key: ${ALLTOKEN_API_KEY}
model: minimax-m2.7

No code changes needed — the OpenAI-compatible endpoint accepts the same requests.

Pattern B — AllToken as a tool inside another agent

Drop the agent.ts / media.ts modules into the host agent's tools directory and expose them as callable tools (chat, generate_image, generate_video). The host agent (running on any model) then delegates multimodal work to AllToken on demand.

// host-agent-tool.ts
import { createAgent } from './agent.js';
import { generateImage, generateVideo } from './media.js';

const alltoken = createAgent({ model: 'minimax-m2.7' });

export const tools = {
  alltoken_chat: (input: { prompt: string }) => alltoken.sendSync(input.prompt),
  alltoken_image: (input: { prompt: string; size?: string }) => generateImage(input as any),
  alltoken_video: (input: { prompt: string; duration?: number }) =>
    generateVideo({ model: 'seedance-1.5-pro', ...input }),
};

Discovering models

Verified-working model IDs as of 2026-05-12 (use these for quick starts; re-confirm via GET /v1/models before production):

| Use case | IDs |
| --- | --- |
| Chat: cheap / fast | gpt-5.4-nano, gpt-5.4-mini, claude-haiku-4-5, gemini-3-flash-preview, glm-4.7-flash, qwen3.6-flash, deepseek-v4-flash, minimax-m2.5-highspeed |
| Chat: flagship | gpt-5.4, gpt-5.4-pro, gpt-5.5, claude-opus-4-7, claude-sonnet-4-6, gemini-3.1-pro-preview, glm-5.1, deepseek-v4-pro, qwen3.6-max-preview, kimi-k2.6, minimax-m2.7 |
| Chat: code | gpt-5.3-codex, qwen3-coder-next |
| Image | gpt-image-2 |
| Video: text/image to video | seedance-1.5-pro, seedance-2.0, happyhorse-1.0-t2v, happyhorse-1.0-i2v |
| Video: editing / reference | happyhorse-1.0-video-edit, happyhorse-1.0-r2v |

Available chat models on a fresh key: 38 as of this writing. Image: 1. Video: 7.

Do not hardcode model IDs in production — the catalog evolves. Use the live endpoints:

// OpenAI-compatible list (good for SDK clients)
const list = await fetch('https://api.alltoken.ai/v1/models', {
  headers: { Authorization: `Bearer ${process.env.ALLTOKEN_API_KEY}` },
}).then((r) => r.json());

// Rich catalog with pricing, capabilities, tags (used by the website)
const catalog = await fetch('https://api.alltoken.ai/api-account/models').then((r) => r.json());

// Single model detail page
const detail = await fetch('https://api.alltoken.ai/api-account/models/gpt-5.4').then((r) => r.json());

Pair with the Rankings API (GET /api-account/rankings/all) for live leaderboards by usage, benchmarks, throughput, and category leaders — useful for --auto model selection.
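
To avoid hardcoding a single model ID, one option is to keep an ordered preference list and pick the first entry that the live list actually contains. The sketch below assumes GET /v1/models returns the usual OpenAI list shape ({ data: [{ id }] }); that shape and the helper name pickModel are assumptions, so confirm them against a real response before relying on this.

```typescript
// Assumed response shape of GET /v1/models (the OpenAI-compatible list format).
export interface ModelList {
  data: Array<{ id: string }>;
}

// Hypothetical helper: return the first preferred ID that the live list contains.
export function pickModel(preferred: string[], list: ModelList): string | undefined {
  const available = new Set(list.data.map((m) => m.id));
  return preferred.find((id) => available.has(id));
}
```

With the list fetched as shown above, pickModel(['minimax-m2.7', 'gpt-5.4-nano'], list) returns the first ID still present in the catalog, or undefined when none of the preferences is available, which is a good moment to fail loudly at startup rather than on the first request.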

Routing & fallbacks

AllToken handles provider routing internally. Two knobs:

  • Account-level default routing — set routing_mode (code or manual), allowed_models, and a default_models priority list on each API key:
    POST   /api-account/user/api-keys
    PUT    /api-account/user/api-keys/{key_id}/default-models
    
  • Per-request override — pass the exact model ID in the request body to bypass routing for that call.

When a provider returns 502/503, AllToken may automatically fall back to the next provider for the model.
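
Server-side fallback works across providers of the same model. If every provider for that model fails, the caller still sees a 5xx, so a client-side fallback across models can be layered on top. The following is a sketch under assumptions: withModelFallback is a hypothetical helper, and it expects thrown errors to carry a numeric status property (the openai SDK's API errors expose one; other HTTP clients may not).

```typescript
// Read an HTTP status off an unknown error, if it carries one.
function statusOf(err: unknown): number | undefined {
  if (typeof err === 'object' && err !== null && 'status' in err) {
    const s = (err as { status?: unknown }).status;
    return typeof s === 'number' ? s : undefined;
  }
  return undefined;
}

// Hypothetical helper: try each model in order, moving on only after a 5xx.
export async function withModelFallback<T>(
  models: string[],
  call: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error('no models supplied');
  for (const model of models) {
    try {
      return await call(model);
    } catch (err) {
      const status = statusOf(err);
      // Errors without a status, and 4xx errors, surface immediately.
      if (status === undefined || status < 500) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```

A call might look like withModelFallback(['gpt-5.4', 'deepseek-v4-pro'], (model) => client.chat.completions.create({ model, messages })). Auth and balance errors (401, 402) are deliberately not retried on another model, since switching models would not fix them.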

Web search (enable_search)

Pass enable_search: true on a chat completion to opt into AllToken's unified web-search backend. Support is per-provider, not per-request shape — same flag, different effective behavior across model families. Live probe on 2026-05-12 (asking "current Bitcoin price"):

| Family | Outcome | Notes |
| --- | --- | --- |
| DeepSeek (deepseek-v3.2, deepseek-v4-pro) | ✅ Searches | Returns fresh prices with timestamps |
| Qwen (qwen3.6-flash, qwen3.6-max-preview) | ✅ Searches | Same fresh data via the unified backend |
| Claude (claude-opus-4-7, claude-sonnet-4-6) | ❌ Silently ignores | Model responds "I don't have web search" |
| GLM (glm-5, glm-5.1) | ❌ Silently ignores | Same as Claude |
| Kimi (kimi-k2.6) | ❌ Silently ignores | |
| Minimax (minimax-m2.7) | ❌ Silently ignores | |
| Gemini (gemini-3.1-pro-preview) | ⚠️ Empty / refusal | Inconsistent; re-test before relying |
| OpenAI (gpt-5.4, gpt-5.4-nano, gpt-5.5) | 🔴 HTTP 503 all_providers_failed | Upstream rejects the flag |

Recommendation: when you need search, default to a DeepSeek or Qwen model. If you're on a different family, fall back to a function-calling pattern (model emits a tool call → your tool hits a search API → you re-invoke). The enable_search matrix above is empirical and provider-side support may change — re-test for critical paths.

Note: AllToken does not include search-result citations in the response annotations[] field today, so detecting "did search fire" requires latency heuristics (typically +6 – 15 s vs no-search baseline) or content sniffing for fresh facts.
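
The recommendation above can be encoded as a small routing decision: use enable_search for families that were observed to honor it, and fall back to a tool call otherwise. This is a sketch only. The prefix list is a snapshot of the probe results, matching by ID prefix is an assumption (not every model sharing a prefix was tested), and searchStrategy is a hypothetical name.

```typescript
// Families observed to honor enable_search in the 2026-05-12 probe.
// This list is a snapshot, not an API contract; re-test before relying on it.
const SEARCH_PREFIXES = ['deepseek-', 'qwen'];

export type SearchStrategy = 'enable_search' | 'tool_call';

// Hypothetical helper: decide how to get web search for a given model ID.
export function searchStrategy(modelId: string): SearchStrategy {
  const id = modelId.toLowerCase();
  return SEARCH_PREFIXES.some((prefix) => id.startsWith(prefix)) ? 'enable_search' : 'tool_call';
}
```

When the result is 'tool_call', leave enableSearch off in AgentConfig and register a search tool through addTool instead, so the model requests the lookup through the normal tool loop.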

Health, rate limits, errors

  • Per-key rate limits: set rpm_limit, tpm_limit, monthly_quota, credit_limit when creating the key.
  • Status codes you should handle: 400 invalid params · 401 bad key · 402 insufficient balance · 403 forbidden · 404 not found · 429 rate limited (respect Retry-After) · 5xx upstream — already retried server-side when safe.
  • Error envelope (real wire format):
    {
      "error": {
        "code": "invalid_api_key",
        "message": "Invalid or revoked API key",
        "param": null,
        "type": "auth_error",
        "request_id": "d81itf8gdg1fp5ko4bjg"
      }
    }
    
    Note: code is a string slug (e.g. "invalid_api_key", "image_already_retrieved", "all_providers_failed"), not the numeric HTTP status. type groups errors (auth_error, invalid_request_error, api_error, …). Include request_id when filing support tickets.

Python error-dispatch helper:

import json, time, urllib.request, urllib.error

def call(req, retries=1):
    try:
        return urllib.request.urlopen(req, timeout=60)
    except urllib.error.HTTPError as e:
        body = e.read()
        try:
            err = json.loads(body).get("error", {})
        except Exception:
            err = {}
        retry_after = e.headers.get("Retry-After")     # integer seconds (AllToken format)
        if e.code == 429 and retry_after and retries > 0:
            time.sleep(int(retry_after))
            return call(req, retries - 1)               # bounded: one retry by default
        if e.code == 401:    raise RuntimeError(f"auth: {err.get('code')} — rotate API key")
        if e.code == 402:    raise RuntimeError(f"top up credits: {err.get('message')}")
        if e.code == 410 and err.get("code") == "image_already_retrieved":
            raise RuntimeError("re-submit; image was already delivered")
        if 500 <= e.code < 600 and err.get("code") == "all_providers_failed":
            raise RuntimeError("upstream — try fallback model or retry with jitter")
        raise RuntimeError(f"{e.code} {err.get('type')}/{err.get('code')}: {err.get('message')} [req={err.get('request_id')}]")

Retry-After is sent as integer seconds. Read it explicitly whenever it is present, and fall back to exponential backoff with jitter when the header is missing.
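One way to combine the two is a small delay function that prefers the header and otherwise uses capped exponential backoff with full jitter. This is a sketch under assumptions: retry_delay, base, cap, and the injectable rng are illustrative names and defaults, not part of AllToken.

```python
import random

def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to sleep before retry number `attempt` (0-based).

    Uses the Retry-After header value when it parses as integer seconds;
    otherwise falls back to exponential backoff with full jitter.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(int(retry_after)))
        except ValueError:
            pass  # malformed header: fall through to backoff
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling  # uniform in [0, ceiling)

# Header present: wait exactly what the server asked for.
print(retry_delay(0, retry_after="7"))   # 7.0
# Header missing: random wait below min(cap, base * 2**attempt).
print(retry_delay(3) < 8.0)              # True
```

Full jitter spreads retries from many clients across the whole window, which avoids synchronized retry bursts after a shared 429.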

  • Health dashboard: GET /api-account/health/summary (returns {"data": {...}} envelope) and /health/routes show live availability, p50/p95 latency, and incident routes — wire this into your runbook.
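A runbook check might start from a sketch like the following. The full URL is the one given in the Resources list; the assumption that the endpoint needs no Authorization header, and the helper names unwrap and fetch_health, are illustrative. The field names inside "data" are not documented here, so the code only unwraps the envelope.

```python
import json
import urllib.request

HEALTH_URL = "https://api.alltoken.ai/api-account/health/summary"

def unwrap(payload: bytes) -> dict:
    """Return the object inside the {"data": {...}} envelope."""
    doc = json.loads(payload)
    if not isinstance(doc, dict) or not isinstance(doc.get("data"), dict):
        raise ValueError("unexpected health response shape")
    return doc["data"]

def fetch_health(url: str = HEALTH_URL, timeout: float = 10.0) -> dict:
    """GET the health summary (assumed public, no auth) and unwrap it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return unwrap(resp.read())

# Usage (performs a network call):
#   summary = fetch_health()
#   print(sorted(summary))   # inspect which fields the summary exposes
```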

Cost tracking & budgets

Per-request cost: every chat response includes a usage block:

"usage": {
  "prompt_tokens": 13,
  "completion_tokens": 4,
  "total_tokens": 17,
  "prompt_tokens_details": { "cached_tokens": 0, "cache_creation_input_tokens": 0, "audio_tokens": 0 },
  "completion_tokens_details": { "reasoning_tokens": 0, "audio_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0 }
}

Capture usage from a streaming response: pass stream_options: {"include_usage": true}. The final data: chunk before data: [DONE] will have choices: [] and the populated usage. Without this option, usage is null on every streamed chunk.

stream = client.chat.completions.create(
    model="gpt-5.4-nano", messages=[...], stream=True,
    stream_options={"include_usage": True},
)
usage = None
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
    if chunk.usage is not None:
        usage = chunk.usage   # only present on the terminal chunk

Per-request cost telemetry (vendor extension): AllToken also emits one extra SSE comment line after data: [DONE] with a fiat-priced breakdown:

: {"cost":"0.0000188000","input_price":"0.0002000000","output_price":"0.0012500000","prompt_tokens":19,"completion_tokens":12}

Standard OpenAI SDKs drop comment lines (lines beginning with :), so this is invisible when using openai. To capture it, parse the raw SSE stream yourself and do not stop on [DONE]:

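# stdlib-only: captures both usage (from data: chunks) AND cost comment (post-DONE)
# URL, BODY and H are placeholders for your endpoint, JSON request bytes and headers.
import urllib.request, json
req = urllib.request.Request(URL, data=BODY, method="POST", headers=H)
r = urllib.request.urlopen(req)
saw_done = False
cost = usage = None
for raw in iter(r.readline, b""):
    line = raw.decode().rstrip("\r\n")
    if line.startswith(":"):
        try:
            cost = json.loads(line[1:])      # {"cost": "...", ...}
        except json.JSONDecodeError:
            pass                             # some other SSE comment, not the cost line
    elif line.startswith("data: "):
        data = line[6:]
        if data == "[DONE]":
            saw_done = True                  # keep reading: the cost comment follows
            continue
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]           # populated only on the terminal chunk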
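The sample cost line above can be cross-checked with a little arithmetic. The figures in it are consistent with input_price and output_price being quoted per 1,000 tokens; that unit is inferred from the sample, not stated by AllToken, so treat it as an assumption. The helper name recompute_cost is hypothetical.

```python
import json
from decimal import Decimal

def recompute_cost(comment_line: str):
    """Return (reported, recomputed) cost from a ': {...}' SSE cost comment.

    Assumes prices are per 1,000 tokens (inferred from the sample line).
    """
    p = json.loads(comment_line.lstrip(":").strip())
    recomputed = (
        Decimal(p["prompt_tokens"]) * Decimal(p["input_price"])
        + Decimal(p["completion_tokens"]) * Decimal(p["output_price"])
    ) / Decimal(1000)
    return Decimal(p["cost"]), recomputed

sample = (': {"cost":"0.0000188000","input_price":"0.0002000000",'
          '"output_price":"0.0012500000","prompt_tokens":19,"completion_tokens":12}')
reported, recomputed = recompute_cost(sample)
print(reported == recomputed)   # True
```

Decimal is used instead of float because the prices arrive as strings with ten decimal places, and summing many small float costs accumulates rounding error.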

Other useful response metadata: chat responses also carry top-level service_tier (e.g. "default") and x-gateway-request-id (use this when filing support tickets).

Account-wide totals: /api-account/user/{balance,billing,usage,billing/orders,...} exist but are not callable with the API key — they need the web-session token. Check balance and history in Settings → Billing on https://alltoken.ai, or top up via the same dashboard.

Extending the Agent

Custom hooks

const agent = createAgent({ model: 'minimax-m2.7' });

agent.on('message:user',      (m) => db.insert('user', m.content));
agent.on('message:assistant', (m) => db.insert('assistant', m.content));
agent.on('tool:call',         (name, args) => analytics.track('tool', { name, args }));
agent.on('error',             (err) => sentry.capture(err));

HTTP server (one agent per session)

import express from 'express';
import { createAgent, type Agent } from './agent.js';

const app = express(); app.use(express.json());
const sessions = new Map<string, Agent>();

app.post('/chat', async (req, res) => {
  const { sessionId, message } = req.body;
  let agent = sessions.get(sessionId);
  if (!agent) { agent = createAgent(); sessions.set(sessionId, agent); }
  res.json({ response: await agent.sendSync(message), history: agent.getMessages() });
});

app.listen(3000);

Agent API Reference

createAgent(config)

| Option | Type | Default | Description |
|---|---|---|---|
| apiKey | string | process.env.ALLTOKEN_API_KEY | AllToken API key |
| model | string | 'minimax-m2.7' | Model ID (see model discovery) |
| instructions | string | 'You are a helpful assistant.' | System prompt |
| tools | ToolHandler[] | [] | Function-calling tools |
| maxSteps | number | 5 | Max tool-loop iterations |
| temperature | number | 0.7 | Sampling temperature, 0–2 |
| enableSearch | boolean | false | AllToken enable_search extension |

Methods

| Method | Returns | Description |
|---|---|---|
| send(content) | Promise<string> | Streaming send + tool loop |
| sendSync(content) | Promise<string> | Non-streaming send |
| getMessages() | Message[] | Full conversation |
| clearHistory() | void | Reset (keeps system prompt) |
| setInstructions() | void | Update system prompt |
| addTool(tool) | void | Register tool at runtime |

Events

| Event | Payload | Notes |
|---|---|---|
| message:user | Message | |
| message:assistant | Message | Final turn (post tool loop) |
| stream:start | | |
| stream:delta | (delta, accumulated) | OpenAI-style token chunks |
| stream:end | fullText | |
| tool:call | (name, args, callId) | |
| tool:result | (name, result, callId) | |
| thinking:start | | |
| thinking:end | | |
| error | Error | |

Resources

Core API

Guides (one topic per page)

Live endpoints (callable now)

  • OpenAI-compatible model list: GET https://api.alltoken.ai/v1/models (Bearer)
  • Public catalog (no auth): GET https://api.alltoken.ai/api-account/models
  • Public health: GET https://api.alltoken.ai/api-account/health/summary
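A quick connectivity check against the model list might look like the sketch below. The URL and Bearer scheme come from the list above; the helper names and the assumption that the response follows the OpenAI list shape ({"data": [{"id": ...}, ...]}) are illustrative and should be verified against a real response.

```python
import json
import urllib.request

MODELS_URL = "https://api.alltoken.ai/v1/models"

def models_request(api_key: str) -> urllib.request.Request:
    """Build the authenticated GET request for the model list."""
    return urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )

def model_ids(payload: bytes) -> list:
    """Extract model IDs, assuming the OpenAI-style list shape."""
    return [m["id"] for m in json.loads(payload).get("data", [])]

# Usage (performs a network call; reads the key from the environment):
#   import os
#   req = models_request(os.environ["ALLTOKEN_API_KEY"])
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(model_ids(resp.read()))
```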

Account management: Settings → API Keys / Billing on https://alltoken.ai (web session required; not callable with your Bearer API key)

SDKs

Usage Guidance
This appears safe to install as an instruction-only bootstrap skill if you intend to build an AllToken project. Before using it, choose the target directory carefully, keep ALLTOKEN_API_KEY out of source control, review npm dependencies, and remember that smoke tests and generated apps may spend credits and send prompts to AllToken.
Capability Analysis
Type: OpenClaw Skill · Name: alltoken · Version: 1.0.0
The 'alltoken' skill is a comprehensive bootstrap template for building AI agents integrated with the alltoken.ai API. The provided code (SKILL.md) follows security best practices, such as implementing a custom shunting-yard algorithm for calculations to avoid using 'eval()' and recommending environment variables for API key management. No evidence of data exfiltration, malicious execution, or prompt injection was found; the skill functions as a legitimate scaffolding tool for developers.
Capability Tags
crypto · requires-oauth-token · requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The skill's stated purpose is to scaffold an AllToken client/agent project, and the documented capabilities—chat, image/video generation, model discovery, routing, and smoke testing—fit that purpose.
Instruction Scope
The user-facing manual says invocation produces project files and may run a smoke-test API call; this is disclosed and purpose-aligned, but users should invoke it only in a directory they intend to modify.
Install Mechanism
There is no executable install spec or bundled code, but the instructions tell the host agent/user to install npm packages and create source files. The packages are standard for the stated TypeScript/OpenAI SDK workflow, but versions are not pinned.
Credentials
The skill requires an AllToken API key and network access to api.alltoken.ai, which is proportionate for an AllToken integration. Registry metadata says no required env vars or primary credential even though the docs use ALLTOKEN_API_KEY, so users should not rely only on the registry summary.
Persistence & Privilege
The artifacts do not show hidden persistence, background startup, privileged file paths, or autonomous operation beyond user-directed project scaffolding and generated app scripts.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install alltoken
  3. After installation, invoke the skill by name or use /alltoken
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
alltoken 1.0.1
  • Initial release of the alltoken skill for creating modular AI agents.
  • OpenAI-compatible API routing for chat, image (async), and video (async) completion.
  • Supports use with Hermes, OpenClaw, Claude Code, Codex CLI, OpenCode, or any compatible runtime.
  • Includes standalone agent core, provider fallback, and optional terminal UI (Ink TUI).
  • Public endpoints for model discovery, provider health, and usage/leaderboard information.
Metadata
Slug alltoken
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is AllToken?

Bootstrap a modular AllToken agent — chat, async image+video, model routing, OpenAI-compatible SDK. Works inside Hermes, OpenClaw, Claude Code, Codex CLI, Op... It is an AI Agent Skill for Claude Code / OpenClaw, with 73 downloads so far.

How do I install AllToken?

Run "/install alltoken" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is AllToken free?

The skill itself is free and licensed under MIT-0: you can download, install, and use it at no cost. Calls that smoke tests and generated apps make to the AllToken API may spend credits.

Which platforms does AllToken support?

AllToken is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created AllToken?

It is built and maintained by AllToken (@alltoken-ai); the current version is v1.0.0.
