/install runapi-gpt
GPT on RunAPI
Use the official OpenAI SDK (Python, TypeScript, Ruby) — or any
OpenAI-compatible HTTP client — and switch the base URL to
https://runapi.ai/v1. The endpoints speak the standard OpenAI protocol:
Chat Completions (POST /v1/chat/completions) and the Responses API
(POST /v1/responses). No client code changes beyond base_url and api_key.
Setup
OPENAI_API_KEY=YOUR_RUNAPI_TOKEN
OPENAI_BASE_URL=https://runapi.ai/v1
Get a RunAPI API Key at \x3Chttps://runapi.ai/api_keys>.
| Language | Init |
|---|---|
| Python | OpenAI(api_key=..., base_url="https://runapi.ai/v1") |
| TypeScript | new OpenAI({ apiKey: ..., baseURL: "https://runapi.ai/v1" }) |
| Ruby | OpenAI::Client.new(access_token: ..., uri_base: "https://runapi.ai/v1") |
| curl | POST https://runapi.ai/v1/chat/completions (or /v1/responses) |
Pick the right endpoint
| Model | Endpoint to use |
|---|---|
gpt-5.2, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5.5, gpt-5.3-codex, gpt-5.3-codex-spark |
Chat Completions or Responses |
gpt-5.2-pro, gpt-5.4-pro, gpt-5.5-pro |
Responses only (per OpenAI) |
Core recipe — Chat Completions
from openai import OpenAI
client = OpenAI(api_key="YOUR_RUNAPI_TOKEN", base_url="https://runapi.ai/v1")
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Explain quantum computing simply."}],
reasoning_effort="high",
)
print(response.choices[0].message.content)
print(response.usage)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "YOUR_RUNAPI_TOKEN",
baseURL: "https://runapi.ai/v1",
});
const response = await client.chat.completions.create({
model: "gpt-5.4",
messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
Core recipe — Responses API
import httpx
response = httpx.post(
"https://runapi.ai/v1/responses",
headers={"x-api-key": "YOUR_RUNAPI_TOKEN"},
json={
"model": "gpt-5.4",
"input": "Explain the theory of relativity.",
"reasoning": {"effort": "medium"},
},
)
print(response.json())
The Responses API takes input (string or structured), reasoning.effort
("low" / "medium" / "high"), and optional include for thinking blocks.
Streaming
stream = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Write a haiku about coding."}],
stream=True,
)
for chunk in stream:
delta = chunk.choices[0].delta.content
if delta:
print(delta, end="", flush=True)
const stream = await client.chat.completions.create({
model: "gpt-5.4",
messages: [{ role: "user", content: "Write a haiku about coding." }],
stream: true,
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0].delta.content ?? "");
}
Streaming runs through a regional edge proxy so the request does not hold a Rails/Puma thread. Long generations should always stream.
Vision / multimodal
{
"model": "gpt-5.4",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "What is in this image?" },
{ "type": "image_url", "image_url": { "url": "https://example.com/img.jpg" } }
]
}
]
}
Standard OpenAI multimodal block — works on both Chat Completions and
Responses (Responses also accepts structured input items).
Tool use / function calling / web search
{
"model": "gpt-5.4",
"messages": [
{ "role": "user", "content": "Find the latest news on RunAPI." }
],
"tools": [
{ "type": "function", "function": { "name": "web_search" } }
]
}
web_search is supported across the GPT models above. Custom function tools
use the standard OpenAI tools schema.
List models
curl https://runapi.ai/v1/models -H "Authorization: Bearer YOUR_RUNAPI_TOKEN"
Returns OpenAI-compatible model objects. If the API Key has
allowed_models restrictions, only permitted models are returned.
Supported models
| Model ID | API | Use when |
|---|---|---|
gpt-5.5 |
Chat, Responses | Latest general model |
gpt-5.5-pro |
Responses only | Reasoning-heavy |
gpt-5.4 |
Chat, Responses | Production default |
gpt-5.4-mini |
Chat, Responses | Cost-optimized |
gpt-5.4-nano |
Chat, Responses | Smallest, fastest |
gpt-5.4-pro |
Responses only | Reasoning |
gpt-5.3-codex |
Chat, Responses | Code generation |
gpt-5.3-codex-spark |
Chat, Responses | Faster Codex variant |
gpt-5.2 |
Chat, Responses | Cost-effective |
gpt-5.2-pro |
Responses only | Reasoning |
Connect Codex CLI itself
export OPENAI_BASE_URL=https://runapi.ai/v1
export OPENAI_API_KEY=YOUR_RUNAPI_TOKEN
codex
Agent rules
- Pro models (
gpt-5.*-pro) reject Chat Completions — always use Responses for them. Other models accept either endpoint. - Use streaming for any response longer than a few hundred tokens. Do not hold the agent on a long blocking request.
reasoning_effortis supported on every GPT model above; default is usually"high"for non-Pro models.- Pricing, rate limits, quotas — link to \x3Chttps://runapi.ai/models/gpt.md>, not this skill file.
Routing
- Model page: \x3Chttps://runapi.ai/models/gpt.md>
- Provider page: \x3Chttps://runapi.ai/providers/openai.md>
- Catalog: \x3Chttps://runapi.ai/models.md>
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install runapi-gpt - After installation, invoke the skill by name or use
/runapi-gpt - Provide required inputs per the skill's parameter spec and get structured output
What is gpt?
Call the GPT API (gpt-5.2, gpt-5.4, gpt-5.4-mini, gpt-5.5, gpt-5.3-codex) through RunAPI using the official OpenAI SDK or any OpenAI-compatible client. Use w... It is an AI Agent Skill for Claude Code / OpenClaw, with 40 downloads so far.
How do I install gpt?
Run "/install runapi-gpt" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is gpt free?
Yes, gpt is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does gpt support?
gpt is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).
Who created gpt?
It is built and maintained by RunAPI (@runapi-ai); the current version is v0.2.4.