Description

Community skill (unofficial) for DigitalOcean Gradient AI Serverless Inference. Discover available models and pricing, run chat completions or the Responses...

README (SKILL.md)

🦞 Gradient AI — Serverless Inference

Name: Gradient Inference
Author: simondelorean

⚠️ This is an unofficial community skill, not maintained by DigitalOcean. Use at your own risk.

"Why manage GPUs when the ocean provides?" — ancient lobster proverb

Use DigitalOcean's Gradient Serverless Inference to call large language models without managing infrastructure. The API is OpenAI-compatible, so standard SDKs and patterns work — just point at https://inference.do-ai.run/v1 and swim.

Authentication

All inference requests need a Model Access Key, sent as a Bearer token in the Authorization header.

export GRADIENT_API_KEY="your-model-access-key"

Where to get one: DigitalOcean Console → Gradient AI → Model Access Keys → Create Key.
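
For callers working in Python rather than curl, the header can be built from the same environment variable. This is a minimal sketch; `auth_headers` is a hypothetical helper name and is not taken from the skill's scripts.

```python
import os


def auth_headers(env=None):
    """Build request headers from GRADIENT_API_KEY (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get("GRADIENT_API_KEY", "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError("GRADIENT_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing a dict as `env` makes the helper easy to exercise without touching the real environment.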

📖 Full auth docs

Tools

🔍 List Available Models

Window-shop for LLMs before you swipe the card.

python3 gradient_models.py                    # Pretty table
python3 gradient_models.py --json             # Machine-readable
python3 gradient_models.py --filter "llama"   # Search by name

Use this before hardcoding model IDs — models are added and deprecated over time.

Direct API call:

curl -s https://inference.do-ai.run/v1/models \
  -H "Authorization: Bearer $GRADIENT_API_KEY" | python3 -m json.tool
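
The same lookup can be sketched in Python with only the standard library. The response shape used here, a top-level `data` list of objects with an `id` field, is an assumption based on the OpenAI-compatible format; `fetch_models` and `filter_model_ids` are illustrative names, not functions from `gradient_models.py`.

```python
import json
import urllib.request

MODELS_URL = "https://inference.do-ai.run/v1/models"


def fetch_models(api_key):
    """GET /v1/models and return the parsed JSON body."""
    req = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def filter_model_ids(payload, query=""):
    """Return sorted model IDs containing `query` (case-insensitive).

    Assumes the OpenAI-style shape {"data": [{"id": "..."}]}.
    """
    ids = [m["id"] for m in payload.get("data", []) if "id" in m]
    q = query.lower()
    return sorted(i for i in ids if q in i.lower())
```

A call such as `filter_model_ids(fetch_models(key), "llama")` would roughly mirror the `--filter "llama"` option.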

📖 Models reference

💬 Chat Completions

The classic. Send structured messages (system/user/assistant roles), get a response. OpenAI-compatible, so you probably already know how this works.

python3 gradient_chat.py \
  --model "openai-gpt-oss-120b" \
  --system "You are a helpful assistant." \
  --prompt "Explain serverless inference in one paragraph."

# Different model
python3 gradient_chat.py \
  --model "llama3.3-70b-instruct" \
  --prompt "Write a haiku about cloud computing."

Direct API call:

curl -s https://inference.do-ai.run/v1/chat/completions \
  -H "Authorization: Bearer $GRADIENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-gpt-oss-120b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
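
A standard-library Python sketch of the same request follows. The helper names are hypothetical, and the reply path `choices[0].message.content` is assumed from the OpenAI-compatible format rather than confirmed against the Gradient docs.

```python
import json
import urllib.request

CHAT_URL = "https://inference.do-ai.run/v1/chat/completions"


def build_chat_payload(model, prompt, system=None, temperature=0.7, max_tokens=1000):
    """Assemble the same request body as the curl example."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def extract_reply(response):
    """Pull the assistant text out of an OpenAI-style response (assumed shape)."""
    return response["choices"][0]["message"]["content"]


def chat(api_key, payload):
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

Keeping payload construction separate from the network call makes the request body easy to inspect or log before anything is sent.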

📖 Chat Completions docs

⚡ Responses API (Recommended)

DigitalOcean's recommended endpoint for new integrations. Simpler request format and supports prompt caching — a.k.a. "stop paying twice for the same context."

# Basic usage
python3 gradient_chat.py \
  --model "openai-gpt-oss-120b" \
  --prompt "Summarize this earnings report." \
  --responses-api

# With prompt caching (saves cost on follow-up queries)
python3 gradient_chat.py \
  --model "openai-gpt-oss-120b" \
  --prompt "Now compare it to last quarter." \
  --responses-api --cache

Direct API call:

curl -s https://inference.do-ai.run/v1/responses \
  -H "Authorization: Bearer $GRADIENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai-gpt-oss-120b",
    "input": "Explain prompt caching.",
    "store": true
  }'

When to use which:

Feature	Chat Completions	Responses API
Request format	Array of messages with roles	Single `input` string
Prompt caching	❌	✅ via `store: true`
Multi-step tool use	Manual	Built-in
Best for	Structured conversations	Simple queries, cost savings
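
To make the difference in request format concrete, here is a small Python sketch of a Responses API body. Mapping a cache option to `store: true` follows the table above. The output parsing assumes the OpenAI-style `output` list of `message` items with `output_text` parts, which may not match what Gradient returns; both function names are hypothetical.

```python
def build_responses_payload(model, prompt, cache=False):
    """Build a Responses API body: a single `input` string, no messages array."""
    body = {"model": model, "input": prompt}
    if cache:
        # Per the table above, caching is requested with store: true.
        body["store"] = True
    return body


def extract_output_text(response):
    """Join all output_text parts from message items (assumed response shape)."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # Skip non-message items such as reasoning traces.
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)
```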

📖 Responses API docs

🖼️ Generate Images

Turn text prompts into images. Because sometimes a chart isn't enough.

python3 gradient_image.py --prompt "A lobster trading stocks on Wall Street"
python3 gradient_image.py --prompt "Sunset over the NYSE" --output sunset.png
python3 gradient_image.py --prompt "Fintech logo" --json

Direct API call:

curl -s https://inference.do-ai.run/v1/images/generations \
  -H "Authorization: Bearer $GRADIENT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A lobster analyzing candlestick charts",
    "n": 1
  }'
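
Once a response arrives, the image still has to be written to disk. The sketch below assumes the OpenAI-style field `data[0].b64_json`; the endpoint may instead return a URL or a different field name, so check the actual response before relying on it. `save_first_image` is a hypothetical helper, not the `save_image` function in `gradient_image.py`.

```python
import base64
from pathlib import Path


def save_first_image(response, output="image.png"):
    """Decode data[0].b64_json (assumed field) and write the bytes to `output`."""
    b64 = response["data"][0]["b64_json"]
    path = Path(output)
    path.write_bytes(base64.b64decode(b64))
    return path
```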

📖 Image generation docs

🧠 Model Selection Guide

Not all models are created equal. Choose wisely, young crustacean:

Model	Best For	Speed	Quality	Context
`openai-gpt-oss-120b`	Complex reasoning, analysis, writing	Medium	★★★★★	128K
`llama3.3-70b-instruct`	General tasks, instruction following	Fast	★★★★	128K
`deepseek-r1-distill-llama-70b`	Math, code, step-by-step reasoning	Slow	★★★★★	128K
`qwen3-32b`	Quick triage, short tasks	Fastest	★★★	32K

🦞 Pro tip: Cost-aware routing. Use a fast model (e.g., qwen3-32b) to score or triage, then only escalate to a strong model (e.g., openai-gpt-oss-120b) when depth is needed. Enable prompt caching for repeated context.
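
One way to sketch the routing idea in Python: `ask(model, prompt)` is a caller-supplied function that returns the model's text reply, and the 1 to 5 difficulty prompt and the threshold of 3 are illustrative choices, not part of the skill.

```python
from typing import Callable

FAST_MODEL = "qwen3-32b"
STRONG_MODEL = "openai-gpt-oss-120b"


def route(prompt: str, ask: Callable[[str, str], str], threshold: int = 3) -> str:
    """Triage with the fast model, escalate only when the score is high."""
    triage = ask(
        FAST_MODEL,
        "Rate the difficulty of this request from 1 (trivial) to 5 (hard). "
        "Reply with a single digit.\n\n" + prompt,
    )
    digits = [c for c in triage if c.isdigit()]
    # If the score cannot be parsed, escalate rather than risk a weak answer.
    score = int(digits[0]) if digits else 5
    model = STRONG_MODEL if score >= threshold else FAST_MODEL
    return ask(model, prompt)
```

Every request pays for one cheap triage call, so the pattern only saves money when a meaningful share of traffic stays on the fast model.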

Always run python3 gradient_models.py to check what's currently available — the menu changes.

📖 Available models

💰 Model Pricing Lookup

Check what models cost before you rack up a bill. Scrapes the official DigitalOcean pricing page — no API key needed.

python3 gradient_pricing.py                    # Pretty table
python3 gradient_pricing.py --json             # Machine-readable
python3 gradient_pricing.py --model "llama"    # Filter by model name
python3 gradient_pricing.py --no-cache         # Skip cache, fetch live

How it works:

Fetches live pricing from DigitalOcean's docs (public page, no auth)
Caches results for 24 hours in /tmp/gradient_pricing_cache.json
Falls back to a bundled snapshot if the live fetch fails
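
The three steps above can be sketched as a single function. This is not the code in `gradient_pricing.py`; `fetch_live` and `snapshot` are stand-ins for the scraper and the bundled data, and the cache file layout (`fetched_at`, `pricing`) is an assumption made for the example.

```python
import json
import time
from pathlib import Path

CACHE_PATH = Path("/tmp/gradient_pricing_cache.json")
CACHE_TTL_SECONDS = 24 * 60 * 60


def load_pricing(fetch_live, snapshot, cache_path=CACHE_PATH, use_cache=True, now=None):
    """Return pricing data: fresh cache, else live fetch, else bundled snapshot."""
    now = time.time() if now is None else now
    if use_cache and cache_path.exists():
        try:
            cached = json.loads(cache_path.read_text())
            if now - cached["fetched_at"] < CACHE_TTL_SECONDS:
                return cached["pricing"]
        except (ValueError, KeyError, TypeError):
            pass  # Corrupt cache: ignore it and refetch.
    try:
        pricing = fetch_live()
    except Exception:
        return snapshot  # Live fetch failed: fall back to the snapshot.
    cache_path.write_text(json.dumps({"fetched_at": now, "pricing": pricing}))
    return pricing
```

Setting `use_cache=False` corresponds to the `--no-cache` flag: the cache is not read, but a successful live fetch still refreshes it in this sketch.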

🦞 Pro tip: Run python3 gradient_pricing.py --model "gpt-oss" before choosing a model to see the cost difference between gpt-oss-120b ($0.10/$0.70) and gpt-oss-20b ($0.05/$0.45) per 1M tokens.
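
To turn per-1M-token prices into an estimate for a specific workload, a short calculation helps. The sketch assumes each price pair in the tip above is input price followed by output price, which the tip does not state explicitly; `request_cost` is a hypothetical helper.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimated cost in USD, given prices quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example workload: 200,000 input tokens and 50,000 output tokens.
cost_120b = request_cost(200_000, 50_000, 0.10, 0.70)
cost_20b = request_cost(200_000, 50_000, 0.05, 0.45)
print(f"gpt-oss-120b: ${cost_120b:.4f}  gpt-oss-20b: ${cost_20b:.4f}")
```

Under that assumption the workload costs about $0.055 on gpt-oss-120b and about $0.0325 on gpt-oss-20b.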

📖 Pricing docs

CLI Reference

All scripts accept --json for machine-readable output.

gradient_models.py   [--json] [--filter QUERY]
gradient_chat.py     --prompt TEXT [--model ID] [--system TEXT]
                     [--responses-api] [--cache] [--temperature F]
                     [--max-tokens N] [--json]
gradient_image.py    --prompt TEXT [--model ID] [--output PATH]
                     [--size WxH] [--json]
gradient_pricing.py  [--json] [--model QUERY] [--no-cache]

External Endpoints

Endpoint	Purpose
`https://inference.do-ai.run/v1/models`	List available models
`https://inference.do-ai.run/v1/chat/completions`	Chat Completions API
`https://inference.do-ai.run/v1/responses`	Responses API (recommended)
`https://inference.do-ai.run/v1/images/generations`	Image generation
`https://docs.digitalocean.com/.../pricing/`	Pricing page (scraped, public)

Security & Privacy

All inference requests go to inference.do-ai.run, DigitalOcean's own endpoint; the pricing lookup only reads the public docs page on docs.digitalocean.com
Your GRADIENT_API_KEY is sent as a Bearer token in the Authorization header
No other credentials or local data leave the machine
Model Access Keys are scoped to inference only — they can't manage your DO account
Prompt caching entries are scoped to your account and automatically expire

Trust Statement

By using this skill, prompts and data are sent to DigitalOcean's Gradient Inference API. Only install if you trust DigitalOcean with the content you send to their LLMs.

Important Notes

Run python3 gradient_models.py before assuming a model exists — they rotate
All scripts exit with code 1 and print errors to stderr on failure
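
Because failures are signalled through the exit code and stderr, a calling program can wrap any of the scripts in a uniform way. A minimal sketch, assuming the script is invoked with `--json` and prints a single JSON document to stdout; `run_json` is a hypothetical helper.

```python
import json
import subprocess
import sys


def run_json(cmd):
    """Run a command, return its parsed JSON stdout, raise with stderr on failure."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(f"{cmd[0]} exited {proc.returncode}: {proc.stderr.strip()}")
    return json.loads(proc.stdout)
```

For example, `run_json([sys.executable, "gradient_models.py", "--json"])` would either return the parsed model list or raise with the script's error message.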

Usage Guidance

This is an unofficial community client for DigitalOcean Gradient. It appears coherent and limited to its stated purpose, but review these before installing:

(1) Only provide a Gradient Model Access Key; do not reuse higher-privilege or unrelated credentials. Consider creating a limited model-access key in the DigitalOcean console.
(2) The pricing tool scrapes the public docs and writes a cache to /tmp; if you run this in a shared environment, be aware of the cache file.
(3) The code depends on requests and beautifulsoup4; install vetted packages from PyPI.
(4) Because this is community-maintained (not official), audit the code if you plan to run it in a sensitive environment or grant it autonomous invocation.

If you only need basic inference, running the included scripts locally with a dedicated model-access key is a low-risk option.

Capability Analysis

Type: OpenClaw Skill
Name: gradient-inference
Version: 0.1.3

The skill bundle is benign. It provides tools to interact with DigitalOcean's Gradient AI Serverless Inference API for chat completions, image generation, and model discovery, as well as scraping public pricing information. All network requests are directed to official DigitalOcean domains (`inference.do-ai.run`, `docs.digitalocean.com`). File system operations, such as saving generated images in `scripts/gradient_image.py`, include robust path traversal protection. Caching of pricing data in `scripts/gradient_pricing.py` uses the standard `/tmp` directory for non-sensitive information. There is no evidence of malicious intent, unauthorized data exfiltration, persistence mechanisms, or prompt injection against the agent in SKILL.md.

Capability Assessment

✓ Purpose & Capability

Name/description match the implemented functionality. Required binary (python3) and primary env var (GRADIENT_API_KEY) are appropriate for calling the Gradient inference endpoints. Declared pip deps (requests, beautifulsoup4) align with the included scripts.

✓ Instruction Scope

SKILL.md and the scripts only instruct the agent to call the documented Gradient endpoints (inference.do-ai.run) or to scrape the public pricing docs. The scripts access only the declared env var (GRADIENT_API_KEY) for inference calls. The pricing script scrapes a public docs page and falls back to a bundled snapshot; it writes a cache to /tmp and reads the bundled JSON from the repo.

✓ Install Mechanism

There is no remote install/download step; this is an instruction/code-only skill. The pip packages referenced are standard (requests, beautifulsoup4). No installer downloads arbitrary archives or executes code from unknown URLs.

✓ Credentials

Only one sensitive credential is required: GRADIENT_API_KEY (the Model Access Key) — which is exactly what an inference client needs. The scripts do not request unrelated secrets or config paths.

✓ Persistence & Privilege

always:false (normal). The skill writes a cache file to /tmp and may save generated images to the current working directory (the image saver includes a path-traversal check). It does not modify other skills or system-wide agent settings.

Version History

v0.1.3

Fix path traversal vulnerability in save_image (VirusTotal finding)

v0.1.2

Declared pip dependencies (requests, beautifulsoup4) per security scan feedback

v0.1.1

Added tags; republished to unstick the security scan

v0.1.0

Initial release: model listing, chat completions, Responses API, image generation, and pricing lookup for DigitalOcean Gradient AI.

Metadata

Slug gradient-inference

Version 0.1.3

License —

All-time Installs 0

Active Installs 0

Total Versions 4

Frequently Asked Questions

What is Gradient Inference?

Community skill (unofficial) for DigitalOcean Gradient AI Serverless Inference. Discover available models and pricing, run chat completions or the Responses... It is an AI Agent Skill for Claude Code / OpenClaw, with 727 downloads so far.

How do I install Gradient Inference?

Run "/install gradient-inference" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Gradient Inference free?

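Yes, the skill itself is free and open-source: you can download, install, and use it at no cost. Model usage is billed by DigitalOcean, so check the Model Pricing Lookup section before running large workloads.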

Which platforms does Gradient Inference support?

Gradient Inference is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Gradient Inference?

It is built and maintained by Simon DeLorean (@simondelorean); the current version is v0.1.3.
