Description

Turn a 200-word product description script into 1080p text-driven videos just by typing what you need. Whether it's converting written scripts into finished...

Usage Instructions (SKILL.md)

Getting Started

Ready when you are. Drop your text prompts here or describe what you want to make.

Try saying:

"turn a 200-word product description script into a 1080p MP4"
"turn this script into a 30-second video with visuals and captions"
"convert my written script into a finished video for marketers"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

1. Obtain a free token: Generate a random UUID as the client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the header X-Client-Id set to that UUID. The response's data.token is your NEMO_TOKEN (100 free credits, valid for 7 days).
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
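
The two setup steps above can be sketched as plain request descriptions. This is a minimal sketch, not the skill's own implementation: the function names and the dictionary layout are hypothetical, and nothing here sends traffic. Only the URLs, headers, and body fields come from the steps above.

```python
import json
import os
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id=None):
    """Describe the POST that obtains a free anonymous token."""
    # A random UUID serves as the client identifier when none is supplied.
    client_id = client_id or str(uuid.uuid4())
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def session_request(token, language):
    """Describe the POST that creates a session for the given token."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }


def first_connection_plan(env=None):
    """List the setup steps still needed; skip the token step if NEMO_TOKEN is set."""
    env = os.environ if env is None else env
    steps = []
    if not env.get("NEMO_TOKEN"):
        steps.append("anonymous-token")
    steps.append("create-session")
    return steps
```

Keeping the request descriptions separate from the transport makes it easy to check headers and bodies before any HTTP client is involved.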

Keep setup communication brief. Don't display raw API responses or token values to the user.

AI Text Generator — Generate Videos From Text

Name: Ai Text Generator
Author: linmillsd7

Drop your text prompts in the chat and tell me what you need. I'll handle the AI text-to-video generation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send a 200-word product description script, ask me to turn it into a 30-second video with visuals and captions, and about 1-2 minutes later you have an MP4 file ready to download. Everything renders at 1080p by default.

One thing worth knowing — shorter scripts under 150 words produce faster and more accurate results.

Matching Input to Actions

User prompts referencing ai text generator, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
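
The routing table can be approximated with a simple keyword lookup. This is only a sketch: the table mentions keyword and intent classification, while the code below uses plain substring matching, and the names ROUTES and route are hypothetical.

```python
# Keyword groups taken from the routing table; each maps to an action name.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A message that carries a file is treated as an upload.
        return ("upload", True)
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM) goes through the SSE chat.
    return ("sse", False)
```

Substring matching will misfire on messages that mention a keyword in passing, so a real router would likely need a stricter intent check on top of this.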

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

Session — POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
Chat (SSE) — POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
Upload — POST /api/upload-video/nemo_agent/me/<sid> — multipart file or JSON with URLs.
Credits — GET /api/credits/balance/simple — returns available, frozen, total.
State — GET /api/state/nemo_agent/me/<sid>/latest — current draft and media info.
Export — POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
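
The export polling step might look like the loop below. It is a sketch under stated assumptions: the response fields "status" and "url" are guesses, since the list above only mentions a completed status and a download URL, and the 30-poll cap and the fetch_status callable are hypothetical. The caller supplies fetch_status, which would issue the GET against /api/render/proxy/lambda/<id>.

```python
import time


def poll_render(render_id, fetch_status, interval_s=30, max_polls=30, sleep=time.sleep):
    """Poll until the render reports completion; return its URL or None.

    fetch_status is a caller-supplied callable that performs the GET and
    returns the parsed response as a dict (field names are assumed).
    """
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result.get("url")
        # Wait between polls, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    return None
```

Injecting fetch_status and sleep keeps the loop testable without a network connection or real delays.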

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: ai-text-generator
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
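
A minimal sketch of assembling these headers follows. The function names are hypothetical, and the version string is passed in rather than parsed from the frontmatter, which this example does not attempt.

```python
def detect_platform(install_path):
    """Map an install path to the X-Skill-Platform value."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def attribution_headers(token, version, install_path):
    """Build the headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-text-generator",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

For example, attribution_headers(token, "1.0.0", "~/.cursor/skills/ai-text-generator/SKILL.md") yields X-Skill-Platform: cursor.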

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
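
A rough sketch of producing such a summary from the short keys is shown below. The nesting is an assumption: it supposes that t is a list of tracks, each with tt and a list sg of segments that each carry d in milliseconds. Clip names, start offsets, and volume levels are not covered by the documented keys, so this version reports only track type and total duration.

```python
# Track type codes as documented: 0=video, 1=audio, 7=text.
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft):
    """Build a one-line timeline summary from a draft dict (assumed layout)."""
    tracks = draft.get("t", [])
    parts = []
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        # Sum segment durations (ms) and report the total in seconds.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        parts.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return f"Timeline ({len(tracks)} tracks): " + " ".join(parts)
```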

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"Export" or "导出" → run the export workflow

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
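
The stream handling described above can be sketched as a small line classifier. It makes several assumptions: payloads are kept as raw strings because the event body format is not specified here, SSE comment lines (those starting with a colon) are counted as heartbeats, and the name read_sse and the shape of its result are hypothetical.

```python
def read_sse(lines):
    """Classify raw SSE lines into data payloads and heartbeats."""
    payloads = []
    heartbeats = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            # Comment line, assumed here to be a keep-alive heartbeat.
            heartbeats += 1
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data:
                payloads.append(data)
            else:
                # An empty data: line means the backend is still working.
                heartbeats += 1
    return {
        "payloads": payloads,
        "heartbeats": heartbeats,
        # No payload at all: fall back to polling the state endpoint.
        "needs_state_poll": not payloads,
    }
```

When needs_state_poll is true, the caller would fetch the latest state and describe the timeline change to the user instead of relaying stream text.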

Error Codes

0 — success, continue normally
1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
1002 — session not found; create a new one
2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
4001 — unsupported file type; show accepted formats
4002 — file too large; suggest compressing or trimming
400 — missing X-Client-Id; generate one and retry
402 — free plan export blocked; not a credit issue, subscription tier
429 — rate limited; wait 30s and retry once
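
One way to act on these codes is a lookup table, sketched below. The action names are hypothetical labels for the behaviours listed above, and giving up after a second 429 is one reading of "retry once".

```python
# Hypothetical action labels keyed by the documented codes.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "subscription_required",
    429: "wait_30s_and_retry_once",
}


def next_action(code, retried=False):
    """Pick the follow-up action for a response code."""
    if code == 429 and retried:
        # The single allowed retry has already been used.
        return "give_up"
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```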

Common Workflows

Quick edit: Upload → "turn this script into a 30-second video with visuals and captions" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn this script into a 30-second video with visuals and captions" — concrete instructions get better results.

Max file size is 200 MB. Stick to TXT, DOCX, PDF, or MP4 for the smoothest experience.

Export as MP4 for widest compatibility.

Security Recommendations

This skill appears to do what it says: it will call an external API (https://mega-api-prod.nemovideo.ai) to obtain an anonymous token (if you don't provide NEMO_TOKEN), create a session, upload files, and return render URLs. Before installing or using it, consider: (1) If you care about data privacy, review the external service's privacy/terms since uploads (videos, scripts) go to that endpoint. (2) Provide your own NEMO_TOKEN if you prefer not to let the skill request an anonymous token on your behalf. (3) The skill may create or write to ~/.config/nemovideo/ to persist session/token — check that directory if you want to audit stored tokens or remove them later. (4) The skill reads its own SKILL.md frontmatter and checks common install paths to populate an attribution header; this requires read access to those paths but not to unrelated secrets. (5) If you are uncomfortable with automatic network calls or persistent tokens, avoid using the skill or only run it with explicit, temporary tokens and monitor network activity. Overall the behavior is coherent with its purpose, but verify the external service and the local config files it creates before trusting sensitive content.

Functional Analysis

Type: OpenClaw Skill
Name: ai-text-generator
Version: 1.0.0

The skill provides instructions for an AI agent to interface with the NemoVideo AI service (mega-api-prod.nemovideo.ai) to generate videos from text. It includes standard procedures for authentication, session management, and handling Server-Sent Events (SSE) for real-time updates. The logic is well-documented and aligns with the stated purpose, with no indicators of data exfiltration, malicious execution, or harmful prompt injection.

Capability Assessment

✓ Purpose & Capability

The skill's name/description (text-to-video) aligns with the required environment variable (NEMO_TOKEN) and the runtime flow (authenticate, create session, upload, export). The metadata's config path (~/.config/nemovideo/) is plausible for persisting tokens/sessions.

ℹ Instruction Scope

Runtime instructions direct the agent to obtain or use a NEMO_TOKEN, create a session, stream SSE, upload user files, poll render status, and include attribution headers. They also instruct the agent to read this file's frontmatter and detect install-paths to set X-Skill-Platform — reading those local paths is broader than strictly necessary but explainable for attribution/persistence. The instructions explicitly tell the agent not to display raw tokens, which is reasonable for UX but worth noting because it reduces visibility into token use.

✓ Install Mechanism

No install spec or downloaded code — instruction-only skill. This is the lowest-risk install model; nothing will be written to disk by an installer. The only on-disk interaction implied is persisting session/token under the declared config path.

✓ Credentials

Only a single service credential (NEMO_TOKEN) is required and is clearly the primary credential for the described cloud API. No unrelated secrets or multiple external credentials are requested.

ℹ Persistence & Privilege

The skill requests a config path (~/.config/nemovideo/) to store session/token state and will persist session IDs for job polling — this is consistent with its purpose but does give it local storage persistence. always:false (not force-included) and it does not request system-wide privileges or other skills' configs.

Version History

v1.0.0

AI Text Generator — Generate Videos From Text is now live!
- Instantly turn text prompts or scripts into 1080p text-driven videos, ready to download in 1-2 minutes.
- Automatic session setup and authentication with free credits for new users.
- Simple cloud-based workflow: just type what you need, with no timeline editing or export settings required.
- Supports script conversion, aspect ratio, captions, audio, and export via intent-matched commands.
- Handles quick edits, batch conversions, and iterative previews with session state.
- Clear guidance on supported formats, error codes, and workflow tips for a smooth experience.

Metadata

Slug ai-text-generator

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

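FAQ

What is Ai Text Generator?

Turn a 200-word product description script into 1080p text-driven videos just by typing what you need. Whether it's converting written scripts into finished... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 121 times in total so far.

How do I install Ai Text Generator?

Run the command "/install ai-text-generator" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.

Is Ai Text Generator free?

Yes. Ai Text Generator is completely free and released under the MIT-0 license, so you can download, install, and use it freely.

Which platforms does Ai Text Generator support?

Ai Text Generator is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Ai Text Generator?

It is developed and maintained by linmillsd7 (@linmillsd7); the current version is v1.0.0.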
