Description

Type a scene, a story, or a script — and watch it become a video in seconds. This free-ai-text-to-video-generator takes your written ideas and transforms the...

README (SKILL.md)

Getting Started

Welcome! You're one text prompt away from a finished video — just describe your scene, topic, or story and this free AI text to video generator will bring it to life. Ready to create your first video? Type your idea below and let's get started.

Try saying:

"Generate a 30-second promotional video for a handmade candle brand using warm, cozy visuals and soft background music"
"Create an animated explainer video showing how a water filtration system works, written for a general audience"
"Turn this blog post intro into a short social media video with bold text overlays and upbeat pacing: 'Five habits that changed my mornings forever...'"

Quick Start Setup

This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").

Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:

1. Generate a UUID as the client identifier.
2. POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header.
3. Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry).
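
The token check above could be sketched in Python roughly as follows. This is a minimal sketch, not the skill's actual implementation: the helper names and the `send` hook are hypothetical, and it assumes the endpoint accepts a POST with no body and returns JSON with the token at data.token.

```python
import json
import os
import urllib.request
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token POST request."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def extract_token(response_body: str) -> str:
    """Read data.token from the JSON response body."""
    return json.loads(response_body)["data"]["token"]


def get_token(send=None) -> str:
    """Prefer NEMO_TOKEN from the environment; otherwise request an anonymous token.

    `send` is a hypothetical hook that takes the request and returns the
    response body as text, so the flow can be exercised without a network call.
    """
    token = os.environ.get("NEMO_TOKEN")
    if token:
        return token
    request = build_anonymous_token_request(str(uuid.uuid4()))
    if send is not None:
        return extract_token(send(request))
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_token(response.read().decode("utf-8"))
```

Calling `get_token()` with no arguments performs the real network request only when NEMO_TOKEN is absent from the environment.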

Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
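
As an illustration, the session request might be built like this in Python. The function names are hypothetical, the Content-Type header is an assumption, and the location of session_id in the response (top level or nested under "data") is a guess, since the source only says that it is returned. The attribution headers required elsewhere in this README are left out for brevity.

```python
import json
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_session_request(token: str, task_name: str = "project") -> urllib.request.Request:
    """Build (but do not send) the create-session POST request."""
    body = json.dumps({"task_name": task_name}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption: JSON body
        },
    )


def extract_session_id(response_body: str) -> str:
    """Return session_id from the response, wherever it is assumed to live."""
    payload = json.loads(response_body)
    nested = payload.get("data")
    # Assumption: the field is either nested under "data" or at the top level.
    container = nested if isinstance(nested, dict) else payload
    return container["session_id"]
```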

Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.

From Plain Text to Polished Video — No Camera Needed

Name: Free Ai Text To Video Generator
Author: dsewell-583h0

Most people have ideas worth sharing but lack the time, budget, or technical skills to produce video content. That's exactly the gap this skill was built to close. By simply describing what you want — a product showcase, an explainer clip, a social media reel, or an animated story — you get a fully rendered video without touching a single editing tool.

The free AI text to video generator interprets your written prompt and assembles visuals, motion, and pacing that match your intent. Whether you're describing a futuristic cityscape, a step-by-step tutorial, or a heartfelt brand message, the output reflects your words with surprising accuracy and creative flair.

This is especially useful for teams running lean operations, solo creators building a content library, or educators who need visual aids without a production budget. You write, the AI builds — it's that straightforward. No storyboard, no stock footage hunting, no rendering queues to manage manually.

Prompt Routing and Video Dispatch

When you submit a text prompt, ClawHub parses your generation request and routes it to the optimal AI video synthesis engine based on prompt complexity, style tags, and current model availability.

| User says... | Action | Skip SSE? |
| --- | --- | --- |
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
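
One way to express the routing table in code is a keyword lookup with an SSE fallback. The sketch below assumes simple case-insensitive substring matching, which the source does not specify; a real router would likely need something stricter, since a prompt such as "a video about status symbols" would be misrouted here. The `route` function and the action labels are hypothetical.

```python
from typing import Tuple

# Keyword groups in the same order as the routing table.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> Tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A file attachment always goes to the upload path.
        return ("upload", True)
    lowered = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in lowered for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```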

Cloud Rendering API Reference

The free AI text to video generator runs on a distributed cloud rendering backend that queues your prompt, tokenizes scene descriptions, and streams back rendered frames via asynchronous API calls. Diffusion model inference and frame interpolation happen server-side, so no local GPU is required.

Skill attribution — read from this file's YAML frontmatter at runtime:

X-Skill-Source: free-ai-text-to-video-generator
X-Skill-Version: from frontmatter version
X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
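
A possible way to assemble these headers is sketched below. The function names are hypothetical, and the version is passed in as a plain string on the assumption that the caller has already read it from the YAML frontmatter; frontmatter parsing is not shown.

```python
from pathlib import Path


def detect_platform(install_path: str) -> str:
    """Map the skill's install path to a platform label."""
    path = Path(install_path).expanduser()
    home = Path.home()
    if home / ".clawhub" in path.parents:
        return "clawhub"
    if home / ".cursor" / "skills" in path.parents:
        return "cursor"
    return "unknown"


def attribution_headers(token: str, version: str, install_path: str) -> dict:
    """Build the headers that every request is expected to carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "free-ai-text-to-video-generator",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Merging this dictionary into each request's headers keeps the attribution in one place, which matters because a missing header is reported only at export time.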

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id.

Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and header Accept: text/event-stream. Max timeout: 15 minutes.
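
The request body and the stream reading could look roughly like this. It is a simplified sketch: the helper names are hypothetical, and the parser treats every `data:` line as its own payload rather than joining multi-line events as the full SSE format allows.

```python
SSE_TIMEOUT_SECONDS = 15 * 60  # the 15-minute maximum stated above
SSE_HEADERS = {"Accept": "text/event-stream"}


def build_run_sse_body(session_id: str, message: str) -> dict:
    """Build the JSON body for POST /run_sse."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }


def iter_sse_data(lines):
    """Yield the payload of each `data:` line from an iterable of decoded lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            yield line[len("data:"):].strip()
```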

Upload: POST /api/upload-video/nemo_agent/me/<sid> with either a multipart file (-F "files=@/path") or a URL body: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media
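
For illustration, the key fields can be pulled out of a parsed state response defensively, so that a missing level yields None instead of an exception. The helper names are hypothetical; only the URL pattern and the three field paths come from the line above.

```python
API_BASE = "https://mega-api-prod.nemovideo.ai"


def state_url(session_id: str) -> str:
    """Return the latest-state URL for a session."""
    return f"{API_BASE}/api/state/nemo_agent/me/{session_id}/latest"


def extract_state_fields(response: dict) -> dict:
    """Pick draft, video_infos and generated_media out of data.state."""
    state = (response.get("data") or {}).get("state") or {}
    return {
        "draft": state.get("draft"),
        "video_infos": state.get("video_infos"),
        "generated_media": state.get("generated_media"),
    }
```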

Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
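
The export-and-poll loop might be sketched as follows. Several details are assumptions: `<ts>` is taken to be a Unix timestamp in seconds, status and output.url are read from the top level of the poll response, and the `max_polls` cap is an invented safeguard. `fetch_status` is a hypothetical callable that performs the GET and returns parsed JSON.

```python
import time


def build_export_body(session_id: str, draft: dict, timestamp=None) -> dict:
    """Build the JSON body for POST /api/render/proxy/lambda."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_export(fetch_status, render_id: str, interval: float = 30.0,
                max_polls: int = 60, sleep=time.sleep) -> str:
    """Poll until status is 'completed', then return the download URL."""
    for _ in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        sleep(interval)  # wait before the next poll
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Passing `sleep` and `fetch_status` as parameters keeps the loop testable without waiting 30 seconds per iteration.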

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
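
A client could check a file's extension against this list before uploading, which avoids a round trip that would end in an unsupported-file error. The sketch below is illustrative; the helper names are hypothetical, and matching on the extension alone is an assumption about how the backend decides support.

```python
from pathlib import Path

SUPPORTED_FORMATS = {
    "mp4", "mov", "avi", "webm", "mkv",  # video
    "jpg", "png", "gif", "webp",         # image
    "mp3", "wav", "m4a", "aac",          # audio
}


def is_supported(filename: str) -> bool:
    """Check the file extension, case-insensitively, against the list."""
    return Path(filename).suffix.lower().lstrip(".") in SUPPORTED_FORMATS


def build_url_upload_body(urls) -> dict:
    """Build the JSON body for a URL-based upload."""
    return {"urls": list(urls), "source_type": "url"}
```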

SSE Event Handling

| Event | Action |
| --- | --- |
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
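
The event handling and the silent-edit fallback could be approximated as below. The event shape is an assumption: the sketch guesses that text arrives as JSON with content.parts[].text, mirroring the request body, and treats anything else as internal. All names are hypothetical.

```python
import json


def _text_parts(event) -> list:
    """Return the text strings found in an assumed content.parts list."""
    if not isinstance(event, dict):
        return []
    parts = (event.get("content") or {}).get("parts") or []
    return [p["text"] for p in parts if isinstance(p, dict) and p.get("text")]


def classify_event(data: str) -> str:
    """Return 'heartbeat', 'text', or 'internal' for one SSE data payload."""
    if data.strip() in ("", "heartbeat"):
        return "heartbeat"
    try:
        event = json.loads(data)
    except json.JSONDecodeError:
        return "internal"
    return "text" if _text_parts(event) else "internal"


def collect_text(payloads) -> list:
    """Gather user-facing text from a finished stream."""
    texts = []
    for data in payloads:
        if classify_event(data) == "text":
            texts.extend(_text_parts(json.loads(data)))
    return texts


def due_for_progress_note(last_note_at: float, now: float, every: float = 120.0) -> bool:
    """True when two minutes have passed since the last 'Still working' note."""
    return now - last_note_at >= every
```

An empty list from `collect_text` is the signal to poll session state and summarize the changes instead of relaying a text response.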

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
| --- | --- |
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
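
A track summary in this style could be derived from the draft using the field mapping. The nesting is an assumption, since the source gives only the abbreviations: the sketch supposes a draft shaped like {"t": [{"tt": 0, "sg": [{"d": 10000}]}]} and sums segment durations per track. Clip names, start offsets, and volume are not covered by the mapping, so they are left out.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_timeline(draft: dict) -> str:
    """Render a short per-track summary from an assumed draft layout."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "Unknown")
        segments = track.get("sg", [])
        # d is in milliseconds; report the track total in seconds.
        total_seconds = sum(segment.get("d", 0) for segment in segments) / 1000
        lines.append(f"{index}. {kind}: {len(segments)} segment(s), {total_seconds:g}s")
    return "\n".join(lines)
```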

Error Handling

| Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
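
The table lends itself to a lookup keyed by code. In the sketch below the action labels, the fallback for unknown codes, and the `call` hook (a hypothetical callable returning a (code, payload) pair) are all illustrative rather than part of the API.

```python
import time

ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_new_session",
    2001: "prompt_for_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Look up the handling for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(call, sleep=time.sleep):
    """Run `call`; on 429, wait 30 seconds and retry exactly once."""
    code, payload = call()
    if code == 429:
        sleep(30)
        code, payload = call()
    return code, payload
```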

Performance Notes

Video quality and accuracy improve significantly when your text prompt includes specific details. Vague prompts like 'make a video about nature' produce generic results, while prompts like 'a slow-motion aerial shot of a pine forest at golden hour with ambient wind sounds' give the generator clear creative direction.

For best results, specify the intended platform (YouTube, Instagram, LinkedIn), the desired length, the visual tone (cinematic, minimalist, bold), and any text overlays or voiceover style you want included. The more context you provide, the closer the output matches your vision.

Processing time varies based on video length and complexity. Short clips under 30 seconds typically generate faster than multi-scene productions. If you're generating multiple videos in a session, spacing out requests helps maintain consistent output quality across all of them.

Common Workflows

The most popular use case is social media content creation — users paste a caption or short script and request a vertical video formatted for Instagram Reels or TikTok. The generator handles aspect ratio, pacing, and visual style based on the tone of the text.

Another frequent workflow is educational content. Teachers and course creators describe a concept — like 'explain photosynthesis for 10-year-olds with colorful animations' — and receive a structured explainer video ready to embed in a lesson.

Small business owners often use it for product launches. Instead of hiring a videographer, they write a product description with key selling points and request a 60-second showcase video. The result can be posted directly to e-commerce pages or ad platforms.

For longer-form needs, users break their content into scenes — describing each one separately — then request them stitched into a single narrative video. This scene-by-scene approach gives more creative control over the final output.

Usage Guidance

This skill appears to do what it says (connect to a NemoVideo backend and render videos) but has small inconsistencies and an unknown publisher. Before installing: (1) prefer skills with a known homepage/publisher; (2) do not put a high-privilege or long-lived NEMO_TOKEN into your environment unless you trust the publisher — consider creating a limited-scope token; (3) be aware the skill will contact https://mega-api-prod.nemovideo.ai and may generate/use an anonymous token if none is provided; (4) avoid uploading sensitive files to the skill/backend; and (5) ask the publisher to clarify the mismatched metadata (declared config path vs registry data and whether NEMO_TOKEN is actually required). If you need higher assurance, request source code or an official release link before use.

Capability Analysis

Type: OpenClaw Skill
Name: free-ai-text-to-video-generator
Version: 1.0.0

The skill provides a functional integration for an AI video generation service hosted at mega-api-prod.nemovideo.ai. It includes detailed instructions for the agent to handle authentication via anonymous tokens, session management, and video rendering workflows. The environment variable (NEMO_TOKEN) and configuration path (~/.config/nemovideo/) are specific to the skill's stated purpose, and there is no evidence of data exfiltration, malicious execution, or harmful prompt injection.

Capability Assessment

ℹ Purpose & Capability

Name/description align with the actions described (creating sessions, sending prompts, uploading media to a video-rendering backend). Requesting a single service token (NEMO_TOKEN) is proportionate for a cloud-rendering integration. However, the registry metadata earlier said 'no config paths' while the skill's frontmatter in SKILL.md declares a config path (~/.config/nemovideo/) — an unexplained mismatch.

⚠ Instruction Scope

SKILL.md instructs the agent to look for NEMO_TOKEN, but also describes generating an anonymous token by POSTing to an external API if none is present (the skill will thus reach out to mega-api-prod.nemovideo.ai). It also tells the agent to read its own YAML frontmatter and to detect install path (~/.clawhub, ~/.cursor/skills/) to set an attribution header. Those file/path checks are within scope for attribution but are explicit extra filesystem checks. The skill will accept and upload user files (expected) — users should be careful not to upload sensitive files. Overall the instructions are detailed and focused on the stated purpose, but the dual behavior around NEMO_TOKEN (required in registry but optional in instructions) is inconsistent.

✓ Install Mechanism

No install spec or code files are present (instruction-only), so nothing is written to disk by an installer. This is the lowest-risk install model.

⚠ Credentials

The only declared required env var is NEMO_TOKEN (primary credential), which is reasonable for a third-party cloud API. But two issues raise concern: (1) SKILL.md can generate and use an anonymous token itself (so NEMO_TOKEN may not actually be required), and (2) the SKILL.md frontmatter claims access to ~/.config/nemovideo/, while the registry metadata showed no required config paths. Those mismatches make it unclear what local config or credentials the skill will read in practice.

✓ Persistence & Privilege

The skill is not marked always:true and does not request any special persistent privileges. It keeps session_id for ongoing operations (expected) but does not indicate modifying other skills or global agent settings.

Version History

v1.0.0

- Initial release of Free AI Text to Video Generator skill.
- Instantly converts written prompts into professional-quality videos, with no design or editing skills needed.
- Simple setup with automatic cloud connection and anonymous token creation (100 free credits, 7-day expiry).
- Supports a range of commands: generate, export, check credits/balance, upload files, and session management.
- Guides users through cloud rendering, streaming generation progress, and exporting videos in multiple formats.
- Designed for marketers, educators, creators, and small business owners needing quick video content.

Metadata

Slug free-ai-text-to-video-generator

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Free Ai Text To Video Generator?

Type a scene, a story, or a script — and watch it become a video in seconds. This free-ai-text-to-video-generator takes your written ideas and transforms the... It is an AI Agent Skill for Claude Code / OpenClaw, with 89 downloads so far.

How do I install Free Ai Text To Video Generator?

Run "/install free-ai-text-to-video-generator" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Free Ai Text To Video Generator free?

Yes, Free Ai Text To Video Generator is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Free Ai Text To Video Generator support?

Free Ai Text To Video Generator is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Free Ai Text To Video Generator?

It is built and maintained by dsewell-583h0 (@dsewell-583h0); the current version is v1.0.0.
