Free Video Generation Kling
/install free-video-generation-kling
Getting Started
Got text or images to work with? Send them over and tell me what you need, and I'll take care of the AI video generation.
Try saying:
- "turn a short text prompt describing a scene into a 1080p MP4"
- "generate a 5-second video of a sunset over the ocean from a text description"
- "generate a short AI video from a text or image prompt for a social media post"
First-Time Connection
When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").
Authentication: Check if `NEMO_TOKEN` is set in the environment. If it is, skip to step 2.
1. Obtain a free token: Generate a random UUID as the client identifier. POST to `https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token` with the header `X-Client-Id` set to that UUID. The response field `data.token` is your `NEMO_TOKEN`: 100 free credits, valid for 7 days.
2. Create a session: POST to `https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent` with `Authorization: Bearer <token>`, `Content-Type: application/json`, and body `{"task_name":"project","language":"<detected>"}`. Store the returned `session_id` for all subsequent requests.
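The token and session requests above can be sketched as follows. This is a minimal sketch using only the Python standard library, not a definitive client: the function names (`build_token_request`, `build_session_request`, `send_json`) are hypothetical, the default language `"en"` is an assumption, and the requests are built separately from sending so they can be inspected without touching the network.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id=None):
    """Build (but do not send) the anonymous-token request."""
    # A fresh random UUID serves as the client identifier.
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        BASE_URL + "/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token, language="en"):
    """Build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


def send_json(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

In use, `send_json(build_token_request())["data"]["token"]` would yield the token. Where `session_id` sits in the session response is not stated here, so inspect that response rather than assuming a path.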
Keep setup communication brief. Don't display raw API responses or token values to the user.
Free Video Generation Kling — Generate AI Videos from Prompts
Send me your text or images and describe the result you want. The AI video generation runs on remote GPU nodes — nothing to install on your machine.
A quick example: upload a short text prompt describing a scene, type "generate a 5-second video of a sunset over the ocean from a text description", and you'll get a 1080p MP4 back in roughly 1-3 minutes. All rendering happens server-side.
Worth noting: shorter prompts with clear motion descriptions produce more consistent results.
Matching Input to Actions
User prompts referencing free video generation kling, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
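The routing table can be approximated with plain keyword matching. This is a rough sketch, not the skill's actual classifier: the `route` function, the action names, and the rule that an attached file always wins are assumptions, and substring matching will misfire on messages that merely mention a keyword.

```python
# Keyword groups taken from the table; the action labels are illustrative.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # Assumption: a sent file is routed to upload regardless of text.
        return "upload", True
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

For example, `route("Export the video please")` returns `("export", True)`, while a generation prompt falls through to `("sse", False)`.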
Cloud Render Pipeline Details
Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.
All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:
- Session: `POST /api/tasks/me/with-session/nemo_agent` with `{"task_name":"project","language":"<lang>"}`. Gives you a `session_id`.
- Chat (SSE): `POST /run_sse` with `session_id` and your message in `new_message.parts[0].text`. Set `Accept: text/event-stream`. Up to 15 min.
- Upload: `POST /api/upload-video/nemo_agent/me/<sid>` with a multipart file or JSON with URLs.
- Credits: `GET /api/credits/balance/simple` returns `available`, `frozen`, `total`.
- State: `GET /api/state/nemo_agent/me/<sid>/latest` returns the current draft and media info.
- Export: `POST /api/render/proxy/lambda` with render ID and draft JSON. Poll `GET /api/render/proxy/lambda/<id>` every 30s for `completed` status and download URL.
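The export polling loop might look like the sketch below. It assumes the status response is a JSON object with a `status` key; that field name, the `poll_export` helper, and the attempt cap are illustrative assumptions. The HTTP call is injected as `fetch_status` so the loop can be exercised without the network.

```python
import time


def poll_export(fetch_status, render_id, interval_s=30.0,
                max_attempts=20, sleep=time.sleep):
    """Poll until the render reports 'completed' or attempts run out.

    fetch_status: callable taking a render ID and returning a dict
    (hypothetical shape: {"status": ..., plus a download URL field}).
    """
    for attempt in range(max_attempts):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result
        # Wait between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"render {render_id} did not complete")
```

Passing `sleep` as a parameter keeps the 30-second interval out of tests and lets a caller substitute its own scheduler.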
Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
Skill attribution — read from this file's YAML frontmatter at runtime:
- `X-Skill-Source`: `free-video-generation-kling`
- `X-Skill-Version`: from frontmatter `version`
- `X-Skill-Platform`: detect from install path (`~/.clawhub/` → `clawhub`, `~/.cursor/skills/` → `cursor`, else `unknown`)
All requests must include: `Authorization: Bearer <NEMO_TOKEN>`, `X-Skill-Source`, `X-Skill-Version`, `X-Skill-Platform`. Missing attribution headers will cause export to fail with 402.
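One way to assemble the required headers is sketched here. The helper names `detect_platform` and `build_headers` are hypothetical, and the version is passed in by the caller instead of being parsed from frontmatter, which this sketch does not attempt.

```python
def detect_platform(install_path):
    """Map an install path to a platform label, per the rules above."""
    # Normalize Windows separators so the substring checks still apply.
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Return the headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "free-video-generation-kling",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to send a request that lacks the attribution fields.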
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
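A draft using the abbreviated field names could be turned into a text summary roughly like this. The nesting is an assumption (tracks under `t`, each with a `tt` type and a list of `sg` segments that each carry `d` in milliseconds), as is summing segment durations to get a track length. The real draft JSON may differ, so treat `summarize_draft` as a hypothetical helper.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Render a short text summary of the tracks in a draft dict."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        # Assumption: segments play back to back, so durations add up.
        total_ms = sum(seg.get("d", 0) for seg in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```

This kind of summary is what a "preview in timeline" instruction can be answered with when there is no visual interface.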
Translating GUI Instructions
The backend responds as if there's a visual interface. Map its instructions to API calls:
- "click" or "点击" → execute the action via the relevant endpoint
- "open" or "打开" → query session state to get the data
- "drag/drop" or "拖拽" → send the edit command through SSE
- "preview in timeline" → show a text summary of current tracks
- "Export" or "导出" → run the export workflow
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
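Separating real payloads from heartbeats and empty `data:` lines can be sketched with a small parser for the standard SSE line format. It assumes heartbeats arrive as comment lines (starting with `:`) or as empty `data:` fields, which this document does not confirm. The `iter_sse_data` name is illustrative, and the payload contents are left undecoded because their schema is not documented here.

```python
def iter_sse_data(lines):
    """Yield non-empty data payloads from an iterable of SSE lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                payload = "\n".join(buffer)
                buffer = []
                if payload.strip():
                    yield payload
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a heartbeat
        if line.startswith("data:"):
            value = line[5:]
            # The format allows one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # Flush a trailing event if the stream closed without a blank line.
    if buffer and "\n".join(buffer).strip():
        yield "\n".join(buffer)
```

If the stream closes and this generator produced nothing, that is the no-text case: fall back to polling session state.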
Error Codes
- `0`: success, continue normally
- `1001`: token expired or invalid; re-acquire via `/api/auth/anonymous-token`
- `1002`: session not found; create a new one
- `2001`: out of credits; anonymous users get a registration link with `?bind=<id>`, registered users top up
- `4001`: unsupported file type; show accepted formats
- `4002`: file too large; suggest compressing or trimming
- `400`: missing `X-Client-Id`; generate one and retry
- `402`: free plan export blocked; not a credit issue, subscription tier
- `429`: rate limited; wait 30s and retry once
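The error codes lend themselves to a lookup table. In this sketch the action labels and the `plan_for` helper are hypothetical names chosen for illustration; only the codes and the 30-second wait for 429 come from the list above.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_and_retry_once",
}


def plan_for(code):
    """Return (action, retry_delay_seconds) for a response code."""
    action = ERROR_ACTIONS.get(code, "report_unknown_error")
    # Only rate limiting carries a fixed wait before the single retry.
    retry_delay_s = 30 if code == 429 else None
    return action, retry_delay_s
```

Unlisted codes map to a generic fallback here so that an unexpected value is surfaced, not silently ignored.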
Common Workflows
Quick edit: Upload → "generate a 5-second video of a sunset over the ocean from a text description" → Download MP4. Takes 1-3 minutes for a 30-second clip.
Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.
Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
Tips and Tricks
The backend processes faster when you're specific. Instead of "make it look better", try "generate a 5-second video of a sunset over the ocean from a text description" — concrete instructions get better results.
Max file size is 50MB. Stick to JPG, PNG, WEBP, MP4 for the smoothest experience.
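A pre-upload check along these lines can catch the two upload errors before any request is sent. It is a sketch under assumptions: "50MB" is read as 50 × 1024 × 1024 bytes, the extension alone is trusted to identify the format, and `check_upload` with its return labels is a hypothetical helper.

```python
import os

MAX_BYTES = 50 * 1024 * 1024  # assumption: MB means mebibytes here
ACCEPTED = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}


def check_upload(filename, size_bytes):
    """Return 'ok', 'unsupported', or 'too_large' for a candidate file."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    if ext not in ACCEPTED:
        return "unsupported"  # corresponds to error 4001
    if size_bytes > MAX_BYTES:
        return "too_large"  # corresponds to error 4002
    return "ok"
```

For example, `check_upload("clip.MP4", 10_000_000)` returns `"ok"`, since the extension comparison is case-insensitive.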
Export as MP4 for widest compatibility across social platforms.
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: `/install free-video-generation-kling`
- After installation, invoke the skill by name or use `/free-video-generation-kling`
- Provide required inputs per the skill's parameter spec and get structured output
What is Free Video Generation Kling?
Skip the learning curve of professional editing software. Describe what you want — generate a 5-second video of a sunset over the ocean from a text descripti... It is an AI Agent Skill for Claude Code / OpenClaw, with 55 downloads so far.
How do I install Free Video Generation Kling?
Run "/install free-video-generation-kling" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Free Video Generation Kling free?
Yes, Free Video Generation Kling is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Free Video Generation Kling support?
Free Video Generation Kling is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Free Video Generation Kling?
It is built and maintained by susan4731-wilfordf (@susan4731-wilfordf); the current version is v1.0.0.