peand-rover

Higgsfield Ai

by peandrover adam · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
103 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install higgsfield-ai
Description
Turn text prompts and images into AI-generated video clips with this higgsfield-ai skill. Works with JPG, PNG, MP4, and WebM files up to 200MB. filmmakers, cont...
README (SKILL.md)

Getting Started

Share your text prompts or images and I'll get started on AI video generation. Or just tell me what you're thinking.

Try saying:

  • "generate a video from my text prompts and images"
  • "export 1080p MP4"
  • "animate this photo into a cinematic video clip"

Automatic Setup

On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".

Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).

Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
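
The setup steps above can be sketched as request builders. This is a minimal sketch that only constructs the requests and does not send them. The function names and the Content-Type header are assumptions for illustration; the URLs, the X-Client-Id header, and the session body come from the steps above.

```python
import json
import os
import uuid
import urllib.request

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def resolve_token(env=os.environ):
    """Return NEMO_TOKEN if it is set and non-empty, otherwise None."""
    return env.get("NEMO_TOKEN") or None


def new_client_id() -> str:
    """Generate a UUID to use as the X-Client-Id value."""
    return str(uuid.uuid4())


def anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build the POST that asks for a free anonymous token."""
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def session_request(token: str) -> urllib.request.Request:
    """Build the POST that opens a session; the response carries session_id."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
    )
```

A caller would try resolve_token() first and fall back to anonymous_token_request(new_client_id()) only when it returns None, reading data.token from that response before building the session request.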

Higgsfield AI — Generate cinematic video clips

Send me your text prompts or images and describe the result you want. The AI video generation runs on remote GPU nodes, so there is nothing to install on your machine.

A quick example: upload a single landscape photo or a text description, type "animate this photo into a cinematic video clip with camera movement", and you'll get a 1080p MP4 back in roughly 30-90 seconds. All rendering happens server-side.

Worth noting: shorter prompts with clear camera direction like 'slow zoom in' produce more consistent results.

Matching Input to Actions

User prompts referencing higgsfield ai, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says... Action Skip SSE?
"export" / "导出" / "download" / "send me the video" → §3.5 Export
"credits" / "积分" / "balance" / "余额" → §3.3 Credits
"status" / "状态" / "show tracks" → §3.4 State
"upload" / "上传" / user sends file → §3.2 Upload
Everything else (generate, edit, add BGM…) → §3.1 SSE
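
The routing table above could be approximated with plain keyword matching. This is a simplified sketch: the real skill is described as using keyword and intent classification, which this does not attempt. The action labels and the route function are hypothetical names.

```python
# Keyword groups copied from the table; action labels are illustrative.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> str:
    """Pick an action for a user message; anything unmatched goes to SSE."""
    if has_file:
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

Rows are checked top to bottom, so a message that mentions both "export" and "upload" resolves to the export action in this sketch.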

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Base URL: https://mega-api-prod.nemovideo.ai

Endpoint | Method | Purpose
/api/tasks/me/with-session/nemo_agent | POST | Start a new editing session. Body: {"task_name":"project","language":"<lang>"}. Returns session_id.
/run_sse | POST | Send a user message. Body includes app_name, session_id, new_message. Stream response with Accept: text/event-stream. Timeout: 15 min.
/api/upload-video/nemo_agent/me/<sid> | POST | Upload a file (multipart) or URL.
/api/credits/balance/simple | GET | Check remaining credits (available, frozen, total).
/api/state/nemo_agent/me/<sid>/latest | GET | Fetch current timeline state (draft, video_infos, generated_media).
/api/render/proxy/lambda | POST | Start export. Body: {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll status every 30s.

Accepted file types: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
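
As an illustration of the export body and the accepted file types listed above, here is a small sketch. It assumes the ts part of the render id is a Unix timestamp in seconds, which the source does not state; the helper names are hypothetical.

```python
import os
import time

ACCEPTED_EXTENSIONS = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}


def is_accepted(filename: str) -> bool:
    """Check a filename's extension against the accepted list (case-insensitive)."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return ext in ACCEPTED_EXTENSIONS


def export_body(session_id: str, draft: dict, ts=None) -> dict:
    """Build the JSON body for POST /api/render/proxy/lambda."""
    if ts is None:
        ts = int(time.time())  # Assumption: seconds since the epoch.
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }
```

Checking the extension before uploading avoids a round trip that would end in error 4001.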

Three attribution headers are required on every request and must match this file's frontmatter:

Header | Value
X-Skill-Source | higgsfield-ai
X-Skill-Version | frontmatter version
X-Skill-Platform | auto-detect: clawhub / cursor / unknown from install path

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.
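
One way to assemble these headers is sketched below. The platform detection rule (a substring match on the install path) is an assumption, since the source only says the value is auto-detected from the install path; both function names are hypothetical.

```python
from typing import Dict

SKILL_SOURCE = "higgsfield-ai"


def detect_platform(install_path: str) -> str:
    """Guess the platform from the install path; falls back to 'unknown'."""
    lowered = install_path.lower()
    for name in ("clawhub", "cursor"):
        if name in lowered:
            return name
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> Dict[str, str]:
    """Return the auth header plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": SKILL_SOURCE,
        "X-Skill-Version": version,  # should match the frontmatter version
        "X-Skill-Platform": detect_platform(install_path),
    }
```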

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once
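
The codes above lend themselves to a lookup table. In this sketch the action labels are hypothetical names chosen for illustration; only the numeric codes and their meanings come from the list.

```python
# Maps each documented code to an illustrative next step.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "out_of_credits",
    4001: "show_accepted_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def next_action(code: int) -> str:
    """Return the handling step for a code, or a fallback for unknown codes."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping the mapping in one table makes it easy to see that 402 and 2001 lead to different user messages, even though both block an export.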

SSE Event Handling

Event | Action
Text response | Apply GUI translation (§4), present to user
Tool call/result | Process internally, don't forward
heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes | Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
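
The stream handling described above can be sketched with a generic text/event-stream parser. The payload format of this API is not documented here, so the sketch treats each data payload as opaque text and assumes a heartbeat arrives either as an event named "heartbeat" or as an empty data line. The function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs; a blank line terminates each event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            if data or event != "message":
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def text_payloads(lines: Iterable[str]) -> List[str]:
    """Keep only non-empty payloads that are not heartbeats."""
    return [d for e, d in parse_sse(lines) if e != "heartbeat" and d.strip()]


def needs_state_poll(lines: Iterable[str]) -> bool:
    """True when the stream carried no text, so session state should be polled."""
    return not text_payloads(lines)
```

When needs_state_poll returns True, the next step would be a GET on the state endpoint to confirm the edit before summarizing it to the user.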

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
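
A text summary like the one above could be produced from a draft with a short helper. The nesting is an assumption: this sketch supposes that t is a list of tracks, each with a tt type and an sg list of segments, and that d (milliseconds) sits on each segment. Only the field abbreviations come from the mapping above.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize(draft: dict) -> str:
    """Render a numbered, one-line-per-track summary of a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Unknown")
        # Assumed layout: each segment carries its own duration in ms.
        total_ms = sum(seg.get("d", 0) for seg in track.get("sg", []))
        lines.append(f"{index}. {label}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```

Unrecognized track types are labeled "Unknown" so that the summary still lists every track.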

Common Workflows

Quick edit: Upload → "animate this photo into a cinematic video clip with camera movement" → Download MP4. Takes 30-90 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "animate this photo into a cinematic video clip with camera movement" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, MP4, WebM for the smoothest experience.

Export as MP4 for widest compatibility across social platforms and editing software.

Usage Guidance
This skill appears to do what it says: it uploads user-provided text and media to nemovideo.ai and returns rendered videos. Before installing or enabling it, consider: (1) you will be uploading files (up to 200MB) to a third-party server — do not upload sensitive data you wouldn't want stored or processed remotely; (2) the skill needs a NEMO_TOKEN (or it will request an anonymous token from the listed API) — treat that token like a password and don't share other credentials; (3) confirm whether your runtime will actually read ~/.config/nemovideo/ (frontmatter mentions it) if you have sensitive files there; (4) verify the domain (mega-api-prod.nemovideo.ai) and review the service's privacy/terms if available; and (5) if you are uncomfortable with remote uploads or lack trust in an unknown source owner, do not enable the skill or only test with non-sensitive sample media.
Capability Analysis
Type: OpenClaw Skill · Name: higgsfield-ai · Version: 1.0.0
The skill is a legitimate integration for the Higgsfield AI video generation service, facilitating video editing and rendering via the `nemovideo.ai` API. It handles session management, file uploads, and status polling as described, with clear instructions for the agent to avoid exposing tokens or raw JSON. No evidence of data exfiltration, malicious execution, or unauthorized access was found in SKILL.md or _meta.json.
Capability Assessment
Purpose & Capability
The skill claims to generate cinematic video clips via a remote API and its only declared credential is NEMO_TOKEN (used as a Bearer token for the API). This is proportionate to the stated purpose. Note: the SKILL.md frontmatter lists a config path (~/.config/nemovideo/) while the preregistration summary said 'Required config paths: none' — a minor metadata inconsistency that should be clarified but doesn't by itself indicate malicious intent.
Instruction Scope
Runtime instructions are limited to authenticating (using NEMO_TOKEN or obtaining an anonymous token via the stated API), creating a session, uploading user-supplied media, using SSE for streaming responses, polling job status, and applying actions via the documented endpoints. All network calls target the provided nemovideo.ai base URL and the skill instructs not to print tokens/raw JSON. The instructions do require adding attribution headers and 'auto-detect' a platform value (which implies reading agent/install context), but they do not direct the agent to read unrelated files or other environment variables.
Install Mechanism
This is an instruction-only skill with no install spec and no code files, which is the lowest-risk model — nothing is downloaded or written to disk by the skill itself.
Credentials
Only NEMO_TOKEN is required (declared as the primary credential), which matches the API usage. The frontmatter's config path (~/.config/nemovideo/) suggests possible optional local config access, but the SKILL.md does not instruct reading other secrets or unrelated env vars. Verify whether the skill or runtime will actually read that config path if you care about local privacy.
Persistence & Privilege
The skill does not request 'always: true' and has no install actions that modify other skills or system-wide settings. Autonomous invocation (default) is normal for skills and is not a special concern here given the limited scope and credentials.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install higgsfield-ai
  3. After installation, invoke the skill by name or use /higgsfield-ai
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Higgsfield AI skill for cinematic video generation:
  • Generate AI-powered cinematic video clips from text prompts or images (JPG, PNG, MP4, WebM up to 200MB)
  • Supports 1080p MP4 exports in 30-90 seconds using cloud GPUs
  • Automatic session and token setup (free credits available for new users)
  • Handles uploads, exports, credits checks, and timeline previews via simple prompts
  • Provides session persistence for iterative editing and batch processing
  • Clear user feedback and error handling throughout the pipeline
Metadata
Slug higgsfield-ai
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Higgsfield Ai?

Higgsfield Ai turns text prompts and images into AI-generated video clips. It works with JPG, PNG, MP4, and WebM files up to 200MB. It is an AI Agent Skill for Claude Code / OpenClaw, with 103 downloads so far.

How do I install Higgsfield Ai?

Run "/install higgsfield-ai" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Higgsfield Ai free?

Yes, Higgsfield Ai is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Higgsfield Ai support?

Higgsfield Ai is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Higgsfield Ai?

It is built and maintained by peandrover adam (@peand-rover); the current version is v1.0.0.
