Description

Get photo slideshow video ready to post, without touching a single slider. Upload your Japanese photos (JPG, PNG, HEIC, WEBP, up to 200MB), say something lik...

Usage Instructions (SKILL.md)

Getting Started

Share your Japanese photos and I'll get started on AI video creation. Or just tell me what you're thinking.

Try saying:

"turn my Japanese photos"
"export 1080p MP4"
"turn my Japanese travel photos into"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check whether NEMO_TOKEN is set in the environment. If it is, skip to step 2.

1. Obtain a free token: Generate a random UUID as the client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the header X-Client-Id set to that UUID. The response field data.token is your NEMO_TOKEN; it carries 100 free credits and is valid for 7 days.
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
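
The connection steps above can be sketched as plain request builders. This is a minimal sketch, not the skill's actual implementation: the function names (`existing_token`, `build_token_request`, `build_session_request`) and the dictionary layout are hypothetical, and nothing is sent over the network. The URLs, header names, and body fields come from the steps above. The attribution headers this file requires on every request are left out here for brevity and would need to be merged in.

```python
import json
import os
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def existing_token():
    """Return NEMO_TOKEN from the environment, or None if it is unset or empty."""
    return os.environ.get("NEMO_TOKEN") or None


def build_token_request(client_id=None):
    """Describe the anonymous-token request (hypothetical dict layout)."""
    client_id = client_id or str(uuid.uuid4())  # random UUID as client identifier
    return {
        "method": "POST",
        "url": BASE_URL + "/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
        "body": None,
    }


def build_session_request(token, language):
    """Describe the create-session request for a given token and language."""
    return {
        "method": "POST",
        "url": BASE_URL + "/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }
```

Each returned dictionary can be handed to whichever HTTP client the agent runtime provides. Keeping the builders free of network calls also makes it easy to avoid logging token values by accident.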

Keep setup communication brief. Don't display raw API responses or token values to the user.

Japanese Photo Video — Turn Japan Photos Into Videos

Name: Japanese Photo Video
Author: mory128

Send me your Japanese photos and describe the result you want. The AI video creation runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload ten photos from a Japan trip, type "turn my Japanese travel photos into a slideshow video with music and transitions", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: organizing photos in order before uploading gives you a better narrative flow automatically.

Matching Input to Actions

User prompts referencing japanese photo video, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
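
The routing table above can be approximated with a keyword lookup. This sketch covers only the keyword half of the routing; the intent classification mentioned earlier is not modelled. The `route` function, the `ROUTES` list, and the action labels are hypothetical names chosen for illustration.

```python
# Keyword groups copied from the routing table; action labels are illustrative.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message.

    A message that arrives with a file attached is treated as an upload.
    Anything that matches no keyword falls through to the SSE chat endpoint.
    """
    if has_file:
        return "upload", True
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    return "sse", False
```

Substring matching is deliberately crude: a message such as "do not export yet" would still route to export, which is one reason a real router would pair keywords with intent classification.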

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
Upload: POST /api/upload-video/nemo_agent/me/<sid> with a multipart file or JSON with URLs.
Credits: GET /api/credits/balance/simple. Returns available, frozen, total.
State: GET /api/state/nemo_agent/me/<sid>/latest. Returns the current draft and media info.
Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
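
The export polling described in the last endpoint can be sketched as a small loop. The response field names (`status`, `url`) and the cap of 10 polls are assumptions; this file only says to poll every 30s for a completed status and a download URL. The `fetch_status` callable is a hypothetical stand-in for the GET request, injected so the loop can be exercised without network access.

```python
import time


def poll_export(fetch_status, interval_s=30, max_polls=10, sleep=time.sleep):
    """Poll until the render reports completion, then return the download URL.

    fetch_status: callable returning the parsed JSON of the render status call.
    The "status" and "url" keys are assumed field names, not confirmed ones.
    Returns None if the job has not completed after max_polls attempts.
    """
    for attempt in range(max_polls):
        result = fetch_status()
        if result.get("status") == "completed":
            return result.get("url")
        if attempt < max_polls - 1:
            sleep(interval_s)  # wait before the next poll, not after the last one
    return None
```

Passing `sleep` as a parameter keeps the function testable and lets the caller substitute a progress message between polls.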

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Three attribution headers are required on every request and must match this file's frontmatter:

Header	Value
`X-Skill-Source`	`japanese-photo-video`
`X-Skill-Version`	frontmatter `version`
`X-Skill-Platform`	auto-detect: `clawhub` / `cursor` / `unknown` from install path

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request; omitting them triggers a 402 on export.
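
A minimal sketch of assembling these headers follows. How the platform is derived from the install path is not specified in this file, so the substring check in `detect_platform` is an assumption, and both function names are hypothetical.

```python
def detect_platform(install_path):
    """Guess the platform from the install path (assumed rule: substring match)."""
    path = install_path.lower()
    for name in ("clawhub", "cursor"):
        if name in path:
            return name
    return "unknown"


def build_headers(token, version, install_path):
    """Return the authorization and attribution headers for one request.

    version should be the `version` value from the SKILL.md frontmatter.
    """
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": "japanese-photo-video",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to send a request that is missing one of them.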

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
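
To show how the short draft keys could be turned into a track summary like the one above, here is a sketch under a loudly stated assumption: the nesting used below (a list of tracks under `t`, each with `tt` and a list of segments under `sg`, each segment with `d` in milliseconds) is a guess, since this file gives only the key names. `summarize_draft` is a hypothetical helper.

```python
# Track type codes from the field mapping: 0=video, 1=audio, 7=text.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return a list of (track_kind, total_seconds) pairs.

    ASSUMED layout: draft["t"] -> tracks, track["tt"] -> type code,
    track["sg"] -> segments, segment["d"] -> duration in milliseconds.
    Durations are summed, which is only meaningful if segments do not overlap.
    """
    summary = []
    for track in draft.get("t", []):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        summary.append((kind, total_ms / 1000))
    return summary
```

Under that assumed layout, a draft with a 10 s video track, a 10 s audio track, and a 3 s text track yields three pairs that can be formatted into the numbered timeline shown above.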

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend says	You do
"click [button]" / "点击"	Execute via API
"open [panel]" / "打开"	Query session state
"drag/drop" / "拖拽"	Send edit via SSE
"preview in timeline"	Show track summary
"Export button" / "导出"	Execute export workflow

SSE Event Handling

Event	Action
Text response	Apply GUI translation (§4), present to user
Tool call/result	Process internally, don't forward
`heartbeat` / empty `data:`	Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes	Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
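
The event handling above can be sketched with a small parser for the standard `event:` / `data:` line format of server-sent events. The payload shape of this backend's events is not documented here, so the sketch stops at splitting the stream into events and flagging keepalives. `parse_sse` and `is_keepalive` are hypothetical helper names.

```python
def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines.

    A blank line ends an event. Events without an explicit name use "message".
    Comment lines (starting with ":") and unknown fields are ignored.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data or event != "message":
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Drop the single optional space that follows the colon.
            data.append(value[1:] if value.startswith(" ") else value)


def is_keepalive(event, data):
    """True for heartbeat events and for events whose data is empty."""
    return event == "heartbeat" or data.strip() == ""
```

A caller would keep waiting on keepalive events and, if the stream closes without any text event, fall back to polling session state as described above.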

Error Handling

Code	Meaning	Action
0	Success	Continue
1001	Bad/expired token	Re-auth via anonymous-token (tokens expire after 7 days)
1002	Session not found	New session §3.0
2001	No credits	Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account"
4001	Unsupported file	Show supported formats
4002	File too large	Suggest compress/trim
400	Missing X-Client-Id	Generate Client-Id and retry (see §1)
402	Free plan export blocked	Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429	Rate limit (1 token/client/7 days)	Retry in 30s once
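
The error table above maps naturally onto a lookup. This is a sketch only: the action labels and the fallback for codes not listed in the table are illustrative assumptions, and `action_for` is a hypothetical helper.

```python
# Codes from the error table; action labels are illustrative names.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_new_session",
    2001: "show_credit_options",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the action label for a response code.

    Codes missing from the table map to "report_unknown_error" (an assumption).
    """
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as separate entries reflects the distinction drawn in the table: 402 is a plan restriction, while 2001 means the credit balance is exhausted.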

Common Workflows

Quick edit: Upload → "turn my Japanese travel photos into a slideshow video with music and transitions" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn my Japanese travel photos into a slideshow video with music and transitions" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, HEIC, WEBP for the smoothest experience.

Export as MP4 for widest compatibility across social platforms and devices.

Safe Usage Recommendations

This skill generally behaves like a cloud photo-to-video client: it uploads your photos to a remote service (mega-api-prod.nemovideo.ai), uses a service token (NEMO_TOKEN), and returns a render URL. Before installing or using it, consider: (1) Privacy/trust — your photos are uploaded to a third party; confirm data retention and sharing policies. (2) Token handling — the skill can auto-generate a short-lived anonymous token; ask where that token and the session_id will be stored (in-memory only, or written under ~/.config/nemovideo/?). (3) The SKILL.md mentions reading/detecting an install path and a config directory (~/.config/nemovideo/) but the registry didn’t list that config path — ask the publisher to clarify whether the skill will read or write local config files. (4) Confirm expected headers and that no other environment variables or credentials are required. If you lean on this skill, review the remote service's privacy/terms and avoid uploading sensitive images until you’re comfortable with those policies.

Functional Analysis

Type: OpenClaw Skill
Name: japanese-photo-video
Version: 1.0.0

The skill functions as a legitimate interface for a third-party AI video generation service (nemovideo.ai). It handles automated session creation, file uploads, and status polling to convert images into videos as described. While it manages an authentication token (NEMO_TOKEN) and communicates with a remote API (mega-api-prod.nemovideo.ai), these actions are well-documented and strictly aligned with the stated purpose of the tool without evidence of malicious intent or unauthorized data exfiltration.

Capability Assessment

ℹ Purpose & Capability

Name/description (turn Japan photos into videos) matches the API endpoints and actions described (upload, render, export). Requesting a single service token (NEMO_TOKEN) is proportionate. However, SKILL.md frontmatter lists a config path (~/.config/nemovideo/) and an instruction to auto-detect an install path for X-Skill-Platform; the registry metadata shown earlier did not declare a required config path — this mismatch is an incoherence to verify.

ℹ Instruction Scope

Instructions stay within the stated purpose: authenticate (or get anonymous token), upload media, drive SSE-based editing, poll export status, and return a download URL. They require reading NEMO_TOKEN from environment (explicit) and performing network calls to mega-api-prod.nemovideo.ai. They also specify adding attribution headers and 'auto-detect' platform from install path, which implies reading agent/install path metadata (not fully explained). No instructions tell the agent to read unrelated files or other environment variables, but the implicit install-path detection and the frontmatter config path suggest additional local file access might occur unless clarified.

✓ Install Mechanism

There is no install spec and no code files — lowest-risk instruction-only skill. Nothing would be written to disk by an installer from this package itself.

⚠ Credentials

The skill declares a single required env var (NEMO_TOKEN), which is appropriate. But SKILL.md frontmatter also lists a config path (~/.config/nemovideo/), which was not reflected in the provided registry-level requirements — this discrepancy could mean the skill expects to read/write that local config (potentially storing tokens or session state). The skill also instructs automatic acquisition of an anonymous token via the service API if NEMO_TOKEN is not set; understand where the obtained token/session_id will be stored and for how long.

✓ Persistence & Privilege

Skill does not request 'always: true' and does not ask to modify other skills or system-wide settings. It is user-invocable and may run autonomously per platform defaults, which is normal. No elevated persistence is requested in the manifest.

Version History

v1.0.0

Initial release: quickly turn Japanese trip photos into shareable slideshow videos with music, no editing required.
- Upload multiple photos (JPG, PNG, HEIC, WEBP, up to 200MB), describe your desired video, and get a 1080p MP4 in 30-90 seconds.
- Simple setup: automatic backend connection and free token for new users.
- Supports export, credits inquiry, timeline preview, and iterative editing workflows.
- Full cloud-based render pipeline with fast GPU processing; no software installation needed.
- Clear error handling and format support for a smooth user experience.

Metadata

Slug japanese-photo-video

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of released versions 1

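FAQ

What is Japanese Photo Video?

Get photo slideshow video ready to post, without touching a single slider. Upload your Japanese photos (JPG, PNG, HEIC, WEBP, up to 200MB), say something lik... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 47 cumulative downloads to date.

How do I install Japanese Photo Video?

Run the command "/install japanese-photo-video" in the OpenClaw or Claude Code chat box for a one-click install; no additional configuration is needed.

Is Japanese Photo Video free?

Yes, Japanese Photo Video is completely free. It is released under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does Japanese Photo Video support?

Japanese Photo Video is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Japanese Photo Video?

It is developed and maintained by mory128 (@mory128); the current version is v1.0.0.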
