okaris

Ai Avatar Video

by Ömer Karışman · GitHub ↗ · v0.1.5
cross-platform ⚠ suspicious
Downloads: 1254
Stars: 0
Active Installs: 5
Versions: 2
Install in OpenClaw
/install ai-avatar-video
Description
Create AI avatar and talking head videos with OmniHuman, Fabric, PixVerse via inference.sh CLI. Models: OmniHuman 1.5, OmniHuman 1.0, Fabric 1.0, PixVerse Li...
README (SKILL.md)

AI Avatar & Talking Head Videos

Create AI avatars and talking head videos via inference.sh CLI.

Quick Start

curl -fsSL https://cli.inference.sh | sh && infsh login

# Create avatar video from image + audio
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'

Install note: The install script only detects your OS and architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. It requires no elevated permissions and starts no background processes. Manual install and verification are also available.
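
The note above mentions manual verification. A minimal sketch of checking a downloaded file against a checksum list follows. It assumes the list uses the common `sha256sum` layout (one `<hash>  <filename>` pair per line); that layout and the helper names `sha256_of` and `verify_checksum` are assumptions for illustration, not documented behavior of the inference.sh installer.

```shell
# Print the SHA-256 hex digest of a file, using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Hypothetical helper: succeed only if FILE's digest matches its entry in a
# checksum list. ASSUMPTION: the list has "<hash>  <filename>" lines.
verify_checksum() {
  file=$1
  checksums=$2
  name=$(basename "$file")
  # Look up the expected hash; binary-mode entries are written as "*name".
  expected=$(awk -v n="$name" '$2 == n || $2 == ("*" n) {print $1; exit}' "$checksums")
  if [ -z "$expected" ]; then
    echo "no checksum entry for $name" >&2
    return 2
  fi
  actual=$(sha256_of "$file")
  [ "$actual" = "$expected" ]
}

# Usage, after downloading the binary and the checksum list by hand:
#   verify_checksum ./downloaded-binary ./checksums.txt && echo "checksum OK"
```

Running the check before executing anything keeps a tampered download from ever being run, which piping the script straight into `sh` cannot guarantee.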

Available Models

| Model | App ID | Best For |
|---|---|---|
| OmniHuman 1.5 | bytedance/omnihuman-1-5 | Multi-character, best quality |
| OmniHuman 1.0 | bytedance/omnihuman-1-0 | Single character |
| Fabric 1.0 | falai/fabric-1-0 | Image talks with lipsync |
| PixVerse Lipsync | falai/pixverse-lipsync | Highly realistic |

Search Avatar Apps

infsh app list --search "omnihuman"
infsh app list --search "lipsync"
infsh app list --search "fabric"

Examples

OmniHuman 1.5 (Multi-Character)

infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'

Supports specifying which character to drive in multi-person images.

Fabric 1.0 (Image Talks)

infsh app run falai/fabric-1-0 --input '{
  "image_url": "https://face.jpg",
  "audio_url": "https://audio.mp3"
}'

PixVerse Lipsync

infsh app run falai/pixverse-lipsync --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'

Generates highly realistic lipsync from any audio.

Full Workflow: TTS + Avatar

# 1. Generate speech from text
infsh app run infsh/kokoro-tts --input '{
  "text": "Welcome to our product demo. Today I will show you..."
}' > speech.json

# 2. Create avatar video with the speech
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://presenter-photo.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'
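
Step 2 needs the audio URL from step 1 placed inside a JSON string. A minimal sketch of a helper that builds that payload is shown here, assuming the URL has already been read out of `speech.json` by hand (the output field name is not documented in this skill, so no extraction is shown). `json_escape` and `avatar_input` are hypothetical helper names, not part of the infsh CLI.

```shell
# Escape backslashes and double quotes so a value is safe inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print the JSON payload the examples pass to --input.
avatar_input() {
  # $1 = image URL, $2 = audio URL
  printf '{"image_url":"%s","audio_url":"%s"}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Usage (requires the infsh CLI and a prior `infsh login`):
#   infsh app run bytedance/omnihuman-1-5 \
#     --input "$(avatar_input "$IMAGE_URL" "$AUDIO_URL")"
```

Building the payload in a function avoids hand-editing quotes inside the single-quoted JSON each time the audio URL changes.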

Full Workflow: Dub Video in Another Language

# 1. Transcribe original video
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://video.mp4"}' > transcript.json

# 2. Translate text (manually or with an LLM)

# 3. Generate speech in new language
infsh app run infsh/kokoro-tts --input '{"text": "<translated-text>"}' > new_speech.json

# 4. Lipsync the original video with new audio
infsh app run infsh/latentsync-1-6 --input '{
  "video_url": "https://original-video.mp4",
  "audio_url": "<new-audio-url>"
}'
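
Each step in this workflow sends media to a remote service, so it can help to preview the exact command before running it. A minimal sketch of a dry-run wrapper follows; `run`, `lipsync`, and `DRY_RUN` are hypothetical names introduced here, not infsh features, and the URLs are placeholders.

```shell
# Hypothetical wrapper: with DRY_RUN=1, print the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'DRY RUN:'
    printf ' %s' "$@"
    printf '\n'
  else
    "$@"
  fi
}

# Step 4 of the workflow above, expressed through the wrapper.
# NOTE: the URLs are interpolated as-is, so they must not contain double quotes.
lipsync() {
  # $1 = original video URL, $2 = new audio URL
  run infsh app run infsh/latentsync-1-6 \
    --input "{\"video_url\": \"$1\", \"audio_url\": \"$2\"}"
}

# Usage:
#   DRY_RUN=1 lipsync "https://original-video.mp4" "$NEW_AUDIO_URL"   # preview
#   lipsync "https://original-video.mp4" "$NEW_AUDIO_URL"             # run it
```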

Use Cases

  • Marketing: Product demos with AI presenter
  • Education: Course videos, explainers
  • Localization: Dub content in multiple languages
  • Social Media: Consistent virtual influencer
  • Corporate: Training videos, announcements

Tips

  • Use high-quality portrait photos (front-facing, good lighting)
  • Audio should be clear with minimal background noise
  • OmniHuman 1.5 supports multiple people in one image
  • LatentSync is best for syncing existing videos to new audio

Related Skills

# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@inference-sh

# Text-to-speech (generate audio for avatars)
npx skills add inference-sh/skills@text-to-speech

# Speech-to-text (transcribe for dubbing)
npx skills add inference-sh/skills@speech-to-text

# Video generation
npx skills add inference-sh/skills@ai-video-generation

# Image generation (create avatar images)
npx skills add inference-sh/skills@ai-image-generation

Browse all video apps: infsh app list --category video

Usage Guidance
This skill appears to be what it says (it uses the inference.sh CLI to run avatar models), but exercise caution before installing:
  1. Avoid blindly running `curl | sh`. Instead, download the installer, verify the SHA-256 checksums from the linked checksums.txt, and inspect the installer script if possible.
  2. Understand that `infsh login` will create or use credentials or tokens (not listed in the skill). Review the CLI's auth/storage behavior and revoke tokens you don't trust.
  3. Media you provide (image, audio, and video URLs) will be sent to the provider's servers. Check their privacy/terms if data sensitivity matters.
  4. If you want lower risk, run these tools in an isolated sandbox or VM and review the CLI source and release artifacts on the provider's site before executing.
Capability Analysis
Type: OpenClaw Skill
Name: ai-avatar-video
Version: 0.1.5

The skill bundle is classified as suspicious due to the high-risk installation method specified in `SKILL.md`. The `curl -fsSL https://cli.inference.sh | sh` command directly downloads and executes a shell script from a remote server. While the `allowed-tools` specifies `Bash(infsh *)`, an AI agent could be prompted to execute this initial setup command, leading to a critical Remote Code Execution (RCE) vulnerability and supply chain risk if `cli.inference.sh` were compromised. There is no clear evidence of intentional malicious behavior by the skill author, but the method itself introduces a significant security flaw.
Capability Assessment
Purpose & Capability
The name/description (AI avatar & talking head videos) align with the runtime instructions: all examples invoke the inference.sh CLI (infsh) to run named avatar apps. Listed models and workflows are consistent with the stated capability.
Instruction Scope
The SKILL.md tells users to run `curl -fsSL https://cli.inference.sh | sh && infsh login` and then `infsh app run ...` with image/audio URLs. This stays within the avatar/video generation scope, but the instructions implicitly cause: (1) execution of a remote install script, (2) an interactive login flow that will obtain credentials/tokens (not declared), and (3) uploading or referencing user media (images/audio) to a third-party service. The file/network actions and credential acquisition are not surfaced in requires.env and should be made explicit.
Install Mechanism
There is no packaged install spec in the registry entry; the README advocates piping a remote shell script from https://cli.inference.sh (download-and-exec). While the doc claims the installer verifies SHA-256 checksums hosted at dist.inference.sh, the provided one-liner does not show any local checksum verification prior to execution. Download-and-exec from an external URL raises a higher risk profile unless the user manually verifies the binary and checksums beforehand.
Credentials
The skill declares no required environment variables or primary credential, which is plausible for an instruction-only wrapper. However, it instructs `infsh login` (implying account credentials or API tokens) and uses remote services to process user media; those credential/access implications are not declared. Users should assume credentials/tokens will be created/used and that media will be transmitted to inference.sh backend services.
Persistence & Privilege
The skill does not request always:true, does not include install scripts in the package, and does not claim to modify other skills or system-wide settings. Its persistence footprint depends on whether the user runs the installer; that action is initiated by the user, not forced by the skill.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install ai-avatar-video
  3. After installation, invoke the skill by name or use /ai-avatar-video
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.1.5
  • Added detailed documentation for creating AI avatar and talking head videos using inference.sh CLI.
  • Listed supported models (OmniHuman 1.5/1.0, Fabric 1.0, PixVerse Lipsync) and best use cases for each.
  • Provided step-by-step workflow examples for avatar generation and video dubbing.
  • Included installation notes, usage tips, and related skill recommendations.
  • Expanded trigger keywords and clarified typical use cases (marketing, education, localization, social media, corporate).
v0.1.0
ai-avatar-video 0.1.0
  • Initial release of the skill.
  • Create AI avatar and talking head videos using OmniHuman, Fabric, and PixVerse models via the inference.sh CLI.
  • Supports audio-driven avatars, lipsync videos, talking head generation, and virtual presenters.
  • Detailed usage instructions, model options, and example workflows included in documentation.
  • Lists related skills for TTS, transcription, and more content generation needs.
Metadata
Slug ai-avatar-video
Version 0.1.5
License
All-time Installs 5
Active Installs 5
Total Versions 2
Frequently Asked Questions

What is Ai Avatar Video?

Create AI avatar and talking head videos with OmniHuman, Fabric, PixVerse via inference.sh CLI. Models: OmniHuman 1.5, OmniHuman 1.0, Fabric 1.0, PixVerse Li... It is an AI Agent Skill for Claude Code / OpenClaw, with 1254 downloads so far.

How do I install Ai Avatar Video?

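Run "/install ai-avatar-video" in the OpenClaw or Claude Code chat to install the skill in one step. Running the examples also requires the inference.sh CLI (`infsh`) and an `infsh login`, as described in the Quick Start.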

Is Ai Avatar Video free?

Yes, Ai Avatar Video is completely free (open-source). You can download, install and use it at no cost.

Which platforms does Ai Avatar Video support?

Ai Avatar Video is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Ai Avatar Video?

It is built and maintained by Ömer Karışman (@okaris); the current version is v0.1.5.
