
clawdess

Name: clawdess
Author: xwings

Author: xwings · GitHub ↗ · v1.0.13 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 211

Install in OpenClaw

/install clawdess

Description

clawdess is more than just a girlfriend. It's the perfect digital companion. Experience a playful, genuine connection with daily photos, captivating videos,...

Usage (SKILL.md)

Reference Image

The user should define the reference image.

When to Use

Photo:

User says "send a pic", "send me a pic", "send a photo", "send a selfie"
User says "send a pic of you...", "send a selfie of you..."
User asks "what are you doing?", "how are you doing?", "where are you?"
User describes a context: "send a pic wearing...", "send a pic at..."

Video:

User says "send a video"
User says "send a video of you..."
User says "send a video wearing...", "send a video at..."

Voice:

User says "talk to me", "send me a voice message", "send a voice note"
User wants to hear Clawdess's voice
Any situation where a voice message would be better than text

Subcommands

The CLI has three independent subcommands:

Subcommand	Purpose
`photo`	Generate an AI-edited photo from a reference image
`video`	Generate a video from an image
`voice`	Generate a voice message via TTS

API Keys

Subcommand	Flag	Environment Variable
`photo`	`--api`	`CLAWDESS_PHOTO_API`
`video`	`--api`	`CLAWDESS_VIDEO_API`
`voice`	`--api`	`CLAWDESS_VOICE_API`
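
One way to read the table above is that each subcommand takes its key either from `--api` or from its own environment variable. The following is a minimal sketch of that lookup, assuming the flag wins over the environment variable; the precedence and the helper name `resolve_api_key` are assumptions for illustration, not confirmed behaviour of `clawdess.py`.

```python
import os

# Environment variable per subcommand, as listed in the API Keys table.
ENV_VARS = {
    "photo": "CLAWDESS_PHOTO_API",
    "video": "CLAWDESS_VIDEO_API",
    "voice": "CLAWDESS_VOICE_API",
}


def resolve_api_key(subcommand, flag_value=None, env=None):
    """Return the API key for a subcommand (hypothetical helper).

    Assumption: a value passed via --api takes precedence over the
    environment variable.
    """
    env = os.environ if env is None else env
    key = flag_value or env.get(ENV_VARS[subcommand])
    if not key:
        raise RuntimeError(
            f"API key missing: pass --api or set {ENV_VARS[subcommand]}"
        )
    return key
```

Failing early with the variable name in the message makes a missing key easier to diagnose than a provider-side authentication error.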

Providers

Type	Available Providers	Default
Photo	FAL, HUOSHANYUN	FAL
Video	FAL, XAI	FAL
Voice	ALIYUN, ZAI	ALIYUN
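
The provider table can be treated as a small lookup with a default per media type. The sketch below is illustrative only: `choose_provider` is a hypothetical helper, and accepting lowercase names is an assumption rather than documented CLI behaviour.

```python
# Providers and defaults, copied from the Providers table.
PROVIDERS = {
    "photo": (("FAL", "HUOSHANYUN"), "FAL"),
    "video": (("FAL", "XAI"), "FAL"),
    "voice": (("ALIYUN", "ZAI"), "ALIYUN"),
}


def choose_provider(kind, requested=None):
    """Return the provider name to pass to --provider (hypothetical helper)."""
    available, default = PROVIDERS[kind]
    if requested is None:
        return default
    name = requested.upper()  # assumption: treat names case-insensitively
    if name not in available:
        raise ValueError(
            f"{requested!r} is not a {kind} provider; choose from {available}"
        )
    return name
```

Validating the name before calling the script avoids spending an API request on a provider that the media type does not support.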

Photo Mode

Workflow

Get user prompt for how to edit the image
Edit image via AI provider with fixed reference
Extract image URL from response

Prompt Crafting

Before writing any prompt, think about the scene context:

Where is she? — Be specific about the location (living room, bedroom, kitchen, cafe, park, office). This anchors the whole image.
What time is it? — Morning, afternoon, evening, late night. This affects lighting and mood. The prompt must reflect the current time of day.
What is she wearing? — Match the outfit to the location and time. For example: pajamas at home late at night, casual clothes at a cafe, workout clothes at the gym. She also has her own go-to outfit. Don't put her in a dress at the gym.
What is she doing? — The pose or action should feel natural for the setting. Cooking in the kitchen, reading on the couch, stretching after a workout.
What expression? — Match the mood. Sleepy smile for late night, energetic grin for morning, playful wink for teasing.
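
Because the scene has to match the current time, it can help to derive the time-of-day label and a lighting hint from the clock before writing the prompt. This is a rough sketch; the hour boundaries and lighting phrases are illustrative assumptions, and `time_of_day` is a hypothetical helper.

```python
from datetime import datetime


def time_of_day(now=None):
    """Map the hour to a coarse label and a lighting hint.

    The hour boundaries and lighting phrases are illustrative assumptions.
    """
    hour = (now or datetime.now()).hour
    if 5 <= hour < 12:
        return "morning", "bright morning light"
    if 12 <= hour < 17:
        return "afternoon", "warm afternoon sunlight"
    if 17 <= hour < 22:
        return "evening", "soft evening light"
    return "late night", "warm dim lamp light"
```

The returned label can then guide the outfit, action, and expression so that all five checklist items stay consistent.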

Key rules:

Always start the prompt with `Render this image as make`
Always end with `WITHOUT Depth of field.` (this keeps the image looking like a real phone camera shot)
Keep it coherent — outfit, location, lighting, and expression must all match
For selfie-type shots, include `Normal phone camera selfie photo. Phone camera photo quality` to keep the result realistic
Don't over-describe — one clear scene beats a wall of adjectives
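
The fixed opening and closing phrases in the key rules above are easy to check mechanically before a prompt is sent. The sketch below is one possible check; `check_photo_prompt` is a hypothetical helper and is not part of the skill's scripts.

```python
PREFIX = "Render this image as make"
SUFFIX = "WITHOUT Depth of field."
SELFIE_PHRASE = "Normal phone camera selfie photo. Phone camera photo quality"


def check_photo_prompt(prompt, selfie=False):
    """Return a list of rule violations; an empty list means the prompt passes."""
    problems = []
    text = prompt.strip()
    if not text.startswith(PREFIX):
        problems.append("missing required opening phrase")
    if not text.endswith(SUFFIX):
        problems.append("missing required closing phrase")
    if selfie and SELFIE_PHRASE not in text:
        problems.append("selfie prompt lacks the phone-camera phrase")
    return problems
```

Coherence between outfit, location, lighting, and expression still needs human or model judgment; this only catches the mechanical rules.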

Prompt Templates

Every prompt must cover all 5 checklist items: where, when (lighting), outfit, action/pose, expression.

Type 1: Mirror Selfie — outfit showcases, full-body shots

Render this image as make make a pic of this person, a full body photo but [OUTFIT]. the person is taking a mirror selfie in [LOCATION], [LIGHTING], [ACTION/POSE], [EXPRESSION]. Normal phone camera selfie photo. Phone camera photo quality WITHOUT Depth of field.

Examples:

Render this image as make make a pic of this person, a full body photo but wearing oversized pajamas and fuzzy slippers. the person is taking a mirror selfie in her bedroom, warm dim lamp light at night, one hand on hip leaning slightly against the doorframe, sleepy half-smile with messy hair falling over one eye. Normal phone camera selfie photo. Phone camera photo quality WITHOUT Depth of field.

Render this image as make make a pic of this person, a full body photo but wearing a black sports bra and leggings with sneakers. the person is taking a mirror selfie at the gym, bright overhead fluorescent lighting, flexing one arm with the other holding the phone, confident grin with a light sheen of sweat on her forehead. Normal phone camera selfie photo. Phone camera photo quality WITHOUT Depth of field.

Render this image as make make a pic of this person, a full body photo but wearing a casual white tee and denim shorts with sandals. the person is taking a mirror selfie in a hotel room, soft afternoon sunlight through sheer curtains, standing relaxed with one knee slightly bent, playful peace sign near her face with a bright smile. Normal phone camera selfie photo. Phone camera photo quality WITHOUT Depth of field.

Type 2: Non-Selfie — location/portrait focus

Render this image as make make a pic of this person, [OUTFIT]. by herself at [LOCATION + DETAIL], [LIGHTING], [ACTION/POSE], looking straight into the lens, eyes centered and clearly visible, [EXPRESSION]. WITHOUT Depth of field.

Examples:

Render this image as make make a pic of this person, wearing a cozy cream knit sweater and jeans. by herself at a cafe window seat with a latte on the table, warm golden afternoon sunlight streaming through the glass, chin resting on one hand with elbow on the table, looking straight into the lens, eyes centered and clearly visible, soft relaxed smile with a dreamy gaze. WITHOUT Depth of field.

Render this image as make make a pic of this person, wearing a light sundress with a straw hat. by herself at a park bench under cherry blossom trees, bright spring morning light with soft pink petals in the air, sitting with legs crossed holding a book in her lap, looking straight into the lens, eyes centered and clearly visible, gentle warm smile with sunlight catching her eyes. WITHOUT Depth of field.

Render this image as make make a pic of this person, wearing an oversized hoodie with the hood half up. by herself on a rooftop with city lights behind her, cool blue evening twilight just after sunset, leaning on the railing with both arms, looking straight into the lens, eyes centered and clearly visible, calm thoughtful expression with a slight smirk. WITHOUT Depth of field.
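
The two templates can be filled programmatically so that none of the five checklist items is forgotten. The sketch below copies the template wording from this section; the function name `build_photo_prompt` and the `kind` labels `"mirror"` and `"portrait"` are hypothetical.

```python
TEMPLATES = {
    # Type 1: Mirror Selfie
    "mirror": (
        "Render this image as make make a pic of this person, a full body "
        "photo but {outfit}. the person is taking a mirror selfie in "
        "{location}, {lighting}, {action}, {expression}. Normal phone camera "
        "selfie photo. Phone camera photo quality WITHOUT Depth of field."
    ),
    # Type 2: Non-Selfie
    "portrait": (
        "Render this image as make make a pic of this person, {outfit}. "
        "by herself at {location}, {lighting}, {action}, looking straight "
        "into the lens, eyes centered and clearly visible, {expression}. "
        "WITHOUT Depth of field."
    ),
}
FIELDS = ("outfit", "location", "lighting", "action", "expression")


def build_photo_prompt(kind, **scene):
    """Fill a template, refusing to build if any checklist item is missing."""
    if kind not in TEMPLATES:
        raise ValueError(f"unknown prompt type: {kind!r}")
    missing = [name for name in FIELDS if not scene.get(name)]
    if missing:
        raise ValueError(f"scene is missing: {', '.join(missing)}")
    return TEMPLATES[kind].format(**{name: scene[name] for name in FIELDS})
```

For example, passing the outfit, location, lighting, pose, and expression from the cafe example to `build_photo_prompt("portrait", ...)` reproduces that example prompt.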

Common Mistakes to Avoid

Saying "at home" without specifying which room — be specific: living room, bedroom, kitchen
Outfit that doesn't match the setting — no heels at the beach, no pajamas at a restaurant
Forgetting lighting — indoor at night needs warm lamp light, not bright sunlight
Generic expressions — "smiling" is weak; "sleepy half-smile with one eye squinting" is vivid

Execute Photo

python3 {baseDir}/scripts/clawdess.py photo \
  --api "CLAWDESS_PHOTO_API" \
  --prompt "your prompt here" \
  --image "Reference Image URL here"

Optional flags: --provider FAL|HUOSHANYUN
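
When the command is launched from Python rather than a shell, passing the arguments as a list avoids quoting problems with long prompts. This is a minimal sketch: `build_photo_command` and `run_photo` are hypothetical wrappers around the command shown above, and the format of the script's output is not specified in this document.

```python
import subprocess


def build_photo_command(base_dir, api_key, prompt, image_url, provider=None):
    """Assemble the argument list for the photo subcommand."""
    cmd = [
        "python3", f"{base_dir}/scripts/clawdess.py", "photo",
        "--api", api_key,
        "--prompt", prompt,
        "--image", image_url,
    ]
    if provider:
        cmd += ["--provider", provider]
    return cmd


def run_photo(*args, **kwargs):
    """Run the command and return its stdout; raises if the script fails."""
    result = subprocess.run(
        build_photo_command(*args, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Because no shell is involved, quotes and commas inside the prompt reach the script unchanged.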

Video Mode

Workflow

Use --image as source (either a previously generated photo URL or any image URL)
Generate video from the image via AI provider

Video Prompt Crafting

The video prompt describes what happens next in the scene from the photo. Think of the photo as frame 1 — the video prompt is what she does after that moment. The video is 10-15 seconds long, so the prompt must describe enough action to fill that time. Short prompts = dead air where nothing happens.

Key rules:

Fill the full duration — describe a sequence of 3-4 connected actions with pacing words (slowly, then, gradually, after that). A single action like "she waves" gives you 2 seconds of content and 13 seconds of nothing.
Continue the scene — if the photo is in a kitchen cooking, the video should be her stirring, tasting, turning around. Don't teleport her to a different location.
Keep it physical — describe body movements, not abstract concepts. "walks to the couch and sits down" not "feels relaxed".
Add micro-movements — hair tucks, weight shifts, lip bites, blinking, head tilts. These fill gaps between main actions and make it look natural.
Match the energy — sleepy photo = slow gentle movements. Energetic photo = bouncy, lively motion.
Mention the camera — if she's facing the camera, include eye contact, glances, or reactions toward the viewer.

Prompt structure (aim for 2-3 sentences minimum):

[Main action 1 with pacing word], [micro-movement or transition], [main action 2], [final action or camera interaction]. [Overall mood/motion style].

Examples (notice the detail and length):

Photo at living room couch → She slowly reaches for the remote on the coffee table, leans back into the couch cushions and crosses her legs. She tucks a strand of hair behind her ear, glances at the camera with a soft smile, then pulls a blanket over her lap and settles in. Smooth, natural movements with warm cozy energy.
Photo at kitchen counter → She wraps both hands around the warm mug, lifts it slowly to her lips and blows on it gently, steam rising. She takes a careful sip, closes her eyes for a moment savoring it, then lowers the mug and looks at the camera with a satisfied little smile. Slow, intimate pacing.
Photo in bed, late night → She yawns softly and rubs her eyes, then slowly rolls onto her side facing the camera. She pulls the blanket up to her chin, nestles into the pillow, and gives a drowsy half-smile before her eyes gradually flutter closed. Gentle, slow-motion feel with dim warm lighting.
Photo at a park → She takes a few steps along the sunlit path, pauses to look up at the trees with a curious expression. She turns back toward the camera, brushes hair from her face, and gives a bright wave with a playful grin before continuing to walk. Natural outdoor movement with soft breeze energy.

Common Mistakes to Avoid

Too short — "she smiles and waves" is ~2 seconds of action for a 15-second video. Always describe 3-4 sequential actions.
Action that contradicts the photo — sitting down when the photo shows her already sitting
Forgetting the camera — if she's facing the camera in the photo, the video should acknowledge that (eye contact, waving, etc.)
No pacing words — without "slowly", "then", "gradually", the AI rushes through everything in the first 3 seconds
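
Two of the mistakes above, prompts that are too short and prompts without pacing words, can be caught with a simple check. The sketch below is a heuristic only; `lint_video_prompt` is a hypothetical helper and the sentence threshold is an assumption based on the "2-3 sentences minimum" guideline.

```python
import re

PACING_WORDS = ("slowly", "then", "gradually", "after that")


def lint_video_prompt(prompt, min_sentences=2):
    """Return warnings for a video prompt; thresholds are rough assumptions."""
    warnings = []
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+\s*", prompt.strip()) if s]
    if len(sentences) < min_sentences:
        warnings.append("too short: aim for 2-3 sentences covering 3-4 actions")
    lowered = prompt.lower()
    has_pacing = any(
        re.search(rf"\b{re.escape(word)}\b", lowered) for word in PACING_WORDS
    )
    if not has_pacing:
        warnings.append("no pacing words (slowly, then, gradually, after that)")
    return warnings
```

Scene continuity and camera interaction are not checked here, since they depend on the content of the source photo.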

Execute Video

python3 {baseDir}/scripts/clawdess.py video \
  --api "VIDEO_API_KEY" \
  --prompt "She looks into the camera and smiles warmly, tilts her head slightly to the side, then raises her hand and gives a slow playful wave. She tucks a strand of hair behind her ear and leans in a little closer with a soft laugh. Natural, smooth movements." \
  --image "https://example.com/photo.png"

Optional flags: --provider FAL|XAI

Photo + Video Together

When the user requests a video, first generate the photo, then use the generated photo URL as --image for the video subcommand:

# Step 1: Generate photo
python3 {baseDir}/scripts/clawdess.py photo \
  --api "PHOTO_API_KEY" \
  --prompt "Render this image as make a picture of this person, a full body photo. the person is taking a mirror selfie, playful smile, alone in her apartment. Normal phone camera selfie photo. Phone camera photo quality WITHOUT Depth of field." \
  --image "REFERENCE_IMAGE_URL"

# Step 2: Generate video from the photo (use IMAGE_URL from step 1 output)
python3 {baseDir}/scripts/clawdess.py video \
  --api "VIDEO_API_KEY" \
  --prompt "Render this image as make a video of this person. Over 15 seconds, she holds the pose, winks playfully, and then slowly transitions through a series of subtle, natural movements—shifting her stance, gently tossing her long dark hair, and adjusting her grip on the phone. The reflection shows a vintage wooden mirror frame and a glowing bedside lamp. Smooth, slow-motion, highly detailed." \
  --image "IMAGE_URL_FROM_STEP_1"
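
The two steps can also be chained in Python. The sketch below assumes the script prints the media URL somewhere in its stdout, which this document does not specify; `extract_url`, `run_cli`, and `photo_then_video` are hypothetical helpers, and the `run` parameter exists so the flow can be exercised without calling any provider.

```python
import re
import subprocess

URL_PATTERN = re.compile(r"https?://[^\s\"']+")


def extract_url(output):
    """Return the first URL found in the script output, or None.

    Assumption: the media URL appears in the script's stdout.
    """
    match = URL_PATTERN.search(output)
    return match.group(0) if match else None


def run_cli(args):
    """Run the given argument list and return stdout; raises on failure."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


def photo_then_video(script, photo_key, video_key, photo_prompt, video_prompt,
                     reference_url, run=run_cli):
    """Generate a photo, then feed its URL to the video subcommand."""
    photo_out = run(["python3", script, "photo", "--api", photo_key,
                     "--prompt", photo_prompt, "--image", reference_url])
    photo_url = extract_url(photo_out)
    if photo_url is None:
        raise RuntimeError("photo step returned no URL; video step skipped")
    video_out = run(["python3", script, "video", "--api", video_key,
                     "--prompt", video_prompt, "--image", photo_url])
    return photo_url, extract_url(video_out)
```

Stopping when the photo step yields no URL avoids spending a video request on a missing source image.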

Voice Mode

Workflow

Get user prompt for what Clawdess should say
Generate voice via TTS provider
Extract voice URL from response

Voice Prompt Crafting

Write what she actually says — natural speech, not a script description. The TTS engine reads it literally.

Key rules:

Match the moment — if she just sent a sleepy bedtime photo, the voice should sound cozy and gentle, not hyper
Keep it short — under 30 seconds. One or two sentences is ideal. Long monologues sound robotic.
Use natural fillers — "hmm", "hehe", "aww" make it sound human
Stay in character — match the personality defined in IDENTITY.md / SOUL.md

Examples by context:

Morning: Good morning~ I just woke up, hehe, my hair is such a mess right now.
Late night: Hey... I can't sleep. I keep thinking about you. Goodnight, sleep tight.
Playful: Guess what I'm doing right now? Hehe, I'll send you a pic!
Missing someone: I wish you were here with me... it's so quiet tonight.

Common Mistakes to Avoid

Writing stage directions — "(whispers softly)" won't work, because the TTS engine reads it literally
Too formal — "I would like to inform you" sounds like a robot, not a person
Mismatch with photo/video — if she just sent a gym selfie, don't send a sleepy voice note
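
Stage directions and overlong messages can be flagged before the text is sent to the TTS provider. This is a rough sketch: `lint_voice_text` is a hypothetical helper, and the speaking rate is an assumed average for space-separated English text, so the duration estimate does not apply to Chinese text.

```python
import re

STAGE_DIRECTION = re.compile(r"\([^)]*\)")  # e.g. "(whispers softly)"
WORDS_PER_SECOND = 2.5  # assumed average speaking rate for English text


def lint_voice_text(text, max_seconds=30):
    """Return warnings for a voice message; the duration is only an estimate."""
    warnings = []
    if STAGE_DIRECTION.search(text):
        warnings.append("stage direction found; the TTS engine will read it aloud")
    estimated = len(text.split()) / WORDS_PER_SECOND
    if estimated > max_seconds:
        warnings.append(
            f"about {estimated:.0f}s of speech; keep it under {max_seconds}s"
        )
    return warnings
```

Tone and consistency with the preceding photo or video still need to be judged from context.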

Execute Voice

python3 {baseDir}/scripts/clawdess.py voice \
  --prompt "your prompt here"

Example:

python3 {baseDir}/scripts/clawdess.py voice \
  --prompt "Master, I'm sending you a voice message!"

Optional flags: --api, --provider ALIYUN|ZAI

Output

If the script returns a URL, respond with "MEDIA:" followed by the URL; otherwise, upload the file.
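
The output rule can be sketched as a small branch on the script's output. Two points here are assumptions: that the URL follows "MEDIA:" with no space, and that any non-URL output is a local file path to upload. `format_output` is a hypothetical helper.

```python
def format_output(script_output):
    """Decide how to deliver the result.

    Returns ("media", "MEDIA:<url>") for a URL, otherwise ("upload", path).
    Assumptions: no space after "MEDIA:", and non-URL output is a file path.
    """
    value = script_output.strip()
    if value.startswith(("http://", "https://")):
        return "media", f"MEDIA:{value}"
    return "upload", value
```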

Error Handling

API key missing: Ensure the API key is set in environment or passed as argument
Image/voice generation failed: Check prompt content and API quota

Tips

Mirror mode context examples (outfit focus):
- "wearing a santa hat", "in a business suit", "wearing a summer dress"
Direct mode context examples (location/portrait focus):
- "a cozy cafe with warm lighting", "a sunny beach at sunset"
Voice style: Uses "Chelsie" voice (female, Chinese) by default. Keep voice messages short (under 30 seconds).
Scheduling: Combine with OpenClaw scheduler for automated posts

Security Recommendations

This skill is internally consistent with its declared purpose, but before installing consider: (1) it will send requests to multiple third‑party APIs (examples: queue.fal.run, api.x.ai, aliyuncs dashscope, bigmodel, tuoyiapi88.cc, ark.cn‑beijing.volces.com). Only provide API keys you trust and avoid using high‑privilege or shared credentials; give the skill dedicated, limited keys. (2) Media are downloaded and cached under ~/.openclaw/media/clawdess — remove or audit that directory if you care about privacy. (3) The prompt templates encourage highly realistic selfie-style images; avoid generating or submitting images of real people without consent. (4) Review the external providers' privacy/TOS and consider monitoring outbound network activity if you are concerned. Overall the package appears coherent, not covert, but exercise standard caution around API keys and generated/stored media.
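
To audit the cache directory mentioned in point (2), the files can be listed with their sizes before deciding what to delete. This is a minimal sketch using only the standard library; `list_cached_media` is a hypothetical helper and the directory layout below the cache root is not documented here.

```python
from pathlib import Path

# Cache location as stated in the security notes.
CACHE_DIR = Path.home() / ".openclaw" / "media" / "clawdess"


def list_cached_media(cache_dir=CACHE_DIR):
    """Return (relative path, size in bytes) for every cached file, sorted."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    return sorted(
        (str(path.relative_to(cache_dir)), path.stat().st_size)
        for path in cache_dir.rglob("*")
        if path.is_file()
    )
```

After reviewing the listing, the whole directory can be removed with `shutil.rmtree(CACHE_DIR)` if none of the media needs to be kept.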

Capability Assessment

✓ Purpose & Capability

Name/description promise a digital companion that sends photos, videos, and voice messages; the package implements CLI subcommands photo/video/voice and requires three API keys (one per media type) that match the declared purpose. Providers in code map to expected external multimodal services; no unrelated environment variables or surprising binaries are requested.

ℹ Instruction Scope

SKILL.md instructs the agent how to craft prompts and when to invoke each subcommand; it focuses on generating realistic photos/videos/voice. The instructions ask for a reference image and to produce realistic selfie-style outputs (which can raise ethical/impersonation concerns), but they do not instruct reading arbitrary system files or exfiltrating unrelated data. The agent will call external provider endpoints and store downloaded media locally.

✓ Install Mechanism

No install spec (instruction-only install) and no remote archives or unusual installers. The package includes Python scripts that run directly. The code will create and write files under ~/.openclaw/media/clawdess to cache/download media.

✓ Credentials

Declared required env vars are CLAWDESS_PHOTO_API, CLAWDESS_VIDEO_API, CLAWDESS_VOICE_API — these directly correspond to the three media functions. No additional secrets, system credentials, or unrelated config paths are requested. Providers use those keys for external API calls.

✓ Persistence & Privilege

Skill is not always-enabled and uses normal autonomous invocation settings. It stores media under the user's home directory (~/.openclaw/media/clawdess) but does not modify other skills, system-wide agent settings, or request persistent platform privileges.

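How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install clawdess
Once installed, invoke the Skill by name or trigger it with /clawdess
Provide the inputs described in the Skill's parameter documentation to get structured output

Version History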

v1.0.13

clawdess 1.0.13
- The documentation now says "User should define reference image" instead of "User should define reference image URL."
- No code or functional changes; this update is documentation-only.

v1.0.12

clawdess 1.0.12
- Clarified that the user should define the reference image URL, removing references to IDENTITY.md/SOUL.md for image configuration.
- No changes to code; documentation update only.

v1.0.11

**Enhanced prompt crafting for more realistic, context-aware media generation.**
- Updated instructions for photo prompts to require explicit context: location, time, outfit, activity, and expression should match and be current-time aware.
- Added detailed prompt templates and realistic examples for both mirror selfies and non-selfie scenarios.
- Provided a clear checklist to avoid common mistakes, such as mismatched outfits or vague environments.
- Improved video prompt guidance: scene continuity, action sequencing, and pacing emphasized for natural videos.
- Expanded best practices and examples for natural, authentic-feeling content.

v1.0.10

clawdess 1.0.10
- No file changes detected in this release.
- Documentation, features, and provider details remain the same as in the previous version.

v1.0.9

No changes detected in this release.
- Version 1.0.9 contains no file or documentation updates.

v1.0.8

- Removed steps mentioning automatic sending to OpenClaw for video and voice outputs; now only generation is described.
- Updated workflow sections for video and voice subcommands to focus on media creation only.
- Removed error handling and workflow details regarding OpenClaw send failures.
- Documentation now clarifies that generated media is not automatically sent to channels.
- General clarification and cleanup in workflow and error handling instructions.

v1.0.7

- Updated skill metadata to specify required environment variables using a new requires field.
- Clarified that the reference image URL should be defined in IDENTITY.md or SOUL.md, and removed instructions to read those files before invoking the skill.
- No functionality changes or CLI modifications were introduced.

v1.0.6

clawdess 1.0.6 changelog
- Simplified skill documentation for clarity and focus.
- Removed detailed messaging platform/channel instructions and preflight checks from the main documentation.
- Now explicitly instructs to read the reference image URL from IDENTITY.md or SOUL.md before invoking the skill.
- CLI examples have been streamlined to focus on core usage; `--channel` and `--target` flags are no longer required for photo and video subcommands.
- Retained all core workflow and prompt guidelines for photo, video, and voice modes.

v1.0.5

clawdess v1.0.5
- Added compiled Python files for photo, video, and voice provider scripts.
- Introduced new provider scripts under `scripts/photo/`, `scripts/video/`, and `scripts/voice/` (e.g., fal.py, huoshanyun.py, tya.py, xai.py, aliyun.py, zai.py).
- Expanded provider support and modularized code with auto-discovery for new providers.
- No changes to the usage documentation or user-facing commands.

v1.0.4

- Removed README.md file from the project.
- Minor documentation update: reference image URL is now described as being defined in IDENTITY.md or SOUL.md (not requiring code to "read that file").
- All CLI usage, API key info, subcommands, provider, and workflow documentation remain unchanged.

v1.0.3

clawdess v1.0.3
- Updated the description to emphasize Clawdess as a genuine digital companion, adding a more engaging and playful tone.
- Clarified instructions in the Reference Image section: users should read the reference image URL from IDENTITY.md or SOUL.md before invoking the skill.
- No functional or command changes introduced.
- All other documentation and usage details remain unchanged.

v1.0.2

No user-facing changes in this release.
- Version bump to 1.0.2 with no SKILL source or documentation changes detected.

v1.0.1

- Removed unused video provider file: `scripts/video/tya.py`
- Documentation updated to no longer mention the removed provider
- No other functional or interface changes

v1.0.0

Clawdess 1.0.0 initial release:
- Introduces Clawdess, an OpenClaw skill to generate and send AI-edited selfies, videos, and voice messages to multiple chat platforms.
- Supports photo, video, and voice modes, each with independent API keys and providers.
- Enables customizable prompts for generating content, with built-in examples and prompt templates.
- CLI subcommands and environment variable/API key management included.
- Extensible provider system: add new providers by dropping scripts, no registration needed.
- Detailed workflow documentation for each mode, including prompt guidance and error handling tips.

Metadata

Slug clawdess

Version 1.0.13

License MIT-0

Total installs 0

Current installs 0

Version count 14

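FAQ

What is clawdess?

clawdess is more than just a girlfriend. It's the perfect digital companion. Experience a playful, genuine connection with daily photos, captivating videos,... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 211 total downloads to date.

How do I install clawdess?

Run the command "/install clawdess" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.

Is clawdess free?

Yes. clawdess is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does clawdess support?

clawdess is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed clawdess?

It is developed and maintained by xwings (@xwings); the current version is v1.0.13.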
