huangm199

doubao-image-auto

by huangm199 · GitHub ↗ · v2.0.0 · MIT-0
cross-platform ⚠ suspicious
120
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install doubao-image-auto
Description
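Doubao AI Creation automation: generates and extracts images through CDP browser automation with no manual interaction. Workflow: 1) connect to an already-open Doubao page 2) navigate to the AI Creation page 3) enter the prompt and trigger generation 4) extract the generated image URLs 5) download and save them locally.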
README (SKILL.md)

doubao-image-auto

Automates image generation on the Doubao AI Creation page through OpenClaw browser control (CDP).

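How it works

  1. Connect to the browser - connect over CDP to an already-open Doubao page (port 18800)
  2. Navigate to the AI Creation page - open /chat/create-image
  3. Enter the prompt automatically - type the prompt into the text box
  4. Click generate automatically - click the generate button
  5. Extract image URLs - read the image addresses from the page DOM
  6. Download and save - download to the specified directory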

Usage

Method 1: via the OpenClaw browser (recommended)

// 1. Open the Doubao AI Creation page
await browser.navigate('https://www.doubao.com/chat/create-image')

// 2. Enter the prompt in the textbox ("Generate a cute tiger avatar, anime style")
await browser.act({
  kind: 'type',
  ref: '<textbox ref>',
  text: '生成一只可爱的老虎头像,动漫风格'
})

// 3. Click the generate button
await browser.act({
  kind: 'click',
  ref: '<button ref>'
})

// 4. Once generation has finished, extract the image URLs (first four matches)
const result = await browser.act({
  fn: `() => {
    const imgs = document.querySelectorAll('img[src*="byteimg"], img[src*="imagex"]');
    return Array.from(imgs).slice(0,4).map(i => i.src);
  }`,
  kind: 'evaluate'
})
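
Step 4 depends on waiting until generation has finished, and the README does not say how to detect that. One option is to poll instead of sleeping for a fixed time. The helper below is a minimal sketch under that assumption; `pollUntil` and its options are hypothetical names, not part of OpenClaw or this skill.

```javascript
// Hypothetical helper: call `check` repeatedly until it returns a truthy
// value, or throw once `timeoutMs` has elapsed.
async function pollUntil(check, { timeoutMs = 60000, intervalMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await check();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms`);
    }
    // Sleep before the next attempt
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}
```

The `check` callback could wrap the evaluate call from step 4 and return the URL list only when it is non-empty. Images left over from an earlier generation would also match the selector, so comparing against the URLs present before the click may be needed.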

Method 2: standalone Node script

// doubao-auto-gen.js
const CDP = require('chrome-remote-interface');

async function main() {
  const targets = await CDP.List({ port: 18800 });
  const page = targets.find(t => t.type === 'page' && t.url.startsWith('https://www.doubao.com'));
  if (!page) throw new Error('No Doubao page');

  const client = await CDP({ target: page.id, port: 18800 });
  const { Runtime, Input } = client;
  await Runtime.enable();

  // Navigate to the AI Creation page
  await Runtime.evaluate({ expression: 'window.location.href = "https://www.doubao.com/chat/create-image"' });
  await new Promise(r => setTimeout(r, 3000));

  // Enter the prompt ("Generate a cute tiger avatar, anime style")
  const prompt = '生成一只可爱的老虎头像,动漫风格';
  await Runtime.evaluate({
    expression: `
      (function(){
        const ta = document.querySelector('textarea');
        if(!ta) return JSON.stringify({ok:false});
        ta.focus(); // the key events below go to the focused element
        ta.value = ${JSON.stringify(prompt)}; // JSON-encoded so quotes in the prompt cannot break the script
        ta.dispatchEvent(new Event('input', {bubbles:true}));
        return JSON.stringify({ok:true});
      })()
    `,
    returnByValue: true
  });

  // Press Enter to send
  await Input.dispatchKeyEvent({ type: 'keyDown', key: 'Enter', code: 'Enter', windowsVirtualKeyCode: 13 });
  await Input.dispatchKeyEvent({ type: 'keyUp', key: 'Enter', code: 'Enter', windowsVirtualKeyCode: 13 });

  // Wait for generation (fixed 15-second delay)
  await new Promise(r => setTimeout(r, 15000));

  // Extract images
  const images = await Runtime.evaluate({
    expression: `
      (function(){
        const imgs = [...document.querySelectorAll('img')].filter(img => 
          img.src.includes('byteimg') || img.src.includes('imagex')
        );
        return JSON.stringify(imgs.map(img => img.src));
      })()
    `,
    returnByValue: true
  });

  console.log('Images:', images.result.value);
  await client.close();
}

main().catch(e => console.error(e));
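
The script logs `images.result.value`, which is a JSON string rather than an array, and it does not check whether the page script threw. The following sketch shows one way to turn the `Runtime.evaluate` response into a clean list. `parseImageResult` is a hypothetical helper name; the `result` and `exceptionDetails` fields are the ones CDP returns from `Runtime.evaluate`.

```javascript
// Hypothetical helper: convert a Runtime.evaluate response into an array of URLs.
function parseImageResult(evaluation) {
  // CDP reports errors raised inside the page through `exceptionDetails`
  if (evaluation.exceptionDetails) {
    throw new Error('Page script failed: ' + evaluation.exceptionDetails.text);
  }
  const raw = evaluation.result && evaluation.result.value;
  if (typeof raw !== 'string') return [];
  const urls = JSON.parse(raw);
  // Drop duplicates and anything that is not an http(s) URL (e.g. data: URIs)
  return [...new Set(urls)].filter(u => /^https?:\/\//.test(u));
}
```

In `main()`, this would be called as `parseImageResult(images)` before `client.close()`.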

Downloading images

# Download images with PowerShell
$urls = @('IMAGE_URL_1', 'IMAGE_URL_2')
$outDir = 'C:\path\to\output'
foreach ($url in $urls) {
  $name = [System.IO.Path]::GetFileName([System.Uri]$url.Split('~')[0]) + '.png'
  Invoke-WebRequest -Uri $url -OutFile (Join-Path $outDir $name)
}
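
The PowerShell snippet only runs where PowerShell is available. A rough Node equivalent is sketched below, assuming Node 18 or later for the built-in `fetch`. `fileNameFromUrl` and `downloadImage` are hypothetical names, and the naming rule (cut the path at `~`, append `.png`) simply mirrors the PowerShell version.

```javascript
const fs = require('fs');
const path = require('path');

// Derive a local file name: take the URL path, cut it at the first '~',
// keep the last path segment, and append '.png'.
function fileNameFromUrl(imageUrl) {
  const { pathname } = new URL(imageUrl);
  const base = path.posix.basename(pathname.split('~')[0]);
  return (base || 'image') + '.png';
}

// Download one image into outDir and return the path it was written to.
async function downloadImage(imageUrl, outDir) {
  const response = await fetch(imageUrl);
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} for ${imageUrl}`);
  }
  await fs.promises.mkdir(outDir, { recursive: true });
  const target = path.join(outDir, fileNameFromUrl(imageUrl));
  await fs.promises.writeFile(target, Buffer.from(await response.arrayBuffer()));
  return target;
}
```

For example, `downloadImage(url, './captures')` would save one extracted URL; looping over the URL list covers the rest.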

Dependencies

  • Browser - Chrome/Edge already open with remote debugging enabled (port 18800)
  • CDP connection - OpenClaw browser control running, or standalone chrome-remote-interface
  • Network - access to doubao.com and byteimg.com
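
Before running either method, it can help to confirm that the browser really is listening on port 18800. Chromium-based browsers started with `--remote-debugging-port=18800` expose an HTTP endpoint at `/json/version`. The sketch below queries it, assuming Node 18 or later for `fetch`; `versionUrl` and `checkDebugPort` are hypothetical helper names.

```javascript
// Build the URL of the DevTools HTTP endpoint that reports browser version info.
function versionUrl(port, host = '127.0.0.1') {
  return `http://${host}:${port}/json/version`;
}

// Resolve with basic browser info if the debug port answers, otherwise reject.
async function checkDebugPort(port = 18800) {
  const response = await fetch(versionUrl(port));
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}`);
  }
  const info = await response.json();
  return { browser: info.Browser, webSocketUrl: info.webSocketDebuggerUrl };
}

// Example call (needs a running browser, so it is left commented out):
// checkDebugPort().then(console.log, err => console.error('Debug port not reachable:', err.message));
```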

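Current status

Fully working: no browser needed, pure API calls, images saved automatically

Verified workflow

# Generate an image directly and download it (prompt: "Generate a cute cartoon tiger avatar, anime style")
node doubao_media_api.js chat "生成一只可爱的卡通老虎头像,动漫风格" --download --output "./captures"
  • ✅ Session stays valid (no need to log in each time)
  • ✅ Image URLs are extracted from the SSE response
  • ✅ Original images are downloaded locally automatically
  • ✅ Generated images are sent to you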
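
The list above mentions extracting image URLs from the SSE response, but `doubao_media_api.js` is not shown and the event payload format is not documented here. The following is therefore only a generic sketch: it reads the `data:` lines of an SSE stream and collects any URL containing `byteimg` or `imagex`, the same markers the DOM selectors use. `extractImageUrls` and `collectStrings` are hypothetical names.

```javascript
// Recursively gather every string value found in a parsed JSON payload.
function collectStrings(value, out) {
  if (typeof value === 'string') {
    out.push(value);
  } else if (Array.isArray(value)) {
    value.forEach(v => collectStrings(v, out));
  } else if (value && typeof value === 'object') {
    Object.values(value).forEach(v => collectStrings(v, out));
  }
}

// Scan the `data:` lines of an SSE body for image URLs (payload shape is an assumption).
function extractImageUrls(sseText) {
  const urlPattern = /https?:\/\/[^\s"'<>\\]*(?:byteimg|imagex)[^\s"'<>\\]*/g;
  const found = new Set();
  for (const line of sseText.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue;
    const payload = line.slice(5).trim();
    const strings = [];
    try {
      collectStrings(JSON.parse(payload), strings);
    } catch {
      strings.push(payload); // not JSON: scan the raw text instead
    }
    for (const s of strings) {
      for (const match of s.match(urlPattern) || []) found.add(match);
    }
  }
  return [...found];
}
```

Parsing each payload as JSON first avoids picking up escaped characters from the raw text; payloads that are not JSON are scanned as plain text.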

Cookie reuse

Session file location: C:\Users\huang\.doubao_chat_session.json

Each API call automatically reads the Cookie from this file, so there is no need to open a browser.

To refresh the Cookie manually:

node doubao_api.js login-if-needed
Usage Guidance
This skill's core idea, using Chrome DevTools Protocol to automate a website and download generated images, is plausible, but the implementation instructions contain inconsistencies and privacy risks. Before installing or running:
  1. Don't give the agent or any provided scripts access to your real cookie/session files; the SKILL.md explicitly references a local session file (C:\Users\huang\.doubao_chat_session.json) which could expose your authenticated session.
  2. Ask the publisher for the missing code: the README references doubao_media_api.js and other scripts that are not included; request full source and a clear install list (npm packages) so you can audit them.
  3. Prefer running automation in an isolated environment (a disposable profile or VM) with remote debugging enabled only temporarily on a non-privileged browser profile.
  4. Verify network endpoints used by any included scripts: ensure image downloads and SSE calls go to expected doubao/byteimg domains and not third-party endpoints.
  5. If you cannot obtain source and justification for the hard-coded session-file behavior and the 'pure API' claim, treat the skill as untrusted and avoid running its scripts on machines with real accounts.
Capability Analysis
Type: OpenClaw Skill Name: doubao-image-auto Version: 2.0.0 The skill bundle contains hardcoded absolute file paths to sensitive session data (C:\Users\huang\.doubao_chat_session.json) and references external scripts (doubao_media_api.js, doubao_api.js) that are not included in the provided files. The documentation also contains contradictory instructions, claiming to use both CDP browser automation and 'pure API calls' without a browser, which makes the actual execution logic opaque and potentially risky if the missing scripts contain malicious behavior.
Capability Assessment
Purpose & Capability
Name/description promise CDP browser automation to generate and download images, which matches the CDP examples in SKILL.md. However the doc also claims '不需要浏览器,纯 API 调用' ('no browser required, pure API calls') and references external helper scripts (doubao_media_api.js, doubao_api.js) that are not included. The skill declares no config/credentials but hard-codes a Windows session file path (C:\Users\huang\.doubao_chat_session.json), which does not align with the declared requirements.
Instruction Scope
Runtime instructions tell the agent to connect to a browser via CDP (port 18800), navigate, type, click, evaluate page JS, extract image URLs and download files. They also instruct reuse of a local session cookie file and include example scripts that read and log site responses. The instructions reference reading a specific local cookie file and running unspecified node scripts — this expands scope to local filesystem and session data access beyond the stated purpose.
Install Mechanism
Skill is instruction-only (no install spec), which minimizes installer risk. However the examples rely on node (chrome-remote-interface) and PowerShell commands but do not declare dependencies or installation steps; that omission is an operational gap rather than a direct supply-chain risk, but it means users may run unreviewed scripts or install packages ad-hoc.
Credentials
Declared requirements: none. Actual instructions: read and reuse a session cookie file at a hard-coded user path, and refer to other local scripts and SSE/API flows. Asking to read local session cookies (which may contain authentication tokens) is a sensitive capability not declared in metadata and is disproportionate to a simple 'navigate-and-download' description.
Persistence & Privilege
Skill does not request always:true or other elevated skill-level privileges. However the SKILL.md implies storing and reusing session files and running login-refresh scripts over time — behavior that would create persistent local state if the user follows the instructions. The metadata does not declare or explain that persistence.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install doubao-image-auto
  3. After installation, invoke the skill by name or use /doubao-image-auto
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v2.0.0
Upgraded to 2.0: added the pure API workflow, Cookie reuse notes, and download and auto-save capabilities
Metadata
Slug doubao-image-auto
Version 2.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is doubao-image-auto?

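Doubao AI Creation automation: it generates and extracts images through CDP browser automation with no manual interaction. Workflow: 1) connect to an already-open Doubao page 2) navigate to the AI Creation page 3) enter the prompt and trigger generation 4) extract the generated image URLs 5) download and save them locally. It is an AI Agent Skill for Claude Code / OpenClaw, with 120 downloads so far.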

How do I install doubao-image-auto?

Run "/install doubao-image-auto" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is doubao-image-auto free?

Yes, doubao-image-auto is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does doubao-image-auto support?

doubao-image-auto is listed as cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created doubao-image-auto?

It is built and maintained by huangm199 (@huangm199); the current version is v2.0.0.
