Description

Automate Midjourney Alpha web image generation from Claude using the authenticated https://alpha.midjourney.com session. Use this skill whenever the user wan...

Usage (SKILL.md)

Auto Midjourney

Name: Auto Midjourney
Author: standed

Use the user's own Midjourney Alpha web session to submit imagine jobs and optionally poll for results.

This skill is intended for conservative, user-triggered assistance rather than unattended bulk automation.

What this skill does

Submits prompts to https://alpha.midjourney.com/api/submit-jobs
Defaults to Midjourney v8 for new prompts unless the user explicitly requests another version
Keeps credentials in environment variables instead of hardcoding them into the skill
Supports a one-command flow through scripts/run_imagine.py
Can read https://alpha.midjourney.com/api/user-mutable-state to inspect current web settings
Can infer user_id and singleplayer_<midjourney_id> from the authenticated cookie
Applies local conservative throttling to reduce accidental request bursts
Includes scripts/mj_doctor.py for setup validation
Includes an experimental recent-jobs reader
Supports browser transport backed by Chrome DevTools Protocol, with Playwright-over-CDP preferred when installed
Supports optional polling once a job-status endpoint has been confirmed
Includes prompt-craft guidance and reusable scenario presets for better prompt writing
Includes a structured prompt builder and opt-in quality profiles
Includes dedicated guidance for character sheets, split-view turnarounds, and reusable design assets

Current scope

This version focuses on:

imagine
Midjourney v8 as the default version
Alpha web flow, not Discord bot flow
Safe local configuration via .env
Easier operation through inferred IDs, presets, and doctor checks

Implemented:

imagine submit through the Alpha web flow
browser-backed verification using low-impact CDP network watching plus page asset fallback
local download of the 4 returned image assets
optional browser-side conversion from webp to png during download
sequential batch generation

Not implemented yet:

Upscale / variation / reroll button actions
Image upload / reference-image workflow
Automatic result download from a confirmed final image endpoint

Those should be added only after capturing stable request samples from the browser.

Safety posture

Do not optimize this skill for bypassing restrictions, hiding automation, rotating accounts, or mass unattended generation.

Use these guardrails instead:

trigger requests manually
keep request frequency low
leave local throttling enabled
validate config with mj_doctor.py
prefer one human action to one live submit

The goal is risk reduction through conservative usage, not evasion.
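The "leave local throttling enabled" guardrail can be pictured with a minimal sketch like the one below. It is an illustration only, not the skill's actual implementation: the class name `SubmitThrottle` and its methods are hypothetical, and the only values taken from this document are the defaults (3 seconds minimum spacing, caps disabled when set to 0).

```python
import time
from collections import deque


class SubmitThrottle:
    """Hypothetical local pacing guard: minimum spacing plus optional caps (0 disables a cap)."""

    def __init__(self, min_interval=3.0, max_per_hour=0, max_per_day=0):
        self.min_interval = min_interval
        self.max_per_hour = max_per_hour
        self.max_per_day = max_per_day
        self.history = deque()  # timestamps of recorded submits, oldest first

    def seconds_until_allowed(self, now=None):
        now = time.time() if now is None else now
        # Entries older than one day no longer count toward any cap.
        while self.history and now - self.history[0] >= 86400:
            self.history.popleft()
        wait = 0.0
        if self.history:
            # Enforce the minimum spacing since the most recent submit.
            wait = max(wait, self.history[-1] + self.min_interval - now)
        for cap, window in ((self.max_per_hour, 3600), (self.max_per_day, 86400)):
            if cap > 0:
                recent = [t for t in self.history if now - t < window]
                if len(recent) >= cap:
                    # Wait until the oldest submit counted toward the cap leaves the window.
                    wait = max(wait, recent[-cap] + window - now)
        return max(wait, 0.0)

    def record(self, now=None):
        self.history.append(time.time() if now is None else now)
```

A caller would check `seconds_until_allowed()` before each live submit, sleep for that long if it is non-zero, and call `record()` only after the request has actually been sent.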

Trigger rules

Use this skill proactively when the user asks to:

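“Generate images with Midjourney / MJ” (“用 Midjourney / MJ 出图”)
“Generate with Midjourney v8” (“用 Midjourney v8 生成”)
“Submit an imagine for me” (“帮我提交 imagine”)
“Optimize a Midjourney prompt” (“优化 Midjourney prompt”)
“Make a character design sheet / four-view turnaround / character asset sheet” (“做角色设定稿 / 四视图 / 角色资产图”)
“Capture Midjourney Alpha website requests” (“抓 Midjourney Alpha 网站请求”)
“Poll Midjourney job status” (“轮询 Midjourney job 状态”)
“Turn the Midjourney web app into an automation capability” (“把网页版 Midjourney 做成自动化能力”)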

Required configuration

Read these values from .env or shell environment:

Variable	Required	Purpose
`MJ_COOKIE`	Yes	Full authenticated Cookie header copied from browser
`MJ_CHANNEL_ID`	Yes	Alpha web singleplayer channel ID
`MJ_STATUS_URL_TEMPLATE`	No	Job status endpoint template containing `{job_id}`
`MJ_USER_STATE_PATH`	No	Defaults to `/api/user-mutable-state`
`MJ_RECENT_JOBS_URL`	No	Experimental recent-jobs endpoint
`MJ_MODE`	No	`fast` by default
`MJ_PRIVATE`	No	`true` by default
`MJ_MIN_SUBMIT_INTERVAL_SECONDS`	No	Local minimum spacing between submits. Default is `3` seconds
`MJ_MAX_SUBMITS_PER_HOUR`	No	Local hourly cap. `0` (the default) disables it
`MJ_MAX_SUBMITS_PER_DAY`	No	Local daily cap. `0` (the default) disables it
`MJ_USER_ID`	No	Usually inferred from the auth cookie
`MJ_METRICS_TOKEN`	No	Optional token observed on telemetry requests
`MJ_BROWSER_BACKEND`	No	`auto` by default. Set `playwright` or `cdp` to force a backend

Never write real cookies or tokens into SKILL.md, reference files, git-tracked scripts, or user-facing summaries.
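As a rough illustration of the table above, the following sketch reads the variables from a mapping (the process environment by default), applies the documented defaults, and keeps the cookie out of `repr` output. The names `MJConfig`, `load_config`, and `TRUTHY` are hypothetical, the accepted spellings for `MJ_PRIVATE` are an assumption, and the sketch does not parse `.env` files itself.

```python
import os
from dataclasses import dataclass, field
from typing import Optional

# Assumed truthy spellings; the source does not list the accepted values.
TRUTHY = {"1", "true", "yes", "on"}


@dataclass
class MJConfig:
    cookie: str = field(repr=False)  # excluded from repr so it never lands in logs
    channel_id: str = ""
    mode: str = "fast"
    private: bool = True
    min_submit_interval: float = 3.0
    max_per_hour: int = 0
    max_per_day: int = 0
    browser_backend: str = "auto"
    status_url_template: Optional[str] = None


def load_config(env=None):
    """Build a config from environment-style variables, failing fast on missing required ones."""
    env = os.environ if env is None else env
    missing = [name for name in ("MJ_COOKIE", "MJ_CHANNEL_ID") if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return MJConfig(
        cookie=env["MJ_COOKIE"],
        channel_id=env["MJ_CHANNEL_ID"],
        mode=env.get("MJ_MODE", "fast"),
        private=env.get("MJ_PRIVATE", "true").strip().lower() in TRUTHY,
        min_submit_interval=float(env.get("MJ_MIN_SUBMIT_INTERVAL_SECONDS", "3")),
        max_per_hour=int(env.get("MJ_MAX_SUBMITS_PER_HOUR", "0")),
        max_per_day=int(env.get("MJ_MAX_SUBMITS_PER_DAY", "0")),
        browser_backend=env.get("MJ_BROWSER_BACKEND", "auto"),
        status_url_template=env.get("MJ_STATUS_URL_TEMPLATE") or None,
    )
```

Hiding the cookie from `repr` is one simple way to honor the rule about never echoing real credentials into summaries or logs.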

System requirements

For platform and device requirements, read system-requirements.md.

Workflow

Scenario 0: Check config first

Run:

python3 scripts/mj_doctor.py --fetch-user-state --transport browser

This shows:

whether the cookie exists
inferred midjourney_id
inferred channel_id
current server-side speed and visibility
current local safe-limit settings

Scenario 1: Submit one prompt

Run:

python3 scripts/run_imagine.py "1 girl --ar 16:9" --transport browser

Default behavior:

Appends --v 8 if the prompt does not already specify a version
Appends --raw by default unless the user disables it
Uses MJ_MODE and MJ_PRIVATE from the environment
Can sync server-side defaults before submitting
Records the submit locally and enforces conservative pacing
Prints structured JSON with the request payload, submission response, and extracted job_id

For simplest live use:

python3 scripts/run_imagine.py "cinematic portrait of a fox astronaut" --transport browser --sync-user-state --wait-page-assets --download --convert-to png

When --wait-page-assets is enabled, the browser transport now prefers watching Midjourney's existing in-page network traffic for the submitted job_id. It falls back to page asset probing only if the low-impact watcher does not yield 4 images.
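The prefer-then-fall-back behavior described above might look roughly like this. Both callables are placeholders supplied by the caller (hypothetical names, not functions from the skill), so the sketch only shows the control flow: try the low-impact watcher first, and probe page assets only when it yields fewer than 4 images.

```python
def collect_image_urls(job_id, watch_network, probe_page_assets, expected=4):
    """Return (source, urls), preferring the low-impact network watcher."""
    urls = list(watch_network(job_id))
    source = "network-watcher"
    if len(urls) < expected:
        # The watcher did not see enough images, so fall back to page asset probing.
        urls = list(probe_page_assets(job_id))
        source = "page-asset-probe"
    return source, urls
```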

Scenario 2: Submit and wait

If a working status endpoint has been captured and stored in MJ_STATUS_URL_TEMPLATE, run:

python3 scripts/run_imagine.py "cinematic portrait of a fox astronaut --ar 16:9" --wait
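A minimal polling sketch under stated assumptions: the status endpoint and its response shape are not confirmed in this document, so the HTTP call (`fetch`) and the completion check (`is_done`) are injected by the caller rather than hard-coded. The function name `poll_job` and its parameters are hypothetical; only the `{job_id}` placeholder comes from `MJ_STATUS_URL_TEMPLATE`.

```python
import time


def poll_job(job_id, url_template, fetch, is_done,
             interval=5.0, timeout=300.0, sleep=time.sleep, clock=time.monotonic):
    """Poll a status URL built from the template until is_done(payload) is true."""
    if "{job_id}" not in url_template:
        raise ValueError("status URL template must contain {job_id}")
    url = url_template.replace("{job_id}", job_id)
    deadline = clock() + timeout
    while True:
        payload = fetch(url)  # caller-supplied; e.g. a browser-backed request
        if is_done(payload):
            return payload
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

Keeping the completion predicate outside the function means the loop does not need to change once real response samples have been captured from the browser.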

Scenario 3: Low-risk debugging

Use dry-run first when changing payload structure:

python3 scripts/run_imagine.py "robot barista in tokyo alley" --dry-run

This validates prompt normalization and payload generation without sending a live request.

Scenario 3b: Use a preset

Run:

python3 scripts/run_imagine.py "silver perfume bottle on black glass" --preset product --sync-user-state

Preset definitions live in config/presets.example.json.

When the user wants better prompt wording, templates, or parameter tradeoffs, read prompt-craft.md.

Scenario 3c: Build a prompt from a template

Run:

python3 scripts/mj_prompt_helper.py --template product --subject "premium silver perfume bottle" --camera "front three-quarter angle" --surface "black glass surface" --lighting "controlled softbox rim light" --background "dark charcoal background" --mood "minimal luxury beauty campaign" --preset product_clean_square --quality-profile final_v8_q4 --json

This produces a V8-friendly prompt string and a ready-to-run run_imagine.py command.

Scenario 4: Read current web settings

Run:

python3 scripts/get_user_state.py --transport browser

This reads the same user-mutable-state endpoint the web app uses and returns values such as:

settings.speed
settings.visibility
abilities
saved macros

Command reference

Submit only

python3 scripts/submit_job.py "minimalist glass monolith --ar 16:9 --v 8"

Poll one job

python3 scripts/poll_job.py "<job_id>"

Inspect current server-side settings

python3 scripts/get_user_state.py --transport browser

Validate config and inferred identity

python3 scripts/mj_doctor.py --fetch-user-state --transport browser

Experimental recent jobs lookup

python3 scripts/list_recent_jobs.py --amount 10

Submit, verify, and download locally

python3 scripts/run_imagine.py "fashion editorial, silver fabric, studio light --ar 3:4" --transport browser --sync-user-state --wait-page-assets --download --convert-to png

Batch generate and store PNGs

python3 scripts/batch_generate.py config/prompts.example.txt --transport browser --sync-user-state --convert-to png

Use conservative fixed spacing and batch cooldowns when needed:

python3 scripts/batch_generate.py config/prompts.example.txt --transport browser --sync-user-state --convert-to png --batch-size 5 --submit-interval-seconds 120 --batch-cooldown-seconds 600
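To get a feel for how long such a run takes, the pacing can be sketched as a list of pauses. This assumes one plausible reading of the flags (a fixed interval between submits inside a batch, and a cooldown instead of the interval at each batch boundary); the actual semantics in batch_generate.py may differ, and `pacing_delays` is a hypothetical helper.

```python
def pacing_delays(n_prompts, batch_size=5, submit_interval=120, batch_cooldown=600):
    """Return the pause in seconds to take before each prompt, in submission order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    delays = []
    for index in range(n_prompts):
        if index == 0:
            delays.append(0)  # the first prompt is submitted immediately
        elif index % batch_size == 0:
            delays.append(batch_cooldown)  # a new batch starts after the cooldown
        else:
            delays.append(submit_interval)
    return delays
```

Under this reading, 7 prompts with the values shown above spend `sum(pacing_delays(7))` seconds, i.e. 1200 seconds, in deliberate waiting, not counting generation time.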

Apply an opt-in quality profile to the whole batch:

python3 scripts/batch_generate.py config/prompts.example.txt --transport browser --sync-user-state --quality-profile final_v8_q4 --convert-to png

Convert existing WEBP downloads to PNG

python3 scripts/convert_downloads.py outputs

Or write converted files into a separate directory:

python3 scripts/convert_downloads.py outputs --output-dir outputs/png-converted

If you want to rebuild PNGs from a saved result manifest instead of local files:

python3 scripts/convert_downloads.py outputs/live-test/recent-job.json --output-dir outputs/png-rebuilt

The run_imagine.py command under "Submit, verify, and download locally" waits for the submitted job_id to appear in the browser page resource list, verifies that 4 Midjourney CDN image URLs exist, and downloads the returned image files into outputs/<job_id>/.

Batch generate sequentially

Create a text file with one prompt per line, then run:

python3 scripts/batch_generate.py prompts.txt --transport browser --sync-user-state --download-dir outputs/batch

This submits prompts one by one, waits for each job_id to produce page resource URLs, and downloads the returned image files before moving to the next prompt.

Full flow

python3 scripts/run_imagine.py "fashion editorial, silver fabric, studio light --ar 3:4" --sync-user-state --wait-recent-jobs --download

Prompt defaults

When the user does not specify MJ flags:

Default to --v 8
Default to --raw
Preserve any explicit aspect ratio or style flags already present
Do not auto-add --hd
Treat --q 4 as an opt-in final-pass override rather than a default for the current V8-focused flow

Do not silently override explicit user flags.
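The defaulting rules above can be sketched as a small normalizer. This is an illustration, not the code in run_imagine.py: `normalize_prompt` is a hypothetical name, and the detection regexes only look for `--v` / `--version` and `--raw` as standalone flags.

```python
import re


def normalize_prompt(prompt, default_version="8", raw=True):
    """Append --v and --raw only when the prompt does not already carry them."""
    prompt = prompt.strip()
    has_version = re.search(r"(?<!\S)--(?:v|version)\s+\S+", prompt) is not None
    has_raw = re.search(r"(?<!\S)--raw(?!\S)", prompt) is not None
    if not has_version:
        prompt += f" --v {default_version}"
    if raw and not has_raw:
        prompt += " --raw"
    # Explicit flags such as --ar are left untouched; --hd and --q are never added here.
    return prompt
```

For example, `normalize_prompt("1 girl --ar 16:9")` yields `1 girl --ar 16:9 --v 8 --raw`, while a prompt that already contains `--v 7` keeps that version.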

Output format

When you use this skill, report back in this structure:

Prompt: <final prompt sent>
Job ID: <job_id or "not returned">
Mode: <fast/relax/etc>
Visibility: <private/public>
Status: <submitted / polled / failed / dry-run>
Notes: <missing status endpoint, saved JSON path, or next step>
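A small helper can render this structure consistently. The function name and parameters below are hypothetical; only the six field labels and the "not returned" fallback come from the format above.

```python
def format_report(prompt, job_id=None, mode="fast", private=True,
                  status="submitted", notes=""):
    """Render the six-line report in the order the skill asks for."""
    lines = [
        f"Prompt: {prompt}",
        f"Job ID: {job_id or 'not returned'}",
        f"Mode: {mode}",
        f"Visibility: {'private' if private else 'public'}",
        f"Status: {status}",
        f"Notes: {notes}",
    ]
    return "\n".join(lines)
```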

Simpler usage model

For day-to-day use, prefer this sequence:

python3 scripts/mj_doctor.py --fetch-user-state --transport browser
python3 scripts/run_imagine.py "<prompt>" --transport browser --sync-user-state
python3 scripts/run_imagine.py "<prompt>" --transport browser --sync-user-state --wait-page-assets --download

This reduces manual mistakes and keeps you in a low-frequency workflow. On this machine, browser transport is the preferred live path because raw HTTP requests are blocked by Cloudflare.

Success criteria for a usable generation

Treat a generation as verified only when all of these are true:

submit response contains a job_id
page resource entries contain 4 CDN image URLs for the same job_id
the image files are downloaded successfully to local disk

Normal Midjourney imagine behavior is a 4-image grid. Depending on the endpoint response, you may get one grid image file rather than four separately cropped files. This skill currently verifies and downloads the returned image assets as provided by Midjourney.
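The three success criteria can be expressed as a single check, sketched below. It assumes the submit response exposes the ID under a top-level `job_id` key and that CDN URLs contain the job ID as a substring; both are assumptions, and `verify_generation` is a hypothetical helper rather than part of the skill.

```python
from pathlib import Path


def verify_generation(submit_response, asset_urls, downloaded_paths, expected_images=4):
    """Return (ok, reason) after checking job_id, matching image URLs, and local files."""
    job_id = submit_response.get("job_id")  # assumed top-level key
    if not job_id:
        return False, "submit response has no job_id"
    matching = {url for url in asset_urls if job_id in url}
    if len(matching) < expected_images:
        return False, f"only {len(matching)} of {expected_images} image URLs reference {job_id}"
    paths = [Path(p) for p in downloaded_paths]
    if not paths:
        return False, "no files were downloaded"
    bad = [str(p) for p in paths if not p.is_file() or p.stat().st_size == 0]
    if bad:
        return False, "missing or empty files: " + ", ".join(bad)
    return True, "verified"
```

`expected_images` is left as a parameter because, as noted above, some responses may provide one grid image instead of four separate files.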

Extension path

After the user captures more browser requests, extend in this order:

Confirm job-status endpoint and final-image fields
Add result downloader
Add button-action support for upscale / variation
Add reference image upload flow
Add prompt-template helpers for common MJ styles

Confirmed non-status endpoints

POST https://proxima.midjourney.com/ is currently treated as telemetry ingestion
POST /api/v1/traces is currently treated as tracing/observability data
GET /api/user-mutable-state is useful for reading current speed and visibility defaults

Do not mistake telemetry endpoints for job-status APIs.
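When sifting through captured browser requests, a small classifier like the following can keep these known endpoints from being mistaken for job-status candidates. The function name and the returned labels are hypothetical; the three method and path combinations are the ones listed above.

```python
from urllib.parse import urlparse


def classify_request(method, url):
    """Label requests already known not to be job-status APIs; everything else is 'unknown'."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    path = parts.path or "/"
    method = method.upper()
    if method == "POST" and host == "proxima.midjourney.com":
        return "telemetry"
    if method == "POST" and path == "/api/v1/traces":
        return "tracing"
    if method == "GET" and path == "/api/user-mutable-state":
        return "user-state"
    return "unknown"  # only these remain candidates for a status endpoint
```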

GitHub-informed but experimental path

The skill includes an experimental recent-jobs reader based on public GitHub reverse-engineering notes. Treat it as best-effort support rather than a stable API contract.

Safe usage recommendations

Do not install or run this skill until you accept the risk of supplying a full authenticated Midjourney web cookie and channel ID to local scripts. Key points to consider:

The registry metadata omits the required credentials: SKILL.md and the code require MJ_COOKIE (the full authenticated Cookie header) and MJ_CHANNEL_ID. A full cookie allows the code to act as you on the Midjourney web app, so treat it like a password.
The repo contains Python and Node scripts (Playwright dependency). You will need to run npm install and provide a Python runtime; Playwright can download browsers and execute code that interacts with remote sites and local files.
If you proceed, run the code in an isolated environment (dedicated VM/container) and avoid using your primary Midjourney account. Prefer a disposable/test account if possible.
Inspect the key scripts (mj_alpha.py, mj_browser.*, run_imagine.py, mj_doctor.py) yourself or have a trusted developer review them; verify they do not leak your cookie to third-party endpoints. The references mention telemetry endpoints (proxima.midjourney.com); the skill says it doesn't need them, but review the runtime behavior.
Corrective action: ask the publisher/registry to update the skill metadata to explicitly declare the required env vars and primary credential, and include a clear install spec and minimal required commands. If that is not fixed, treat the omission as a red flag and be cautious.

Functional analysis

Type: OpenClaw Skill
Name: auto-midjourney
Version: 0.1.2

The skill automates the Midjourney Alpha web interface by requiring the user's full authentication cookie (MJ_COOKIE) and controlling a local Chrome instance via Chrome DevTools Protocol (CDP) or Playwright. It employs high-risk execution patterns, including subprocess calls to curl, node, and osascript (scripts/mj_browser.py, scripts/mj_alpha.py), and executes JavaScript within the browser context using eval (scripts/mj_playwright_bridge.mjs). While the logic appears aligned with its stated purpose of image generation and lacks evidence of intentional exfiltration to third-party domains, the requirement for sensitive session cookies and the use of powerful browser automation tools represent a significant security risk and a broad attack surface.

Capability assessment

⚠ Purpose & Capability

The name/description match the implementation: the repo submits Midjourney Alpha web 'imagine' jobs and uses browser/CDP for verification. However the registry metadata lists no required environment variables or primary credential while SKILL.md and the scripts require a full authenticated cookie (MJ_COOKIE) and channel ID (MJ_CHANNEL_ID). That omission is an incoherence: a skill that acts on behalf of a logged-in web session legitimately needs those values and should have declared them.

⚠ Instruction Scope

Runtime instructions explicitly tell the agent to read .env or shell vars for the full authenticated Cookie header, infer user_id from the cookie, call /api/submit-jobs and /api/user-mutable-state, attach to a live Chrome session via CDP/Playwright, and download/convert assets. Those actions are within the stated purpose but involve highly sensitive credentials and the ability to perform actions as the user — the SKILL.md gives the agent broad discretion to use the cookie and browser session, which increases risk if misused.

ℹ Install Mechanism

The skill is marked instruction-only (no install spec), yet the repo includes package.json/package-lock and Python + Node scripts that expect 'playwright-core' and Python runtime. Users will need to run npm install and install/launch browsers and Python deps manually; the absence of an explicit install spec in the registry is an inconsistency that can lead to surprise (Playwright can download browsers and has network/exec behavior).

⚠ Credentials

SKILL.md requires MJ_COOKIE (full authenticated Cookie header) and MJ_CHANNEL_ID as required configuration and lists other optional tokens (MJ_METRICS_TOKEN, MJ_USER_ID, MJ_BROWSER_BACKEND). Requesting a full session cookie is proportionate to the stated functionality but is high-sensitivity and MUST be declared in the registry metadata — its omission is a serious mismatch. The number of env settings is reasonable for web automation, but the required sensitive credential was not surfaced in the declared requirements.

✓ Persistence & Privilege

The skill does not request 'always: true' and does not attempt to modify other skills or global agent settings. It runs local scripts and writes downloads to local directories, which is expected for this use case.

Version history

v0.1.2

Add explicit lead-generation contact copy for Xiyangshi AI Video team to the listing README so the CTA remains visible even when images fail to render.

v0.1.1

Move public distribution to a standalone GitHub repo and update README to use a GitHub-hosted hook image for GitHub and ClawHub listings.

v0.1.0

Initial public release with browser-backed Midjourney Alpha workflow, batch generation, PNG download, and character-sheet prompt guidance.

Metadata

Slug auto-midjourney

Version 0.1.2

License MIT-0

Total installs 1

Current installs 1

Number of versions 3

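FAQ

What is Auto Midjourney?

Automate Midjourney Alpha web image generation from Claude using the authenticated https://alpha.midjourney.com session. Use this skill whenever the user wan... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 144 total downloads to date.

How do I install Auto Midjourney?

Run the command "/install auto-midjourney" in the OpenClaw or Claude Code chat box for a one-step install. Installation itself needs no extra configuration, but the skill requires MJ_COOKIE and MJ_CHANNEL_ID before use (see Required configuration).

Is Auto Midjourney free?

Yes. Auto Midjourney is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Auto Midjourney support?

Auto Midjourney is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Auto Midjourney?

It is developed and maintained by Standed (@standed). The current version is v0.1.2.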
