mm-output

Name: mm-output
Author: yuriyanzexuan

By yuriYanZeXuan · GitHub · v1.0.1 · MIT-0

cross-platform ⚠ suspicious

Total downloads: 243

Install in OpenClaw

/install mm-output

Description

Parse PDF/Markdown files into structured HTML posters with multi-modal output (PDF, PNG, DOCX, PPTX), or generate poster/slides images via Gemini image gener...

Safe usage recommendations

This package looks functionally consistent with a poster/slide generator that uses LLMs and Gemini-style image APIs, but there are several red flags you should check before installing or running it:
1) The registry lists no required environment variables, yet the code and SKILL.md expect API keys (OpenAI / RUNWAY / IMAGE_GEN / TEXT_MODEL). Do not supply keys until you confirm which are actually required.
2) Inspect the files that call external services (paper2slides/image_generator.py, mm_output/integrate.py, renderer_unit.py) to see the exact HTTP endpoints and what data is sent. The example .env contains non-public domains (runway.devops.rednote.life, runway.devops.xiaohongshu.com); confirm that those endpoints are trusted.
3) install.sh will run apt-get and download and install tools (UV, Python 3.12, Playwright). Run it only in an isolated environment (container or VM) and review the script first.
4) The repo contains hard-coded example test paths that reference internal network mounts. Ignore or update them before running tests.
5) If you must try it, run it in a disposable container, do not expose real API keys (use limited-scope or test keys), and review the network calls (e.g., with a proxy) to ensure there is no unexpected exfiltration.
Reviewing paper2slides/image_generator.py and mm_output/integrate.py in particular will show exactly which endpoints and parameters the code posts to, so you can judge trustworthiness.
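As a first pass at the endpoint review recommended above, the sketch below lists every hostname that appears in an http(s) URL inside the unpacked bundle. It is a rough aid, not a substitute for reading the code: the function names `find_hosts` and `scan_tree` and the set of file suffixes are assumptions made for illustration, and URLs assembled at runtime from string fragments will not be found.

```python
import re
from pathlib import Path
from urllib.parse import urlparse

# Matches http(s) URLs up to the first whitespace, quote, bracket or paren.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def find_hosts(text: str) -> set[str]:
    """Return the hostnames of all http(s) URLs found in text."""
    hosts = set()
    for url in URL_PATTERN.findall(text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            # Malformed URL (for example a broken IPv6 literal); skip it.
            continue
        if host:
            hosts.add(host)
    return hosts


def scan_tree(root: str, suffixes=(".py", ".sh", ".md", ".txt")) -> dict[str, set[str]]:
    """Map each matching file under root to the hostnames it mentions."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            hosts = find_hosts(text)
            if hosts:
                results[str(path)] = hosts
    return results
```

Calling `scan_tree` on the directory where the skill was unpacked gives a per-file list of hosts to compare against the vendors you actually intend to use.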

Functional analysis

Type: OpenClaw Skill
Name: mm-output
Version: 1.0.1

The skill bundle is a legitimate and highly functional tool for converting PDF and Markdown documents into multi-modal outputs like HTML posters, PDFs, PNGs, and PPTX slides. It utilizes standard industry libraries such as marker-pdf for parsing, Playwright for browser-based rendering, and python-docx/python-pptx for document generation. While the installation script (install.sh) requires root privileges to install system dependencies (fonts and Chromium libraries) and the code interacts with external LLM APIs (OpenAI and Gemini), all behaviors are strictly aligned with the stated purpose. No evidence of data exfiltration, credential theft, or malicious prompt injection was found.

Capability assessment

⚠ Purpose & Capability

The code, README, and SKILL.md all describe LLM-driven rendering and Gemini image generation, which legitimately require API keys and network access. However, the registry metadata claims no required environment variables or credentials, while the project clearly expects OPENAI_API_KEY / RUNWAY_API_KEY / IMAGE_GEN_API_KEY / TEXT_MODEL, etc. That mismatch (none declared vs. several actually required) is an inconsistency that should be resolved before trusting the skill.
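One way to make the declared-versus-required comparison concrete is to extract the environment variable names that the Python sources read and subtract whatever the registry declares. The sketch below assumes the code reads variables through `os.getenv`, `os.environ.get`, or `os.environ[...]` with literal names; that is an assumption about the bundle, not something the listing confirms, and variables loaded another way (for example through a dotenv helper) would be missed. The names `env_vars_read` and `undeclared` are hypothetical.

```python
import re

# Literal variable names passed to os.getenv(...), os.environ.get(...)
# or used as os.environ[...] subscripts.
ENV_PATTERN = re.compile(
    r"""os\.(?:getenv\(|environ\.get\(|environ\[)\s*['"]([A-Za-z_][A-Za-z0-9_]*)['"]"""
)


def env_vars_read(source: str) -> set[str]:
    """Return the environment variable names that the source text reads."""
    return set(ENV_PATTERN.findall(source))


def undeclared(declared: set[str], source: str) -> list[str]:
    """Return the variables read by the source but missing from declared, sorted."""
    return sorted(env_vars_read(source) - set(declared))
```

With an empty `declared` set, which is what the registry metadata reports here, every variable the sources read shows up in the result.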

⚠ Instruction Scope

SKILL.md and run.py instruct the agent to install system packages, create a local .env file, and call LLM/image generation backends. The included .env.txt points at non-standard endpoints (e.g., runway.devops.rednote.life and runway.devops.xiaohongshu.com). The instructions do not try to read unrelated host credentials, but they do direct potentially sensitive content (parsed document text and images) to external LLM/image endpoints — expected for this tool but worth verifying which endpoints and with what keys.

ℹ Install Mechanism

There is no platform 'install' metadata, but the bundle includes install.sh that: runs apt-get to install system libraries (requires root), downloads the UV installer via curl, installs Python 3.12 with UV, and installs Playwright browsers. Those are standard but invasive operations (system package installs, network downloads). The curl installer is from astral.sh (UV project) which is a known source; pip/uv dependencies also include a GitHub package (git+https://github.com/Hadlay-Zhang/marker.git). No obviously obfuscated or random download URLs in the install script, but running install.sh will materially change the host system.
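Before running install.sh, it can help to list the lines that install system packages, download from the network, or pipe a download into a shell. The following is a minimal sketch of such a pre-review pass. The pattern set and the `flag_lines` helper are illustrative assumptions, the actual contents of install.sh are not reproduced in this listing, and a clean result does not prove the script is safe.

```python
import re

# Heuristic patterns for operations that change the host or fetch remote code.
RISKY_PATTERNS = {
    "system package install": re.compile(r"\bapt(?:-get)?\s+(?:-\S+\s+)*install\b"),
    "remote download": re.compile(r"\b(?:curl|wget)\b"),
    "pipe to shell": re.compile(r"\|\s*(?:sudo\s+)?(?:ba)?sh\b"),
    "sudo": re.compile(r"\bsudo\b"),
}


def flag_lines(script: str) -> list[tuple[int, str, str]]:
    """Return (line number, label, line) for each non-comment line matching a pattern."""
    findings = []
    for number, line in enumerate(script.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines, comments and the shebang
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(stripped):
                findings.append((number, label, stripped))
    return findings
```

Passing the text of install.sh to `flag_lines` yields a short list of lines to read closely before deciding whether to run the script in a container or VM.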

⚠ Credentials

The skill needs multiple API keys (OpenAI/Gemini/Qwen/RUNWAY/IMAGE_GEN) according to SKILL.md and .env.txt, but the registry did not declare them. The example .env includes IMAGE_GEN_ENDPOINT pointing to a non-public domain (runway.devops.rednote.life) and README references runway.devops.xiaohongshu.com — both look like internal/service endpoints rather than a public vendor endpoint. Requiring multiple secrets and an internal endpoint without declaring them or explaining trust boundaries is disproportionate and raises exfiltration/third-party trust concerns.
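The trust-boundary question can be checked mechanically by parsing the example .env and flagging any URL-valued entry whose host is not on a list you have approved. The sketch below assumes simple `KEY=value` lines. The helper names and the idea of a `trusted_hosts` allowlist are illustrative and not part of the skill, and which hosts belong on the allowlist is your decision.

```python
from urllib.parse import urlparse


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def untrusted_endpoints(env: dict[str, str], trusted_hosts: set[str]) -> dict[str, str]:
    """Map each URL-valued key to its host when that host is not in trusted_hosts."""
    flagged = {}
    for key, value in env.items():
        if "://" not in value:
            continue  # not a URL, e.g. a model name or an API key
        try:
            host = urlparse(value).hostname
        except ValueError:
            host = None  # malformed URL; nothing to compare
        if host and host not in trusted_hosts:
            flagged[key] = host
    return flagged
```

Run against the bundle's .env.txt with an allowlist of public vendor hosts, an entry such as IMAGE_GEN_ENDPOINT pointing at runway.devops.rednote.life would be reported for manual review.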

✓ Persistence & Privilege

The skill is not marked always:true and does not request system-wide privileges in metadata. It will create a virtualenv and a .env file inside the project directory (normal). The included install.sh does require root to apt-get packages — that is a user action and not an automatic privilege escalation by the skill itself.

How to use

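Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install mm-output
Once installation completes, invoke the Skill by name or trigger it with /mm-output
Provide the required inputs according to the Skill's parameter description to get structured output.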

Version history

v1.0.1

Version 1.0.1: Major update (local Python skill, multi-LLM, slides and poster image generation).
- Renamed and refactored skill to "postergen-parser" for local Python usage (not Docker-only).
- Added support for multiple LLM APIs: OpenAI, Gemini, and Qwen (MAAS).
- New features: generate poster images and slides images using Gemini's image generation API.
- Added PPTX export and improved multi-modal conversion options (PDF/PNG/DOCX/PPTX).
- New comprehensive setup scripts and dependency management (UV, install.sh, playwright).
- Expanded documentation: detailed configuration, usage examples, and troubleshooting instructions.

v1.0.0

mm-output 1.0.0 initial release
- Run the Docker image yuriyzx/mm-output:latest to convert PDF or Markdown into HTML and multi-modal files (PDF, PNG, DOCX).
- Supports OpenAI-compatible API configuration for LLM-powered rendering.
- Provides Docker-based usage instructions and environment setup.
- Input options: PDF, Markdown; output: HTML posters, PDF, PNG, DOCX.
- Includes troubleshooting tips and advice for image naming and local aliases.

Metadata

Slug: mm-output

Version: 1.0.1

License: MIT-0

Total installs: 0

Current installs: 0

Versions: 2

FAQ

What is mm-output?

Parse PDF/Markdown files into structured HTML posters with multi-modal output (PDF, PNG, DOCX, PPTX), or generate poster/slides images via Gemini image gener... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 243 total downloads to date.

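How do I install mm-output?

Run the command /install mm-output in the OpenClaw or Claude Code chat box to install it in one step. The listing states that no additional configuration is needed, although the capability assessment above notes that the skill expects API keys in a local .env file.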

Is mm-output free?

Yes. mm-output is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does mm-output support?

mm-output is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops mm-output?

It is developed and maintained by yuriYanZeXuan (@yuriyanzexuan); the current version is v1.0.1.