Gemini Video Analyzer

Author: aiwithabidi · GitHub · v1.0.0
cross-platform ⚠ suspicious
Total downloads: 392
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install gemini-video-analyzer
Description
Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe...
Usage (SKILL.md)

Gemini Video Analyzer

Analyze videos natively using Google Gemini's multimodal API. No frame extraction needed — Gemini processes video at 1 FPS with full motion, audio, and visual understanding.

Quick Start

# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4

# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"

# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup

Supported Formats

MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
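
A pre-flight check can catch unsupported files before any upload starts. The sketch below is illustrative and not taken from the bundled scripts: `check_video` is a hypothetical helper, the extension set mirrors the list above, and the size cap assumes "2GB" means 2 GiB.

```python
from pathlib import Path

# Extensions mirror the Supported Formats list; the cap assumes 2GB == 2 GiB.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".webm",
    ".flv", ".mpeg", ".mpg", ".wmv", ".3gp",
}
MAX_BYTES = 2 * 1024**3


def check_video(path: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte cap")
    return problems
```

In practice the size would come from `os.path.getsize(path)`; it is passed in here so the check can be exercised without touching the filesystem.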

How It Works

  1. Video uploads to Google's Files API (temporary, auto-deletes after 48h)
  2. Gemini processes at 1 frame/sec — understands motion, transitions, audio context
  3. Model generates response based on your prompt
  4. Because Gemini sees the sequence rather than isolated stills, this generally captures temporal content better than manual frame extraction
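
The upload, wait, and prompt flow above can be sketched roughly as follows. This is a minimal illustration, not the code in `analyze.py`: `wait_until_active` and `build_request_body` are hypothetical names, `get_state` is an injected callable standing in for a real status lookup, and the state strings and body layout are assumed to follow the Files API and `generateContent` conventions.

```python
import time
from typing import Callable


def wait_until_active(
    get_state: Callable[[str], str],
    file_name: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until the uploaded file leaves PROCESSING.

    get_state is a hypothetical callable returning the file's state string;
    it is injected so the loop can run without network access.
    """
    waited = 0.0
    while True:
        state = get_state(file_name)
        if state == "ACTIVE":
            return
        if state == "FAILED":
            raise RuntimeError(f"{file_name} failed processing")
        if waited >= timeout_s:
            raise TimeoutError(f"{file_name} still {state} after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s


def build_request_body(file_uri: str, mime_type: str, prompt: str) -> dict:
    """Build a generateContent-style body pairing the video with the prompt."""
    return {
        "contents": [{
            "parts": [
                {"file_data": {"mime_type": mime_type, "file_uri": file_uri}},
                {"text": prompt},
            ]
        }]
    }
```

Polling with a timeout keeps the script from hanging if processing stalls, and a `FAILED` state surfaces as an error instead of an empty analysis.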

Use Cases

| Task | Example Prompt |
| --- | --- |
| General description | (default, no prompt needed) |
| UI/text extraction | "What text and UI elements are visible?" |
| Tutorial summary | "Summarize the steps shown in this tutorial" |
| Bug report from video | "Describe what went wrong in this screen recording" |
| Meeting notes | "Summarize the key points discussed" |
| Content comparison | Upload 2 videos, ask for differences |

Configuration

Set GOOGLE_AI_API_KEY in your environment or .env file. Get a free key at aistudio.google.com.

Default model: gemini-2.5-flash (fast, cheap, excellent vision). Override with --model gemini-2.5-pro for complex analysis.
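
As a rough illustration of how the key lookup and the `--model` override might be wired together, the sketch below uses `argparse`. It is an assumption about the interface, not the actual contents of `analyze.py`: `build_parser`, `read_api_key`, and the `DEFAULT_PROMPT` text are hypothetical.

```python
import argparse
from typing import Mapping

DEFAULT_MODEL = "gemini-2.5-flash"
DEFAULT_PROMPT = "Describe this video in detail."  # hypothetical default prompt


def build_parser() -> argparse.ArgumentParser:
    """Parse a video path, an optional prompt, and an optional --model override."""
    parser = argparse.ArgumentParser(description="Analyze a video with Gemini")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("prompt", nargs="?", default=DEFAULT_PROMPT,
                        help="question to ask about the video")
    parser.add_argument("--model", default=DEFAULT_MODEL,
                        help="model name, e.g. gemini-2.5-pro")
    return parser


def read_api_key(env: Mapping[str, str]) -> str:
    """Fetch GOOGLE_AI_API_KEY from an environment mapping or exit with a message."""
    key = env.get("GOOGLE_AI_API_KEY", "").strip()
    if not key:
        raise SystemExit("GOOGLE_AI_API_KEY is not set")
    return key
```

Calling `read_api_key(os.environ)` at startup fails fast with a clear message when the key is missing, rather than surfacing an HTTP error later.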

API Reference

See references/gemini-files-api.md for file upload limits, processing details, and advanced options.

Credits

Built by M. Abidi · LinkedIn · YouTube · GitHub · Book a Call

Security Recommendations
This skill appears to do what it says: it uploads videos to Google's generativelanguage Files API and asks Gemini to analyze them. Before installing or using it, consider the following:

  1. Privacy: videos are uploaded to Google and may be retained for up to ~48 hours. Do not upload sensitive or regulated content unless your policy allows it.
  2. API key scope: use a minimally privileged API key, monitor and rotate it, and be aware that requests may incur costs. Test with small files first.
  3. Implementation notes: the scripts send the API key as a query parameter and load entire video files into memory (file_data = f.read()), which can use large amounts of RAM for big files and may fail for very large uploads. You may prefer chunked or resumable uploads and passing credentials via request headers.
  4. Minor inconsistency: the skill declares curl as a required binary but never uses it. That is harmless but unnecessary.
  5. Trust and provenance: a homepage is listed, but the author is not a known official Google source. The full scripts ship in the skill bundle with no obfuscated code, so review them if you need to be extra cautious.

If you plan to use it in production, consider auditing and patching the upload logic (streaming or chunking, and keeping keys out of logs and URLs), and limit the API key's permissions and quota.
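
The two implementation concerns raised here (key in the URL, whole file in memory) could be addressed along these lines. This is a hedged sketch, not a patch for the bundled scripts: `list_files_request` and `iter_chunks` are hypothetical helpers, the `x-goog-api-key` header and the `v1beta/files` path are assumed from Google's public API conventions, and the 8 MiB chunk size is arbitrary.

```python
import urllib.request
from typing import BinaryIO, Iterator

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def list_files_request(api_key: str) -> urllib.request.Request:
    """Build a file-listing request that carries the key in a header, not the URL."""
    return urllib.request.Request(
        f"{API_ROOT}/files",
        headers={"x-goog-api-key": api_key},
        method="GET",
    )


def iter_chunks(stream: BinaryIO, chunk_size: int = 8 * 1024 * 1024) -> Iterator[bytes]:
    """Yield the file in fixed-size pieces instead of one f.read() call."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk
```

Keeping the key out of the query string means it does not appear in URLs that get logged or echoed in error messages, and chunked reads bound memory use to roughly one chunk regardless of video size.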
Functional Analysis
Type: OpenClaw Skill
Name: gemini-video-analyzer
Version: 1.0.0

The `scripts/manage_files.py` script exhibits a potential URL injection vulnerability. The `delete_file` function directly interpolates `sys.argv[2]` (intended to be a Google file resource `name`) into a URL path without explicit sanitization. This could allow a malicious or prompt-injected agent to craft a `name` parameter (e.g., `files/12345?param=value` or `files/../other_resource`) to potentially alter the API request or target unintended endpoints, although the Google API itself might reject malformed resource names. No clear evidence of intentional malicious behavior like unauthorized data exfiltration or backdoors was found; the primary `analyze.py` script appears benign and interacts solely with the legitimate Google Gemini API.
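
One way to close the injection path described here is to validate the resource name before building the URL. The sketch below is illustrative only: `file_url` is a hypothetical helper, and the pattern assumes a resource name is `files/` followed by an ID of up to 40 lowercase letters, digits, or dashes that neither starts nor ends with a dash.

```python
import re

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"

# Assumed shape of a resource name: "files/" plus a 1-40 character ID made of
# lowercase letters, digits, and dashes, with no leading or trailing dash.
_FILE_NAME_RE = re.compile(r"files/[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?")


def file_url(name: str) -> str:
    """Return the resource URL for name, rejecting anything that is not a plain ID."""
    if not _FILE_NAME_RE.fullmatch(name):
        raise ValueError(f"not a valid file resource name: {name!r}")
    return f"{API_ROOT}/{name}"
```

Because `fullmatch` must consume the entire string, inputs such as `files/12345?param=value` or `files/../other_resource` raise `ValueError` before any request is made.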
Capability Assessment
Purpose & Capability
The name and description say the skill uploads video and analyzes it via Google Gemini. The included scripts call generativelanguage.googleapis.com, use the GOOGLE_AI_API_KEY, and perform upload, analysis, and cleanup, so the two are coherent. One minor mismatch: the metadata's requires list names python3 and curl, but the shipped scripts use only Python (urllib). curl is not used anywhere in SKILL.md or the code, so declaring it as required is unnecessary.
Instruction Scope
SKILL.md and the scripts instruct only to read the user-supplied video file and the declared GOOGLE_AI_API_KEY, upload to Google's Files API, poll for processing, and request analysis. There are no instructions to read unrelated host files, secrets, or to send data to third-party endpoints outside the stated Google API domain. The skill will transmit whole video files to Google's servers (expected for this purpose) and may leave them for up to 48 hours per the docs.
Install Mechanism
This is instruction-only with bundled Python scripts and no install spec — nothing is downloaded from arbitrary URLs and no packages are installed automatically. Risk from install mechanisms is low.
Credentials
Only the GOOGLE_AI_API_KEY is required (declared as the primary credential), which is appropriate for accessing Google Generative Language Files API. No unrelated credentials or secrets are requested.
Persistence & Privilege
The skill does not request always:true, does not modify other skills or system-wide configs, and is user-invocable. It runs only when invoked and uses the provided API key for network calls — typical and proportionate.
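How to Use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install gemini-video-analyzer
  3. Once installed, call the skill by name or trigger it with /gemini-video-analyzer
  4. Provide the inputs described in the skill's parameter documentation to get structured output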
Version History
v1.0.0
Initial release of gemini-video-analyzer.
- Native video analysis using Google Gemini API with support for full scene description, text/UI extraction, object/action identification, and question answering.
- Supports multiple video formats (MP4, AVI, MOV, etc.) up to 2GB per file.
- Processes videos at 1 FPS with motion, audio, and visual understanding, with no manual frame extraction needed.
- Includes command-line scripts for analysis, file management, and prompt-based queries.
- Requires a Google AI API key; configurable via environment variable.
- Suitable for summarizing, extracting information, comparing videos, and analyzing tutorials or walkthroughs.
Metadata
Slug gemini-video-analyzer
Version 1.0.0
License
Total installs 0
Current installs 0
Version count 1
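FAQ

What is Gemini Video Analyzer?

Native video analysis using Google Gemini API. Upload and analyze video files: describe scenes, extract text/UI, answer questions about content, transcribe... It is an AI agent skill plugin for Claude Code / OpenClaw, with 392 total downloads to date.

How do I install Gemini Video Analyzer?

Run the command /install gemini-video-analyzer in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, though running the skill requires GOOGLE_AI_API_KEY to be set.

Is Gemini Video Analyzer free?

Yes. Gemini Video Analyzer is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Gemini Video Analyzer support?

Gemini Video Analyzer is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Gemini Video Analyzer?

It is developed and maintained by aiwithabidi (@aiwithabidi); the current version is v1.0.0.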
