
Gemini Video Analyzer

Author: aiwithabidi · GitHub · v1.0.0

cross-platform ⚠ suspicious

Total downloads: 392

Install in OpenClaw

/install gemini-video-analyzer

Description

Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe...

Usage (SKILL.md)

Gemini Video Analyzer

Analyze videos natively using Google Gemini's multimodal API. No frame extraction needed — Gemini processes video at 1 FPS with full motion, audio, and visual understanding.

Quick Start

# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4

# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"

# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup

Supported Formats

MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
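A pre-flight check can catch an unsupported extension or an oversized file before any upload starts. The sketch below is illustrative only: `check_video` is a hypothetical helper, not part of the bundled scripts, and the exact byte value used for "2GB" is an assumption.

```python
import os

# Extensions and size cap taken from the list above; the exact byte value
# for "2GB" is an assumption (2 * 1024**3).
SUPPORTED_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".webm",
    ".flv", ".mpeg", ".mpg", ".wmv", ".3gp",
}
MAX_SIZE_BYTES = 2 * 1024**3


def check_video(path, max_size=MAX_SIZE_BYTES):
    """Return a list of problems found; an empty list means the file looks OK."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if not os.path.isfile(path):
        problems.append("file not found")
    elif os.path.getsize(path) > max_size:
        problems.append("file exceeds size limit")
    return problems
```

Returning a list rather than raising on the first issue lets a caller report every problem at once.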

How It Works

Video uploads to Google's Files API (temporary, auto-deletes after 48h)
Gemini processes at 1 frame/sec — understands motion, transitions, audio context
Model generates response based on your prompt
Because motion and audio context are preserved, this handles temporal content much better than extracting individual frames
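The flow above (upload, wait for processing, then prompt the model) can be sketched roughly as follows. This is not the bundled `analyze.py`: `wait_until_active` and `build_generate_body` are hypothetical helpers, the `get_state` callback stands in for whatever call fetches the file's metadata, and the state names and request-body shape are assumptions based on the public Gemini REST API.

```python
import time


def wait_until_active(get_state, poll_interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll get_state() until the uploaded file leaves PROCESSING.

    get_state is a caller-supplied function (hypothetical here) that returns
    the file's current state string, e.g. by fetching the file's metadata.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state == "ACTIVE":
            return
        if state == "FAILED":
            raise RuntimeError("video processing failed")
        if waited >= timeout:
            raise TimeoutError("video still processing after timeout")
        sleep(poll_interval)
        waited += poll_interval


def build_generate_body(file_uri, mime_type, prompt):
    """Build a generateContent request body that references an uploaded file."""
    return {
        "contents": [{
            "parts": [
                {"file_data": {"mime_type": mime_type, "file_uri": file_uri}},
                {"text": prompt},
            ]
        }]
    }
```

Passing `sleep` and `get_state` in as parameters keeps the polling logic testable without network access.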

Use Cases

Task	Example Prompt
General description	(default — no prompt needed)
UI/text extraction	`"What text and UI elements are visible?"`
Tutorial summary	`"Summarize the steps shown in this tutorial"`
Bug report from video	`"Describe what went wrong in this screen recording"`
Meeting notes	`"Summarize the key points discussed"`
Content comparison	Upload 2 videos, ask for differences

Configuration

Set GOOGLE_AI_API_KEY in your environment or .env file. Get a free key at aistudio.google.com.
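Whether the bundled scripts parse a .env file themselves is not stated here. As a minimal sketch of the lookup order described above (environment first, then .env), a hypothetical `load_api_key` helper might look like this:

```python
import os


def load_api_key(env_file=".env", environ=None):
    """Return GOOGLE_AI_API_KEY from the environment, else from a .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("GOOGLE_AI_API_KEY")
    if key:
        return key
    try:
        with open(env_file, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                # Skip blanks, comments, and lines that are not KEY=VALUE.
                if not line or line.startswith("#") or "=" not in line:
                    continue
                name, _, value = line.partition("=")
                if name.strip() == "GOOGLE_AI_API_KEY":
                    return value.strip().strip("'\"")
    except FileNotFoundError:
        pass
    raise RuntimeError("GOOGLE_AI_API_KEY is not set")
```

Failing loudly when the key is missing avoids sending an unauthenticated request and getting a less obvious error back from the API.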

Default model: gemini-2.5-flash (fast, cheap, excellent vision). Override with --model gemini-2.5-pro for complex analysis.
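How `analyze.py` actually parses its arguments is not shown in this listing. A minimal `argparse` sketch consistent with the usage above (a video path, an optional prompt, and a `--model` override) could look like the following; `parse_args` and the default prompt text are hypothetical.

```python
import argparse

DEFAULT_MODEL = "gemini-2.5-flash"
# Hypothetical default prompt; the real script's wording is not shown here.
DEFAULT_PROMPT = "Describe this video in detail."


def parse_args(argv=None):
    """Parse a video path, an optional prompt, and an optional --model flag."""
    parser = argparse.ArgumentParser(description="Analyze a video with Gemini")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("prompt", nargs="?", default=DEFAULT_PROMPT,
                        help="optional question about the video")
    parser.add_argument("--model", default=DEFAULT_MODEL,
                        help="model name, e.g. gemini-2.5-pro")
    return parser.parse_args(argv)
```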

API Reference

See references/gemini-files-api.md for file upload limits, processing details, and advanced options.

Credits

Built by M. Abidi · LinkedIn · YouTube · GitHub · Book a Call

Security Recommendations

This skill appears to do what it says: it uploads videos to Google's generativelanguage Files API and asks Gemini to analyze them. Before installing or using it, consider the following:

(1) Privacy: videos are uploaded to Google and may be retained for up to ~48 hours. Do not upload sensitive or regulated content unless your policy allows it.
(2) API key scope: use a minimally privileged API key, monitor and rotate it, and be aware that requests may incur costs. Test with small files first.
(3) Implementation notes: the scripts send the API key as a query parameter and load entire video files into memory (file_data = f.read()), which can consume a lot of RAM and may fail for very large uploads. You may prefer chunked or resumable uploads and passing credentials in request headers.
(4) Minor inconsistency: the skill declares curl as a required binary but never uses it. This is harmless but unnecessary.
(5) Trust and provenance: a homepage is listed, but the author is not a known official Google source. The full scripts ship in the skill bundle with no obfuscated code, so review them if you need to be extra cautious.

If you plan to use the skill in production, consider auditing and patching the upload logic (streaming or chunking, keeping keys out of logs and URLs) and limiting the API key's permissions and quota.
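Point (3) can be illustrated with a small sketch: the key travels in the `x-goog-api-key` request header instead of the URL, and the file is read in fixed-size chunks rather than with a single `f.read()`. The function names are hypothetical, the `v1beta` path is an assumption, and a full resumable upload needs more protocol handling than shown here.

```python
import urllib.request

# Host taken from the review above; the "v1beta" path segment is an assumption.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_list_files_request(api_key):
    """Build a GET request for the Files API with the key in a header."""
    return urllib.request.Request(
        f"{API_BASE}/files",
        headers={"x-goog-api-key": api_key},
        method="GET",
    )


def iter_chunks(fileobj, chunk_size=8 * 1024 * 1024):
    """Yield successive chunks so the whole video is never held in memory."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

Keeping the key out of the URL means it is less likely to end up in proxy logs, shell history, or error messages that echo the request URL.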

Functional Analysis

Type: OpenClaw Skill
Name: gemini-video-analyzer
Version: 1.0.0

The `scripts/manage_files.py` script has a potential URL injection vulnerability. The `delete_file` function interpolates `sys.argv[2]` (intended to be a Google file resource `name`) directly into a URL path without sanitization. A malicious or prompt-injected agent could craft a `name` value (e.g., `files/12345?param=value` or `files/../other_resource`) to alter the API request or target unintended endpoints, although the Google API itself might reject malformed resource names. No clear evidence of intentionally malicious behavior, such as unauthorized data exfiltration or backdoors, was found; the primary `analyze.py` script appears benign and interacts solely with the legitimate Google Gemini API.
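One way to close this gap is to validate the resource name against a strict pattern before building the URL. The sketch below is a hypothetical patch, not the bundled code; the pattern (a `files/` prefix plus a short ID of lowercase letters, digits, and dashes) is an assumption about the shape of Files API names and may need adjusting.

```python
import re

# Assumed shape of a Files API resource name: "files/" plus a short ID of
# lowercase letters, digits, and dashes that does not start or end with a dash.
_FILE_NAME_RE = re.compile(r"files/[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?")


def safe_file_name(name):
    """Return name if it looks like a plain file resource name, else raise."""
    if not _FILE_NAME_RE.fullmatch(name):
        raise ValueError(f"refusing suspicious file name: {name!r}")
    return name
```

Because `fullmatch` must consume the entire string, values containing `?`, `..`, or extra path segments are rejected before they reach the URL.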

Capability Assessment

✓ Purpose & Capability

The name and description say: upload a video and analyze it via Google Gemini. The included scripts call generativelanguage.googleapis.com, use the GOOGLE_AI_API_KEY, and perform upload, analysis, and cleanup, so these are coherent. One minor mismatch: the metadata's requires list names python3 and curl, but the shipped scripts use only Python (urllib). curl is not used anywhere in SKILL.md or the code, so declaring it as required is unnecessary.

✓ Instruction Scope

SKILL.md and the scripts instruct only to read the user-supplied video file and the declared GOOGLE_AI_API_KEY, upload to Google's Files API, poll for processing, and request analysis. There are no instructions to read unrelated host files, secrets, or to send data to third-party endpoints outside the stated Google API domain. The skill will transmit whole video files to Google's servers (expected for this purpose) and may leave them for up to 48 hours per the docs.

✓ Install Mechanism

This is instruction-only with bundled Python scripts and no install spec — nothing is downloaded from arbitrary URLs and no packages are installed automatically. Risk from install mechanisms is low.

✓ Credentials

Only the GOOGLE_AI_API_KEY is required (declared as the primary credential), which is appropriate for accessing Google Generative Language Files API. No unrelated credentials or secrets are requested.

✓ Persistence & Privilege

The skill does not request always:true, does not modify other skills or system-wide configs, and is user-invocable. It runs only when invoked and uses the provided API key for network calls — typical and proportionate.

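How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install gemini-video-analyzer
Once installed, invoke the skill by name or trigger it with /gemini-video-analyzer
Provide the inputs described in the skill's parameter documentation to get structured output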

Version History

v1.0.0

Initial release of gemini-video-analyzer.
- Native video analysis using Google Gemini API with support for full scene description, text/UI extraction, object/action identification, and question answering.
- Supports multiple video formats (MP4, AVI, MOV, etc.) up to 2GB per file.
- Processes videos at 1 FPS with motion, audio, and visual understanding; no manual frame extraction needed.
- Includes command-line scripts for analysis, file management, and prompt-based queries.
- Requires a Google AI API key; configurable via environment variable.
- Suitable for summarizing, extracting information, comparing videos, and analyzing tutorials or walkthroughs.

Metadata

Slug gemini-video-analyzer

Version 1.0.0

License (none listed)

Total installs 0

Current installs 0

Version count 1

FAQ