
Gemini Video Analyzer

by aiwithabidi · GitHub ↗ · v1.0.0
cross-platform ⚠ suspicious
392 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install gemini-video-analyzer
Description
Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe...
README (SKILL.md)

Gemini Video Analyzer

Analyze videos natively using Google Gemini's multimodal API. No frame extraction needed — Gemini processes video at 1 FPS with full motion, audio, and visual understanding.

Quick Start

# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4

# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"

# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup

Supported Formats

MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
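
A pre-flight check can reject a file before any upload starts. The sketch below is illustrative only and is not taken from the skill's scripts: `check_video`, `SUPPORTED_EXTENSIONS`, and `MAX_BYTES` are hypothetical names, and it assumes "2GB" means 2 * 1024**3 bytes.

```python
import os

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".webm",
    ".flv", ".mpeg", ".mpg", ".wmv", ".3gp",
}
# Assumption: the documented "2GB" limit is interpreted as 2 GiB.
MAX_BYTES = 2 * 1024**3


def check_video(path, size_bytes=None):
    """Raise ValueError if the extension or size is outside the documented limits."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format: {ext or '(none)'}")
    # size_bytes may be supplied directly; otherwise the filesystem is asked.
    size = os.path.getsize(path) if size_bytes is None else size_bytes
    if size > MAX_BYTES:
        raise ValueError(f"file is {size} bytes; limit is {MAX_BYTES}")
    return ext, size
```

Calling `check_video("/path/to/video.mp4")` before the upload surfaces format and size problems locally instead of after a long transfer.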

How It Works

  1. Video uploads to Google's Files API (temporary, auto-deletes after 48h)
  2. Gemini processes at 1 frame/sec — understands motion, transitions, audio context
  3. Model generates response based on your prompt
  4. Because Gemini sees the clip as a sequence rather than as isolated stills, this captures temporal content better than manual frame extraction
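
Steps 2 and 3 can be sketched as a polling loop followed by a request body that references the uploaded file. This is a minimal sketch, not the skill's actual code: `wait_until_active` and `build_request_body` are hypothetical helpers, the state names ("PROCESSING", "ACTIVE", "FAILED") and the `file_data` body shape are assumed from Google's Files API documentation, and the network call is injected as `fetch_state` so the loop stays independent of any HTTP client.

```python
import time


def wait_until_active(fetch_state, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_state() until it returns "ACTIVE"; raise on "FAILED" or timeout.

    Returns the number of seconds spent sleeping between polls.
    """
    waited = 0.0
    while True:
        state = fetch_state()
        if state == "ACTIVE":
            return waited
        if state == "FAILED":
            raise RuntimeError("video processing failed")
        if waited >= timeout:
            raise TimeoutError(f"still {state} after {waited:.0f}s")
        sleep(interval)
        waited += interval


def build_request_body(file_uri, mime_type, prompt):
    """Assemble a generateContent body that points at the uploaded file by URI."""
    # Assumed body shape: one file_data part followed by one text part.
    return {
        "contents": [{
            "parts": [
                {"file_data": {"mime_type": mime_type, "file_uri": file_uri}},
                {"text": prompt},
            ]
        }]
    }
```

In practice `fetch_state` would wrap a GET on the file resource and return its `state` field; keeping it injectable makes the timeout and failure paths easy to exercise without touching the network.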

Use Cases

Task | Example Prompt
General description | (default — no prompt needed)
UI/text extraction | "What text and UI elements are visible?"
Tutorial summary | "Summarize the steps shown in this tutorial"
Bug report from video | "Describe what went wrong in this screen recording"
Meeting notes | "Summarize the key points discussed"
Content comparison | Upload 2 videos, ask for differences

Configuration

Set GOOGLE_AI_API_KEY in your environment or .env file. Get a free key at aistudio.google.com.

Default model: gemini-2.5-flash (fast, cheap, excellent vision). Override with --model gemini-2.5-pro for complex analysis.
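
The configuration described above could be wired up roughly as follows. This is a sketch under assumptions, not the contents of analyze.py: `parse_args` and `load_api_key` are hypothetical names, and the positional layout (video path, optional prompt) is inferred from the Quick Start commands.

```python
import argparse
import os

DEFAULT_MODEL = "gemini-2.5-flash"


def parse_args(argv):
    """Parse: <video> [prompt] [--model NAME]."""
    parser = argparse.ArgumentParser(description="Analyze a video with Gemini")
    parser.add_argument("video", help="path to the video file")
    # A missing prompt is left as None so the caller can apply its own default.
    parser.add_argument("prompt", nargs="?", default=None)
    parser.add_argument("--model", default=DEFAULT_MODEL)
    return parser.parse_args(argv)


def load_api_key(env=None):
    """Return GOOGLE_AI_API_KEY from the environment, or exit with a clear message."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_AI_API_KEY", "").strip()
    if not key:
        raise SystemExit("GOOGLE_AI_API_KEY is not set")
    return key
```

For example, `parse_args(["/path/to/video.mp4", "--model", "gemini-2.5-pro"])` selects the larger model while leaving the prompt unset.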

API Reference

See references/gemini-files-api.md for file upload limits, processing details, and advanced options.

Credits

Built by M. Abidi · LinkedIn · YouTube · GitHub · Book a Call

Usage Guidance
This skill appears to do what it says: it uploads videos to Google's generativelanguage Files API and asks Gemini to analyze them. Before installing or using it, consider the following:
  1. Privacy: videos are uploaded to Google and may be retained for up to ~48 hours. Do not upload sensitive or regulated content unless your policy allows it.
  2. API key scope: use a minimally privileged API key, monitor and rotate it, and be aware that requests may incur costs. Test with small files first.
  3. Implementation notes: the scripts send the API key as a query parameter and load entire video files into memory (file_data = f.read()), which can consume large amounts of RAM and may fail for very large uploads. You may prefer chunked or resumable uploads and passing credentials via secure headers.
  4. Minor inconsistency: the skill declares curl as a required binary but never uses it. This is harmless but unnecessary.
  5. Trust & provenance: a homepage is listed, but the author is not a known official Google source. The full scripts ship in the skill bundle with no obfuscated code, so review them if you need to be extra cautious.
If you plan to use it in production, consider auditing or patching the upload logic (streaming/chunking, keeping keys out of logs and URLs) and limiting the API key's permissions and quota.
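
The two implementation concerns raised above (key in the URL, whole file in memory) could be addressed along these lines. This is a hedged sketch rather than a patch for the shipped scripts: `build_list_request` and `iter_chunks` are hypothetical helpers, and it assumes the API accepts the key in the `x-goog-api-key` header. Reading in chunks only bounds memory use; wiring the chunks into a resumable upload is left out.

```python
import urllib.request

API_BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_list_request(api_key):
    """Build a GET request for the files list with the key in a header, not the URL."""
    req = urllib.request.Request(f"{API_BASE}/files", method="GET")
    # Keeping the key out of the query string keeps it out of logged URLs.
    req.add_header("x-goog-api-key", api_key)
    return req


def iter_chunks(fileobj, chunk_size=8 * 1024 * 1024):
    """Yield successive chunks so the whole video never sits in memory at once."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

The request object is only constructed here, not sent, so the header handling can be inspected before any credentials leave the machine.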
Capability Analysis
Type: OpenClaw Skill
Name: gemini-video-analyzer
Version: 1.0.0
The `scripts/manage_files.py` script exhibits a potential URL injection vulnerability. The `delete_file` function directly interpolates `sys.argv[2]` (intended to be a Google file resource `name`) into a URL path without explicit sanitization. This could allow a malicious or prompt-injected agent to craft a `name` parameter (e.g., `files/12345?param=value` or `files/../other_resource`) to potentially alter the API request or target unintended endpoints, although the Google API itself might reject malformed resource names. No clear evidence of intentional malicious behavior like unauthorized data exfiltration or backdoors was found; the primary `analyze.py` script appears benign and interacts solely with the legitimate Google Gemini API.
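
One way to close the injection path described above is to validate the resource name before it reaches the URL. The sketch below is illustrative and not part of the skill: `validate_file_name` is a hypothetical helper, and the pattern assumes Files API names are `files/` followed by an ID of up to 40 lowercase letters, digits, or dashes that neither starts nor ends with a dash.

```python
import re

# Assumed name shape: "files/" + 1 to 40 chars of [a-z0-9-], no leading/trailing dash.
_FILE_NAME_RE = re.compile(r"files/[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?")


def validate_file_name(name):
    """Return name unchanged if it looks like a file resource name, else raise."""
    if not _FILE_NAME_RE.fullmatch(name):
        raise ValueError(f"refusing suspicious file name: {name!r}")
    return name
```

Because `fullmatch` must consume the entire string, inputs carrying `?`, `/..`, or any other extra characters are rejected outright instead of being escaped.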
Capability Assessment
Purpose & Capability
The name and description say: upload video and analyze via Google Gemini. The included scripts call generativelanguage.googleapis.com, use the GOOGLE_AI_API_KEY, and perform upload/analysis/cleanup, so purpose and behavior are coherent. One minor mismatch: the metadata and the requires field list python3 and curl, but the shipped scripts use only Python (urllib). curl is not used anywhere in SKILL.md or the code, so declaring it as required is unnecessary.
Instruction Scope
SKILL.md and the scripts instruct only to read the user-supplied video file and the declared GOOGLE_AI_API_KEY, upload to Google's Files API, poll for processing, and request analysis. There are no instructions to read unrelated host files, secrets, or to send data to third-party endpoints outside the stated Google API domain. The skill will transmit whole video files to Google's servers (expected for this purpose) and may leave them for up to 48 hours per the docs.
Install Mechanism
This is instruction-only with bundled Python scripts and no install spec — nothing is downloaded from arbitrary URLs and no packages are installed automatically. Risk from install mechanisms is low.
Credentials
Only the GOOGLE_AI_API_KEY is required (declared as the primary credential), which is appropriate for accessing Google Generative Language Files API. No unrelated credentials or secrets are requested.
Persistence & Privilege
The skill does not request always:true, does not modify other skills or system-wide configs, and is user-invocable. It runs only when invoked and uses the provided API key for network calls — typical and proportionate.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install gemini-video-analyzer
  3. After installation, invoke the skill by name or use /gemini-video-analyzer
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of gemini-video-analyzer.
- Native video analysis using Google Gemini API with support for full scene description, text/UI extraction, object/action identification, and question answering.
- Supports multiple video formats (MP4, AVI, MOV, etc.) up to 2GB per file.
- Processes videos at 1 FPS with motion, audio, and visual understanding; no manual frame extraction needed.
- Includes command-line scripts for analysis, file management, and prompt-based queries.
- Requires a Google AI API key; configurable via environment variable.
- Suitable for summarizing, extracting information, comparing videos, and analyzing tutorials or walkthroughs.
Metadata
Slug gemini-video-analyzer
Version 1.0.0
License
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Gemini Video Analyzer?

Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe... It is an AI Agent Skill for Claude Code / OpenClaw, with 392 downloads so far.

How do I install Gemini Video Analyzer?

Run "/install gemini-video-analyzer" in the OpenClaw or Claude Code chat to install it in one step. Before the first run, set GOOGLE_AI_API_KEY as described under Configuration.

Is Gemini Video Analyzer free?

Yes, Gemini Video Analyzer is free and open-source: you can download, install and use it at no cost. Gemini API requests made with your own key may still incur costs.

Which platforms does Gemini Video Analyzer support?

Gemini Video Analyzer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Gemini Video Analyzer?

It is built and maintained by aiwithabidi (@aiwithabidi); the current version is v1.0.0.
