Gemini Video Analyzer

by aiwithabidi · GitHub ↗ · v1.0.0
cross-platform · ⚠ suspicious
Downloads: 728 · Stars: 0 · Active Installs: 1 · Versions: 1
Install in OpenClaw
/install a6-gemini-video-analyzer
Description
Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe...
README (SKILL.md)

Gemini Video Analyzer

Analyze videos natively using Google Gemini's multimodal API. No frame extraction needed — Gemini processes video at 1 FPS with full motion, audio, and visual understanding.

Quick Start

# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4

# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"

# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup

Supported Formats

MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
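
The format list and size cap above can be checked locally before anything is uploaded. The following is a minimal sketch, not the code shipped in `scripts/analyze.py`; the function name `check_video_file` is hypothetical, and it assumes "2GB" means 2 GiB.

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
ALLOWED_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".webm",
    ".flv", ".mpeg", ".mpg", ".wmv", ".3gp",
}
# Assumption: "2GB" is read as 2 GiB; lower this if the API counts decimal GB.
MAX_BYTES = 2 * 1024**3


def check_video_file(path, max_bytes=MAX_BYTES):
    """Return the resolved path if it looks like a supported video, else raise ValueError."""
    p = Path(path).expanduser().resolve()
    if not p.is_file():
        raise ValueError(f"not a regular file: {p}")
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {p.suffix or '(none)'}")
    size = p.stat().st_size
    if size > max_bytes:
        raise ValueError(f"file is {size} bytes, over the {max_bytes}-byte limit")
    return p
```

An extension check is only a heuristic, since a file can be renamed, but it rejects obvious non-video inputs cheaply.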

How It Works

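  1. The video is uploaded to Google's Files API, where it is stored temporarily and auto-deleted after 48h.
  2. Gemini samples the video at 1 frame/sec and can follow motion, transitions, and audio context.
  3. The model generates a response based on your prompt.

Because the model sees the frames in sequence rather than as isolated images, this approach generally handles temporal content better than manual frame extraction.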
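
As an illustration of the prompt step, the sketch below builds the request for a `generateContent` call that references an already uploaded file. It assumes the public Gemini REST endpoint and `file_data` part shape; the helper name `build_generate_request` is hypothetical, the upload and processing-wait steps are omitted, and the shipped scripts may be structured differently.

```python
import json

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(file_uri, mime_type, prompt, model="gemini-2.5-flash"):
    """Return (url, body_bytes) for a generateContent call on an uploaded file."""
    url = f"{API_ROOT}/models/{model}:generateContent"
    body = {
        "contents": [{
            "parts": [
                # Reference to the file returned by the Files API upload.
                {"file_data": {"mime_type": mime_type, "file_uri": file_uri}},
                # The user's question or instruction about the video.
                {"text": prompt},
            ]
        }]
    }
    return url, json.dumps(body).encode("utf-8")
```

The returned bytes would be sent as a JSON POST, with the API key supplied in a request header rather than logged or embedded in the body.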

Use Cases

| Task | Example Prompt |
| --- | --- |
| General description | (default; no prompt needed) |
| UI/text extraction | "What text and UI elements are visible?" |
| Tutorial summary | "Summarize the steps shown in this tutorial" |
| Bug report from video | "Describe what went wrong in this screen recording" |
| Meeting notes | "Summarize the key points discussed" |
| Content comparison | Upload 2 videos, ask for differences |

Configuration

Set GOOGLE_AI_API_KEY in your environment or .env file. Get a free key at aistudio.google.com.
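
A minimal sketch of how a script could resolve the key, assuming the environment takes priority and the `.env` file uses plain `KEY=value` lines. The helper `load_api_key` is hypothetical; whether the shipped scripts parse `.env` themselves is not confirmed here.

```python
import os
from pathlib import Path


def load_api_key(env=None, dotenv_path=".env", name="GOOGLE_AI_API_KEY"):
    """Return the key from the environment, falling back to a simple .env file."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if value:
        return value
    path = Path(dotenv_path)
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blanks, comments, and lines that are not KEY=value pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, val = line.partition("=")
            val = val.strip().strip("'\"")
            if key.strip() == name and val:
                return val
    raise RuntimeError(f"{name} is not set in the environment or {dotenv_path}")
```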

Default model: gemini-2.5-flash (fast, cheap, excellent vision). Override with --model gemini-2.5-pro for complex analysis.
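
The command-line shape implied by the Quick Start (a video path, an optional prompt, and `--model`) could be parsed roughly as follows. This is a sketch under assumptions: the default prompt text is hypothetical, and the real `analyze.py` may define its arguments differently.

```python
import argparse

DEFAULT_MODEL = "gemini-2.5-flash"
# Hypothetical default prompt; the shipped default wording is not shown in this README.
DEFAULT_PROMPT = "Describe this video in detail."


def parse_args(argv=None):
    """Parse a video path, an optional prompt, and an optional --model override."""
    parser = argparse.ArgumentParser(description="Analyze a video with Gemini")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("prompt", nargs="?", default=DEFAULT_PROMPT,
                        help="question or instruction about the video")
    parser.add_argument("--model", default=DEFAULT_MODEL,
                        help="Gemini model name, e.g. gemini-2.5-pro")
    return parser.parse_args(argv)
```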

API Reference

See references/gemini-files-api.md for file upload limits, processing details, and advanced options.

Usage Guidance
This skill appears to do what it says: it uploads videos to Google's Generative Language/Files API and asks Gemini to analyze them. Before installing or running:
  1. Be aware that videos will be uploaded off your machine to Google; avoid uploading sensitive footage unless you accept that.
  2. Use a restricted API key (limit it to the specific project/APIs, set quotas, and rotate or revoke it when done) to reduce the blast radius if the key is leaked.
  3. The declared requirements list curl, though the shipped scripts use python3; you may want to inspect the scripts yourself (they're included) to confirm behavior.
  4. Verify billing/quota implications for large or frequent analysis, and confirm the skill's publisher/homepage if provenance matters.
If you require stronger guarantees (no external uploads), do not use this skill.
Capability Analysis
Type: OpenClaw Skill
Name: a6-gemini-video-analyzer
Version: 1.0.0

The `scripts/analyze.py` script is vulnerable to local file disclosure/exfiltration. It reads the content of any file path provided as an argument and uploads it to Google's Gemini Files API. Although the script is intended for video analysis, it does not validate that the input file is a video, so a malicious prompt to the OpenClaw agent could exfiltrate arbitrary sensitive files (e.g., `/etc/passwd`, `~/.ssh/id_rsa`) to a third-party service (Google). The `SKILL.md` also lists `curl` as a required binary, which is not used by the provided Python scripts, indicating a potential for unstated capabilities.
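
One possible mitigation, sketched here under assumptions rather than taken from the skill, is to resolve the requested path and refuse anything outside a designated directory. The helper `resolve_inside` and the idea of an `allowed_root` are hypothetical and are not part of the shipped scripts.

```python
from pathlib import Path


def resolve_inside(path, allowed_root):
    """Resolve path (following symlinks and '..') and reject anything outside allowed_root."""
    root = Path(allowed_root).expanduser().resolve()
    candidate = Path(path).expanduser().resolve()
    # After resolution, the root must be one of the candidate's ancestors.
    if root not in candidate.parents:
        raise PermissionError(f"{candidate} is outside {root}")
    return candidate
```

Resolving before comparing matters: a check on the raw string would miss `..` segments and symlinks that point outside the allowed directory.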
Capability Assessment
Purpose & Capability
Name, description, and included scripts consistently implement video upload + Gemini model analysis against generativelanguage.googleapis.com. The single required credential (GOOGLE_AI_API_KEY) is the expected credential for this purpose. Minor mismatch: the declared required binaries include curl although the provided scripts use only python3/urllib; this is a small inconsistency but not evidence of malicious intent.
Instruction Scope
Runtime instructions and scripts explicitly upload user video files to Google Files API and then call the Gemini model — this is consistent with the stated purpose. Important privacy note: videos (and any text/UI/audio they contain) are transmitted to Google and may be processed server-side and retained per the API (SKILL.md claims ~48h retention). The instructions do not read unrelated files or other environment variables.
Install Mechanism
This is instruction-only plus two Python scripts with no install spec. Nothing is downloaded from third-party URLs during install; risk from installation is low. The scripts perform network calls at runtime (to Google endpoints) which is expected for this skill.
Credentials
Only GOOGLE_AI_API_KEY is requested and used, which is proportionate to contacting Google's Files/Generative Language APIs. Users should ensure the API key is scoped/restricted (project, API quotas, billing) because it could be used to bill requests or access other Google APIs depending on key permissions. The skill does not request unrelated secrets or config paths.
Persistence & Privilege
The skill is not force-included (always: false) and does not request persistent system-wide privileges or modify other skills. It runs as-invoked and uses only its own scripts and the provided API key.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install a6-gemini-video-analyzer
  3. After installation, invoke the skill by name or use /a6-gemini-video-analyzer
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Gemini Video Analyzer.
- Analyze video files natively using Google Gemini API with 1 FPS multimodal processing.
- Supports video scene description, content Q&A, screen text/UI extraction, speech transcription, and object/action identification.
- Accepts multiple popular video formats (up to 2GB each).
- CLI tools provided for video analysis, content-specific prompts, and file management.
- Requires only Python 3, curl, and a Google AI API key for setup.
Metadata
Slug a6-gemini-video-analyzer
Version 1.0.0
License
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is Gemini Video Analyzer?

Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe... It is an AI Agent Skill for Claude Code / OpenClaw, with 728 downloads so far.

How do I install Gemini Video Analyzer?

Run "/install a6-gemini-video-analyzer" in the OpenClaw or Claude Code chat to install it in one step. Before first use, set GOOGLE_AI_API_KEY in your environment or .env file.

Is Gemini Video Analyzer free?

Yes, Gemini Video Analyzer itself is free (open-source) to download, install, and use. Gemini API usage counts against the quota and billing of your own Google AI API key.

Which platforms does Gemini Video Analyzer support?

Gemini Video Analyzer is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Gemini Video Analyzer?

It is built and maintained by aiwithabidi (@aiwithabidi); the current version is v1.0.0.
