← Back to Skills Marketplace
tiktokdad

OpenClaw VLN Planner

by TIKTOKDAD · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
131
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install openclaw-vln-planner
Description
Plan the next high-level navigation step for a robot from a user navigation instruction, one current image, and a sequence of historical images. Use when the...
Usage Guidance
This skill appears to do what it says: it will read local camera frame files, base64-encode them, and POST them (with the model API key from a YAML config) to whatever OpenAI-compatible gateway you configure. Before installing: (1) confirm you trust the gateway endpoint and operator because camera images and any scene data will be transmitted; (2) store the API key securely (the example uses a config file rather than env vars) and update the manifest if you need policy/audit visibility; (3) keep executor in dry_run while testing and review/replace the placeholder execute_* functions so the planner cannot command hardware until you've integrated a vetted execution bridge; (4) if you require stricter telemetry controls, inspect/modify image_to_data_url and build_messages to avoid sending raw images or to anonymize them. The small manifest omission (no declared required config path/credential) is not malicious but worth correcting for clarity and safety.
Capability Analysis
Type: OpenClaw Skill Name: openclaw-vln-planner Version: 1.0.0 The skill bundle is a legitimate vision-language navigation (VLN) planner designed to convert camera observations and user instructions into robot movement commands. The Python bridge (`scripts/vln_bridge.py`) implements standard multimodal LLM integration using the `requests` library and includes robust validation logic, such as movement distance/angle bounds and safety fallbacks. No evidence of data exfiltration, malicious execution, or harmful prompt injection was found; the file access and network activity are strictly aligned with the stated purpose of processing image frames and communicating with an AI gateway.
Capability Assessment
Purpose & Capability
Name/description, SKILL.md, config, and vln_bridge.py all align: the planner builds a prompt from historical+current images and an instruction, queries a multimodal model, parses JSON, validates bounds, and forwards a mid-level action. Network access to a model and reading image files are expected for this purpose.
Instruction Scope
Runtime instructions and the Python bridge explicitly read image files, load a YAML config containing model base_url/api_key/model_id, and send base64-encoded images to the configured OpenAI-compatible gateway. This behavior is necessary for a multimodal planner but means camera frames (potentially sensitive) are transmitted to an external service. SKILL.md and code limit outputs to pure JSON and define safety fallbacks.
Install Mechanism
This is an instruction-only skill with a small example Python bridge; there is no install spec, no external downloads, and only a minimal requirements.txt (requests, PyYAML). No extraction from arbitrary URLs or package installs are present.
Credentials
The package does not declare required env vars or primary credentials, but the bridge requires a model base_url and api_key in a YAML config file (config/vln-config.yaml). Expect to provide credentials to the model gateway; that is proportional to the task but the manifest omission of a required config/credential declaration is a small inconsistency to be aware of.
Persistence & Privilege
The skill does not request persistent/system privileges, does not set always:true, and has no install actions that modify other skills or system-wide settings. The bridge runs as a standalone script and prints dry-run execution by default.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install openclaw-vln-planner
  3. After installation, invoke the skill by name or use /openclaw-vln-planner
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of OpenClaw VLN Planner. - Introduces a high-level vision-language navigation planner for robots based on visual observations and natural-language user instructions. - Outputs a single, validated mid-level navigation action as pure JSON (MOVE_FORWARD, TURN_LEFT, TURN_RIGHT, STOP) for each input. - Integrates with any OpenAI-compatible multimodal gateway for action prediction using current and historical frames. - Includes strict safety fallback rules—defaults to STOP on uncertainty, parse failure, or safety concerns. - Provides clear input requirements, output contract, and runtime configuration via external YAML. - Bundles schema reference, config template, and a bridge script for execution.
Metadata
Slug openclaw-vln-planner
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is OpenClaw VLN Planner?

Plan the next high-level navigation step for a robot from a user navigation instruction, one current image, and a sequence of historical images. Use when the... It is an AI Agent Skill for Claude Code / OpenClaw, with 131 downloads so far.

How do I install OpenClaw VLN Planner?

Run "/install openclaw-vln-planner" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is OpenClaw VLN Planner free?

Yes, OpenClaw VLN Planner is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does OpenClaw VLN Planner support?

OpenClaw VLN Planner is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created OpenClaw VLN Planner?

It is built and maintained by TIKTOKDAD (@tiktokdad); the current version is v1.0.0.

💬 Comments