
Arxiv Gamedevbench Evaluating Agentic Capabili

Name: Arxiv Gamedevbench Evaluating Agentic Capabili
Author: wanng-ide

Author: WANGJUNJIE · GitHub · v1.0.0

cross-platform ✓ Security check passed

Total downloads: 665

Install in OpenClaw

/install arxiv-gamedevbench-evaluating-agentic-capabili

Description

Learned from arXiv paper GameDevBench: Evaluating Agentic Capabilities Through Game Development. Use this skill to scaffold Node.js experiments based on the...

Usage instructions (SKILL.md)

arxiv-gamedevbench-evaluating-agentic-capabili

Source

Paper key: 44f3ad505bee7a5c25a60d2a3686cb7e
Title: GameDevBench: Evaluating Agentic Capabilities Through Game Development
Categories: cs.AI,cs.CL,cs.SE

Learned insight

Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed as agents must navigate large, dense codebases while manipulating intrinsically multimodal assets such as shaders, sprites, and animations within a visual game scene. We present GameDevBench, the first

Node.js implementation entry

node {baseDir}/scripts/run.js

Safe usage recommendations

This skill appears coherent and low-risk: it’s an auto-generated Node.js scaffold that only prints a paper summary and a TODO. Before running, inspect the bundled files (already included) to confirm there are no added network calls or secret reads. Execute in a sandbox or isolated environment if you want additional caution. If you plan to extend the scaffold, review package.json before adding dependencies and avoid running it with elevated privileges.
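The inspection step above can be partly automated. The following is a minimal sketch, assuming a simple line-by-line pattern match is enough for a first pass; `scanSource` and the pattern list are hypothetical helpers written for illustration, not part of the skill bundle, and they do not replace reading the files.

```javascript
// Hypothetical helper: flags lines in a source string that match patterns
// commonly associated with network access or secret reads.
// The pattern list is an illustrative assumption, not an exhaustive audit.
const RISK_PATTERNS = [
  { label: "environment read", regex: /process\.env/ },
  { label: "network module", regex: /require\(\s*['"](?:node:)?(?:https?|net|dgram|dns)['"]\s*\)/ },
  { label: "fetch call", regex: /\bfetch\s*\(/ },
  { label: "child process", regex: /require\(\s*['"](?:node:)?child_process['"]\s*\)/ },
];

// Returns one finding per (line, pattern) match, with 1-based line numbers.
function scanSource(source) {
  const findings = [];
  source.split(/\r?\n/).forEach((line, index) => {
    for (const { label, regex } of RISK_PATTERNS) {
      if (regex.test(line)) {
        findings.push({ line: index + 1, label });
      }
    }
  });
  return findings;
}

// Usage with inline sample strings (not the skill's actual files).
const benignSample = 'console.log("paper summary");\n// TODO: implement experiment';
const riskySample = 'const https = require("https");\nconst key = process.env.API_KEY;';
console.log(scanSource(benignSample));
console.log(scanSource(riskySample));
```

An empty result only means none of the listed patterns matched; dynamic imports or obfuscated code would not be caught by a scan like this.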

Functional analysis

Type: OpenClaw Skill
Name: arxiv-gamedevbench-evaluating-agentic-capabili
Version: 1.0.0

The skill bundle is a simple Node.js scaffold that prints information about an arXiv paper. All files, including `SKILL.md`, `index.js`, and `scripts/run.js`, contain only benign code and documentation. There are no indications of data exfiltration, malicious execution, persistence mechanisms, prompt injection attempts against the agent, or obfuscation. The `SKILL.md` explicitly requires the `node` binary, which is appropriate for a Node.js skill, and the `scripts/run.js` file clearly states it's a 'runnable scaffold' with a 'TODO' for further implementation.

Capability assessment

✓ Purpose & Capability

Name/description claim Node.js scaffolding for the GameDevBench paper and the skill only requires the node binary; included files (index.js, scripts/run.js, package.json, paper.json) are consistent with that purpose.

✓ Instruction Scope

SKILL.md instructs running scripts/run.js. The included run.js only logs metadata and a truncated abstract; there are no instructions to read unrelated files, access environment variables, or send data to external endpoints.
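For orientation, a scaffold with the behavior described above (log metadata and a truncated abstract, then a TODO) might look roughly like the sketch below. This is not the bundled `scripts/run.js`; the record's field names, the `truncate` and `formatSummary` helpers, and the 80-character default are all assumptions made for illustration, and the abstract is shortened.

```javascript
// Hypothetical paper record; the field names are assumptions, and the
// abstract is shortened here for illustration.
const paper = {
  key: "44f3ad505bee7a5c25a60d2a3686cb7e",
  title: "GameDevBench: Evaluating Agentic Capabilities Through Game Development",
  categories: ["cs.AI", "cs.CL", "cs.SE"],
  abstract:
    "Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind.",
};

// Keep the first maxLength characters and append "..." when text is shortened.
function truncate(text, maxLength) {
  if (text.length <= maxLength) return text;
  return text.slice(0, maxLength) + "...";
}

// Build the multi-line summary: metadata, truncated abstract, and a TODO marker.
function formatSummary(record, maxLength = 80) {
  return [
    `Paper key: ${record.key}`,
    `Title: ${record.title}`,
    `Categories: ${record.categories.join(",")}`,
    `Abstract: ${truncate(record.abstract, maxLength)}`,
    "TODO: implement the experiment",
  ].join("\n");
}

// Entry point: prints to stdout only; no file, network, or environment access.
function main() {
  console.log(formatSummary(paper));
}

main();
```

Keeping the formatting in a pure function separate from `main()` makes it straightforward to verify that the script does nothing beyond printing.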

✓ Install Mechanism

No install spec provided (instruction-only). There are local code files bundled, but no downloads, package installs, or extract steps. package.json has no dependencies, so nothing is pulled at runtime beyond node.
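The "no dependencies" claim is easy to re-check after any update to the bundle. A minimal sketch, assuming the manifest has already been parsed into an object; `listDeclaredDependencies`, `listInstallHooks`, and the sample manifest are hypothetical and written only for illustration.

```javascript
// Dependency fields of package.json that cause packages to be fetched or expected.
const DEPENDENCY_FIELDS = [
  "dependencies",
  "devDependencies",
  "optionalDependencies",
  "peerDependencies",
];

// Lifecycle scripts that npm runs during install (a partial list).
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

// Returns the names of all packages declared in the dependency fields above.
function listDeclaredDependencies(pkg) {
  return DEPENDENCY_FIELDS.flatMap((field) => Object.keys(pkg[field] ?? {}));
}

// Returns the names of install-time lifecycle scripts defined in the manifest.
function listInstallHooks(pkg) {
  const scripts = pkg.scripts ?? {};
  return INSTALL_HOOKS.filter((name) => name in scripts);
}

// Hypothetical manifest for illustration; not the skill's actual package.json.
const samplePkg = {
  name: "example-skill",
  version: "1.0.0",
  scripts: { start: "node scripts/run.js" },
};
console.log(listDeclaredDependencies(samplePkg)); // []
console.log(listInstallHooks(samplePkg)); // []
```

In practice the object would come from `JSON.parse` on the contents of `package.json`; two empty arrays are consistent with the assessment above.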

✓ Credentials

The skill declares no required environment variables or credentials and the code does not access process.env; requested privileges are minimal and appropriate for a local Node.js scaffold.

✓ Persistence & Privilege

always is false and the skill does not persist configuration or modify other skills/system settings. It only exposes a main() that prints to stdout.

How to use

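Make sure OpenClaw is installed (locally or via Docker).
Enter the install command in the chat box: /install arxiv-gamedevbench-evaluating-agentic-capabili
Once installation completes, invoke the Skill by name or trigger it with /arxiv-gamedevbench-evaluating-agentic-capabili
Provide the required inputs according to the Skill's parameter description to get structured output.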

Version history

v1.0.0

- Initial release of arxiv-gamedevbench-evaluating-agentic-capabili skill.
- Implements scaffolding for Node.js experiments inspired by the "GameDevBench" paper.
- Focuses on evaluating agentic capabilities in game development, including code and multimodal asset manipulation.
- Requires Node.js runtime for operation.

Metadata

Slug arxiv-gamedevbench-evaluating-agentic-capabili

Version 1.0.0

License —

Total installs 0

Current installs 0

Number of versions 1

Frequently asked questions