Arxiv Gamedevbench Evaluating Agentic Capabili
/install arxiv-gamedevbench-evaluating-agentic-capabili
Source
- Paper key: 44f3ad505bee7a5c25a60d2a3686cb7e
- Title: GameDevBench: Evaluating Agentic Capabilities Through Game Development
- Categories: cs.AI,cs.CL,cs.SE
Learned insight
Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed as agents must navigate large, dense codebases while manipulating intrinsically multimodal assets such as shaders, sprites, and animations within a visual game scene. We present GameDevBench, the first
Node.js implementation entry
node {baseDir}/scripts/run.js
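The entry command above can be wrapped in a small launcher. The following is a minimal sketch, assuming that `{baseDir}` stands for the directory where the skill is installed (the listing does not define it). The helper names `buildRunCommand` and `runSkill` are hypothetical and are not part of the skill. The sketch takes no position on what arguments or output `scripts/run.js` supports.

```javascript
// Hypothetical launcher for `node {baseDir}/scripts/run.js`.
// Assumption: baseDir is the skill's install directory.
const path = require('node:path');
const { spawnSync } = require('node:child_process');

// Build the executable and argument list without running anything.
function buildRunCommand(baseDir) {
  if (typeof baseDir !== 'string' || baseDir.length === 0) {
    throw new TypeError('baseDir must be a non-empty string');
  }
  const script = path.join(baseDir, 'scripts', 'run.js');
  // process.execPath is the Node.js binary running this script.
  return { command: process.execPath, args: [script] };
}

// Run the entry script synchronously and collect its exit status and output.
function runSkill(baseDir) {
  const { command, args } = buildRunCommand(baseDir);
  const result = spawnSync(command, args, { encoding: 'utf8' });
  if (result.error) throw result.error; // e.g. the binary could not be spawned
  return { status: result.status, stdout: result.stdout, stderr: result.stderr };
}
```

Separating command construction from execution lets the path be inspected or logged before anything runs; `runSkill('/path/to/skill')` would then execute the script and return its exit status and captured output.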
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install arxiv-gamedevbench-evaluating-agentic-capabili
- After installation, invoke the skill by name or use /arxiv-gamedevbench-evaluating-agentic-capabili
- Provide required inputs per the skill's parameter spec and get structured output
What is Arxiv Gamedevbench Evaluating Agentic Capabili?
This skill was learned from the arXiv paper "GameDevBench: Evaluating Agentic Capabilities Through Game Development". Use this skill to scaffold Node.js experiments based on the... It is an AI Agent Skill for Claude Code / OpenClaw, with 665 downloads so far.
How do I install Arxiv Gamedevbench Evaluating Agentic Capabili?
Run "/install arxiv-gamedevbench-evaluating-agentic-capabili" in the OpenClaw or Claude Code chat to install it in one step. No extra setup is required.
Is Arxiv Gamedevbench Evaluating Agentic Capabili free?
Yes, Arxiv Gamedevbench Evaluating Agentic Capabili is free and open source. You can download, install, and use it at no cost.
Which platforms does Arxiv Gamedevbench Evaluating Agentic Capabili support?
Arxiv Gamedevbench Evaluating Agentic Capabili is cross-platform and runs anywhere OpenClaw or Claude Code is available.
Who created Arxiv Gamedevbench Evaluating Agentic Capabili?
It is built and maintained by WANGJUNJIE (@wanng-ide); the current version is v1.0.0.