Description

Use when generating BAML code for type-safe LLM extraction, classification, RAG, or agent workflows - creates complete .baml files with types, functions, clients, tests, and framework integrations from natural language requirements. Queries official BoundaryML repositories via MCP for real-time patterns. Supports multimodal inputs (images, audio), Python/TypeScript/Ruby/Go, 10+ frameworks, 50-70% token optimization, 95%+ compilation success.

README (SKILL.md)

BAML Code Generation

Name: baml-codegen
Author: killerapp

Generate type-safe LLM extraction code. Use when creating structured outputs, classification, RAG, or agent workflows.

Golden Rules

NEVER edit baml_client/ - 100% generated, overwritten on every baml-cli generate; check baml_src/generators.baml for output_type (python, typescript, ruby, go)
ALWAYS edit baml_src/ - Source of truth for all BAML code
Run baml-cli generate after changes - Regenerates typed client code for target language
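
The first two rules can be checked mechanically before an agent writes a file. A minimal sketch, assuming the conventional layout with `baml_src/` and `baml_client/` directories; `is_baml_source` is a hypothetical helper, not part of `baml-cli`:

```python
from pathlib import PurePosixPath

GENERATED_DIR = "baml_client"  # overwritten on every `baml-cli generate`
SOURCE_DIR = "baml_src"        # source of truth for BAML code


def is_baml_source(path: str) -> bool:
    """Return True only for paths inside baml_src/ and outside baml_client/."""
    parts = PurePosixPath(path).parts
    if GENERATED_DIR in parts:
        return False  # never edit generated code
    return SOURCE_DIR in parts


print(is_baml_source("baml_src/invoice.baml"))   # True
print(is_baml_source("baml_client/types.py"))    # False
```

An edit step could call this check and refuse to write when it returns False for a BAML-related file.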

Philosophy (TL;DR)

Schema Is The Prompt - Define data models first, compiler injects types
Types Over Strings - Use enums/classes/unions, not string parsing
Fuzzy Parsing Is BAML's Job - BAML extracts valid JSON from messy LLM output
Transpiler Not Library - Write .baml → generate native code (Python/TypeScript/Ruby/Go), no runtime dependency
Test-Driven Prompting - Use VS Code playground or baml-cli test to iterate
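
To make "fuzzy parsing" concrete, here is a deliberately simple stand-in written with the Python standard library. It only illustrates the idea of pulling a JSON object out of chatty model output; it is not BAML's parser, and `extract_first_json_object` is a hypothetical name:

```python
import json
from typing import Any


def extract_first_json_object(raw: str) -> Any:
    """Return the first decodable JSON object embedded in free-form text."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            value, _end = decoder.raw_decode(raw, start)
            return value
        except json.JSONDecodeError:
            start = raw.find("{", start + 1)
    raise ValueError("no JSON object found")


messy = 'Sure! Here is the invoice: {"amt": 42.5} Let me know if you need more.'
print(extract_first_json_object(messy))  # {'amt': 42.5}
```

With BAML this step is handled by the generated client, so application code receives a typed object instead of a string to parse.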

Workflow

Analyze → Pattern Match (MCP) → Validate → Generate → Test → Deliver
         ↓ [IF ERRORS] Error Recovery (MCP) → Retry

BAML Syntax

Element	Example
Class	`class Invoice { total float @description("Amount") @assert(this > 0) @alias("amt") }`
Enum	`enum Category { Tech @alias("technology") @description("Tech sector"), Finance, Other }`
Function	`function Extract(text: string, img: image?) -> Invoice { client GPT5 prompt #"{{ text }} {{ img }} {{ ctx.output_format }}"# }`
Client	`client<llm> GPT5 { provider openai options { model gpt-5 } retry_policy Exponential }`
Fallback	`client<llm> Resilient { provider fallback options { strategy [FastModel, SlowModel] } }`

Types

Primitives: string, int, float, bool | Multimodal: image, audio
Containers: Type[] (array), Type? (optional), map<string, Type> (key-value)
Composite: Type1 | Type2 (union), nested classes
Annotations: @description("..."), @assert(condition), @alias("json_name"), @check(name, condition)
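
For readers coming from Python, the type forms above map roughly onto standard typing constructs. The sketch below is a hand-written analogue using only the standard library, not the code that `baml-cli generate` emits; `LineItem` and all field names other than `total` are hypothetical:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Union


class Category(Enum):  # enum Category { Tech, Finance, Other }
    TECH = "Tech"
    FINANCE = "Finance"
    OTHER = "Other"


@dataclass
class LineItem:  # nested class
    name: str
    price: float


@dataclass
class Invoice:
    total: float                    # float
    paid: bool                      # bool
    category: Category              # enum
    items: list[LineItem]           # LineItem[]
    reference: Union[int, str]      # int | string
    note: Optional[str] = None      # string?
    tags: dict[str, str] = field(default_factory=dict)  # map<string, string>

    def __post_init__(self) -> None:
        # Plays the role of an @assert on `total`.
        if self.total <= 0:
            raise ValueError("total must be > 0")


invoice = Invoice(
    total=42.5,
    paid=False,
    category=Category.TECH,
    items=[LineItem(name="GPU hour", price=42.5)],
    reference="INV-001",
)
print(invoice.category.value, invoice.note)  # Tech None
```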

Providers

openai, anthropic, gemini, vertex, bedrock, ollama + any OpenAI-compatible via openai-generic

Pattern Categories

Pattern	Use Case	Model	Framework Markers
Extraction	Unstructured → structured	GPT-5	fastapi, next.js
Classification	Categorization	GPT-5-mini	any
RAG	Answers with citations	GPT-5	langgraph
Agents	Multi-step reasoning	GPT-5	langgraph
Vision	Image/audio data extraction	GPT-5-Vision	multimodal

Resilience

retry_policy: retry_policy Exp { max_retries 3 strategy { type exponential_backoff } }
fallback client: Chain models [FastCheap, SlowReliable] for cost/reliability tradeoff
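
The behaviour these two settings describe can be sketched in plain Python: retry each client with exponentially growing delays, then move to the next client in the chain. This is an illustration of the pattern, not BAML's runtime; the starting delay, multiplier, and cap are assumed values, and both function names are hypothetical:

```python
import time
from typing import Callable, Optional, Sequence


def backoff_delays_ms(max_retries: int, delay_ms: int = 200,
                      multiplier: float = 1.5,
                      max_delay_ms: int = 10_000) -> list[int]:
    """Delay before each retry, growing geometrically up to a cap."""
    delays = []
    delay = float(delay_ms)
    for _ in range(max_retries):
        delays.append(int(min(delay, max_delay_ms)))
        delay *= multiplier
    return delays


def call_with_fallback(clients: Sequence[Callable[[str], str]], prompt: str,
                       max_retries: int = 3,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Try each client in order: one attempt plus max_retries retries each."""
    last_error: Optional[Exception] = None
    for client in clients:
        for delay_ms in [0] + backoff_delays_ms(max_retries):
            if delay_ms:
                sleep(delay_ms / 1000)
            try:
                return client(prompt)
            except Exception as exc:  # a real client would narrow this
                last_error = exc
    raise RuntimeError("all clients failed") from last_error


print(backoff_delays_ms(3))  # [200, 300, 450]
```

Ordering the chain as [FastCheap, SlowReliable] means the expensive model is only paid for when the cheap one has exhausted its retries.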

MCP Indicators

Found patterns from baml-examples | Validated against BoundaryML/baml | Fixed errors using docs | MCP unavailable, using fallback

Output Artifacts

BAML Code - Complete .baml files (types, functions, clients, retry_policy)
Tests - pytest/Jest with 100% function coverage
Integration - Framework-specific client code (LangGraph nodes, FastAPI endpoints, Next.js API routes)
Metadata - Pattern used, token count, cost estimate

References

providers.md - OpenAI, Anthropic, Google, Ollama, Azure, Bedrock, openai-generic
types-and-schemas.md - Full type system, classes, enums, unions, map, image, audio
validation.md - @assert, @check, @alias, block-level @@assert
patterns.md - Pattern library with code examples
philosophy.md - BAML principles, golden rules
mcp-interface.md - Query workflow, caching
languages-python.md - Python/Pydantic, async
languages-typescript.md - TypeScript, React/Next.js
frameworks-langgraph.md - LangGraph integration

Usage Guidance

This skill appears to be a detailed BAML code-generation recipe, but there are important red flags:

(1) The SKILL.md assumes you can run `baml-cli` and that MCP/LLM provider access exists, yet the skill metadata declares no required binaries or credentials.
(2) The source is unknown and there is no homepage; you cannot verify origin or upstream code.

Before installing or enabling:

- Confirm you trust the publisher or obtain the upstream repository/homepage.
- Ensure `baml-cli` and any language toolchains are installed from official sources; do not run arbitrary install links.
- Provide LLM provider API keys only to parts of your system you control; do not hand credentials to an unknown remote.
- Run this skill in a sandboxed project or disposable environment first (so generated code and any post-generation hooks cannot affect unrelated files).
- If you expect the skill to query MCP or provider endpoints automatically, require that it declare those endpoints and explicit env vars in metadata; consider asking the author to add required env var declarations and an install spec.

If you want, I can list exactly what env vars and binaries would be reasonable to require for this skill (e.g., BAML_CLI, OPENAI_API_KEY, ANTHROPIC_API_KEY, MCP_ENDPOINT) and suggest a minimal secure run checklist.
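
A preflight check along these lines can make the undeclared dependencies explicit before the skill runs. This is a sketch under stated assumptions: the binary and variable names passed in are the illustrative ones from the guidance above, not declarations from the skill's metadata, and `preflight` is a hypothetical helper:

```python
import os
import shutil
from typing import Callable, Mapping, Optional, Sequence


def preflight(required_bins: Sequence[str], required_env: Sequence[str],
              env: Optional[Mapping[str, str]] = None,
              which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return a list of human-readable problems; empty means ready to run."""
    env = os.environ if env is None else env
    problems = []
    for name in required_bins:
        if which(name) is None:  # not found on PATH
            problems.append(f"missing binary: {name}")
    for name in required_env:
        if not env.get(name):  # unset or empty
            problems.append(f"missing env var: {name}")
    return problems


# Names below are examples only; adjust to the providers actually used.
for problem in preflight(["baml-cli"], ["OPENAI_API_KEY"]):
    print(problem)
```

Passing `env` and `which` as parameters keeps the check testable without touching the real environment.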

Capability Analysis

Type: OpenClaw Skill
Name: baml-codegen
Version: 2.0.0

The skill is designed to fetch BAML code patterns from external GitHub repositories (BoundaryML/baml-examples, BoundaryML/baml) using MCP tools like `mcp__baml_Examples__fetch_generic_url_content` as detailed in `references/mcp-interface.md`. These fetched patterns are then used by `baml-cli generate` to produce executable client code in various languages (Python, TypeScript, Go, Ruby) on the user's system.

This generation process can trigger `on_generate` hooks (e.g., `black . && isort .`, `gofmt -w . && go mod tidy`, `prettier --write .`) which execute arbitrary shell commands locally. While the skill explicitly states it queries 'official BoundaryML repositories', this mechanism introduces a supply chain risk and a potential prompt-injection vulnerability where a malicious prompt could redirect the agent to fetch and generate code from an untrusted source, leading to local code execution.
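
One way to reduce the hook risk is to list every `on_generate` command for review before running `baml-cli generate`. The sketch below assumes hooks appear one per line as `on_generate "<command>"` inside a generator block; that layout and the sample text are assumptions for illustration, and `find_on_generate_hooks` is a hypothetical helper:

```python
import re

# Matches uncommented lines of the assumed form: on_generate "<command>"
ON_GENERATE = re.compile(r'^[ \t]*on_generate[ \t]+"([^"]*)"', re.MULTILINE)


def find_on_generate_hooks(baml_source: str) -> list[str]:
    """Return the shell commands declared as on_generate hooks."""
    return ON_GENERATE.findall(baml_source)


sample = '''
generator target {
  output_type "python/pydantic"
  output_dir "../"
  // on_generate "rm -rf ./tmp"
  on_generate "black . && isort ."
}
'''
print(find_on_generate_hooks(sample))  # ['black . && isort .']
```

The commented-out line is skipped because the pattern requires `on_generate` to be the first token on the line.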

Capability Assessment

⚠ Purpose & Capability

The skill claims to query MCP servers and to require running `baml-cli generate` (and to integrate with LLM providers). Yet the registry metadata declares no required binaries, no required environment variables, and no install specification. That mismatch is unexpected: a codegen workflow that invokes a CLI and cloud LLM providers would normally declare those dependencies and credentials.

ℹ Instruction Scope

SKILL.md is detailed and constrained to BAML generation tasks (edit baml_src/, generate baml_client/, run tests, integrate with frameworks). It instructs the agent to run `baml-cli generate`, manage project files, and use provider clients. It does not explicitly instruct exfiltration or access unrelated system paths, but it assumes ability to run CLIs, access networked MCP servers, and use LLM provider credentials that are not declared.

ℹ Install Mechanism

There is no install spec (instruction-only), which reduces risk from arbitrary downloads. However, the instructions assume the existence of external tooling (`baml-cli`, language runtimes, package managers) without declaring them in metadata or providing safe install sources.

⚠ Credentials

The skill references many LLM providers (openai, anthropic, gemini, bedrock, ollama, openai-generic) and MCP servers in prose, but requires.env is empty and no primary credential is declared. Declaring zero credentials while instructing use of cloud LLM providers and MCP queries is inconsistent: the skill will need API keys and network access in practice.

✓ Persistence & Privilege

Flags show normal defaults (always:false, agent invocation allowed). The skill does not request permanent presence or system-wide config changes in the metadata or SKILL.md. It instructs editing project files (baml_src/) and generating baml_client/ which is expected for a generator.

Version History

v2.0.0

From Foundry: Use when generating BAML code for type-safe LLM extraction, classification, RAG,

Metadata

Slug baml-codegen

Version 2.0.0

License —

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is baml-codegen?

baml-codegen is an AI Agent Skill for Claude Code / OpenClaw that generates BAML code for type-safe LLM extraction, classification, RAG, or agent workflows. It creates complete .baml files with types, functions, clients, tests, and framework integrations from natural language requirements, and queries official BoundaryML repositories via MCP for real-time patterns. It supports multimodal inputs (images, audio), Python/TypeScript/Ruby/Go, and 10+ frameworks, and claims 50-70% token optimization and 95%+ compilation success. It has 1424 downloads so far.

How do I install baml-codegen?

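Run "/install baml-codegen" in the OpenClaw or Claude Code chat to install it in one step. The skill has no install step of its own, but as noted under Capability Assessment, using it assumes `baml-cli` and LLM provider credentials are already available.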

Is baml-codegen free?

Yes, baml-codegen is completely free (open-source). You can download, install and use it at no cost.

Which platforms does baml-codegen support?

baml-codegen is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created baml-codegen?

It is built and maintained by Vaskin Kissoyan (@killerapp); the current version is v2.0.0.
