Description

Official skill for recognizing handwritten text from images using ZhiPu GLM-OCR API. Supports various handwriting styles, languages, and mixed handwritten/pr...

README (SKILL.md)

GLM-OCR Handwriting Recognition Skill

Name: GLM-OCR-Handwriting
Author: jaredforreal

Recognize handwritten text from images and PDFs using the ZhiPu GLM-OCR layout parsing API.

When to Use

Extract text from handwritten notes, letters, or documents
Convert handwriting to editable text
Recognize mixed handwritten and printed content
Read handwritten formulas, labels, or annotations
User mentions "handwriting OCR", "recognize handwriting", "手写识别", "手写体OCR", "识别手写字"

Key Features

Multi-style support: Handles various handwriting styles including cursive and print
Multi-language: Supports Chinese, English, and mixed-language handwriting
Mixed content: Can recognize documents with both handwritten and printed text
Local file & URL: Supports both local files and remote URLs

Resource Links

| Resource | Link |
| --- | --- |
| Get API Key | Zhipu Open Platform API Keys |
| API Docs | Layout Parsing |

Prerequisites

API Key Setup (Required)

This script reads the key from the ZHIPU_API_KEY environment variable. Reusing the same key across Zhipu skills is optional.

Get Key: Visit Zhipu Open Platform API Keys to create or copy your key.

Setup options (choose one):

Global config (recommended): Set once in openclaw.json under env.vars so that all Zhipu skills share it:
```
{
  "env": {
    "vars": {
      "ZHIPU_API_KEY": "your-api-key"
    }
  }
}
```

Skill-level config: Set for this skill only in openclaw.json:
```
{
  "skills": {
    "entries": {
      "glmocr-handwriting": {
        "env": {
          "ZHIPU_API_KEY": "your-api-key"
        }
      }
    }
  }
}
```

Shell environment variable: Add to ~/.zshrc:
```
export ZHIPU_API_KEY="your-api-key"
```

💡 If you have already configured a key for another Zhipu skill (such as glmocr, glmv-caption, or glm-image-generation), they all share the same ZHIPU_API_KEY, so no further setup is needed.
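Before invoking the skill, it can help to confirm that the key is actually visible to the process. The following is a minimal sketch, not part of the skill: the helper name `resolve_api_key` is hypothetical, and it relies only on what is stated above, namely that the key is read from `ZHIPU_API_KEY`.

```python
import os


def resolve_api_key(env=None):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; the skill's own script may
    perform this check differently.
    """
    env = os.environ if env is None else env
    key = (env.get("ZHIPU_API_KEY") or "").strip()
    if not key:
        raise RuntimeError("ZHIPU_API_KEY is not set or is empty")
    return key


if __name__ == "__main__":
    try:
        resolve_api_key()
        print("ZHIPU_API_KEY is configured")
    except RuntimeError as exc:
        print(f"Configuration problem: {exc}")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.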

Security & Transparency

Environment variables used:
- ZHIPU_API_KEY (required)
- GLM_OCR_TIMEOUT (optional timeout in seconds)
Fixed endpoint: https://open.bigmodel.cn/api/paas/v4/layout_parsing
No custom API URL override: avoids accidental key exfiltration via redirected endpoints.
Raw upstream response is optional and not returned by default: use --include-raw only when needed for debugging.
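As an illustration of how an optional timeout variable with a fallback might be handled, here is a small sketch. It is not taken from the skill's script: the helper name `read_timeout` is hypothetical, and the 60-second default is a placeholder assumption, since the actual default is not stated here.

```python
import math
import os

# Fixed endpoint as documented above; the skill offers no way to override it.
ENDPOINT = "https://open.bigmodel.cn/api/paas/v4/layout_parsing"

# Assumption: the script's real default is not documented here; 60 is a placeholder.
ASSUMED_DEFAULT_TIMEOUT = 60.0


def read_timeout(env=None, default=ASSUMED_DEFAULT_TIMEOUT):
    """Parse GLM_OCR_TIMEOUT as positive, finite seconds, else return `default`."""
    env = os.environ if env is None else env
    raw = env.get("GLM_OCR_TIMEOUT", "")
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return default
    if not math.isfinite(value) or value <= 0:
        return default
    return value
```

Falling back to the default on unparsable or non-positive input keeps a typo in the variable from disabling the timeout altogether.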

⛔ MANDATORY RESTRICTIONS ⛔

ONLY use GLM-OCR API — Execute the script python scripts/glm_ocr_cli.py
NEVER parse handwriting yourself — Do NOT try to read handwritten text using built-in vision or any other method
NEVER offer alternatives — Do NOT suggest "I can try to read it" or similar
IF API fails — Display the error message and STOP immediately
NO fallback methods — Do NOT attempt handwriting recognition any other way

📋 Output Display Rules

After running the script, present the OCR result clearly and safely.

Show extracted handwritten text (text) in full
Summarization is allowed, but do not hide important extraction failures
If the result file is saved, tell the user the file path
Show raw upstream response only when explicitly requested or debugging (--include-raw)

How to Use

Recognize from URL

```
python scripts/glm_ocr_cli.py --file-url "https://example.com/handwriting.jpg"
```

Recognize from Local File

```
python scripts/glm_ocr_cli.py --file /path/to/notes.png
```

Save Result to File

```
python scripts/glm_ocr_cli.py --file notes.png --output result.json --pretty
```

Include Raw Upstream Response (Debug Only)

```
python scripts/glm_ocr_cli.py --file notes.png --output result.json --include-raw
```

CLI Reference

```
python {baseDir}/scripts/glm_ocr_cli.py (--file-url URL | --file PATH) [--output FILE] [--pretty] [--include-raw]
```

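| Parameter | Required | Description |
| --- | --- | --- |
| `--file-url` | Exactly one of `--file-url` / `--file` | URL of the image or PDF |
| `--file` | Exactly one of `--file-url` / `--file` | Local path to the image or PDF |
| `--output`, `-o` | No | Save result JSON to file |
| `--pretty` | No | Pretty-print JSON output |
| `--include-raw` | No | Include raw upstream API response in `result` field (debug only) |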
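When the CLI is driven from another Python program, the documented flags can be assembled into an argument list. This is a sketch under stated assumptions: `build_command` is a hypothetical helper, the default script path assumes the working directory is the skill's base directory, and telling URLs from local paths by their `http://` or `https://` prefix is a choice made for this example only.

```python
import sys


def build_command(source, output=None, pretty=False, include_raw=False,
                  script="scripts/glm_ocr_cli.py"):
    """Build the argument list for glm_ocr_cli.py from the documented flags."""
    # Assumption of this sketch: http(s) sources are remote URLs, all else is a path.
    is_url = source.startswith(("http://", "https://"))
    cmd = [sys.executable, script, "--file-url" if is_url else "--file", source]
    if output:
        cmd += ["--output", output]
    if pretty:
        cmd.append("--pretty")
    if include_raw:
        # Debug only: adds the raw upstream response to the `result` field.
        cmd.append("--include-raw")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Using a list rather than a shell string avoids quoting problems with paths that contain spaces.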

Response Format

```
{
  "ok": true,
  "text": "Recognized handwritten text in Markdown...",
  "layout_details": [...],
  "result": null,
  "error": null,
  "source": "/path/to/file",
  "source_type": "file",
  "raw_result_included": false
}
```

Key fields:

ok — whether recognition succeeded
text — extracted text in Markdown (use this for display)
layout_details — layout analysis details
error — error details on failure
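The key fields above suggest a simple way to consume a saved result. The sketch below is illustrative only: `extract_text` is a hypothetical helper, and because the exact shape of the `error` value is not specified here, it is passed through unchanged.

```python
import json


def extract_text(payload):
    """Return the recognized text from a result, or raise with the reported error.

    `payload` may be a JSON string or an already-parsed dictionary.
    """
    result = json.loads(payload) if isinstance(payload, str) else payload
    if not result.get("ok"):
        # Surface the failure rather than hiding it, per the display rules.
        raise RuntimeError(f"OCR failed: {result.get('error')}")
    return result.get("text") or ""
```

For a file written with `--output result.json`, the contents can be read with `open("result.json", encoding="utf-8").read()` and passed straight to `extract_text`.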

Error Handling

API key not configured:

```
ZHIPU_API_KEY not configured. Get your API key at: https://bigmodel.cn/usercenter/proj-mgmt/apikeys
```

→ Show exact error to user, guide them to configure

Authentication failed (401/403): API key invalid/expired → reconfigure

Rate limit (429): Quota exhausted → inform user to wait

File not found: Local file missing → check path
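The HTTP cases listed above can be written as a small lookup. This is a sketch, assuming the caller already has the HTTP status code in hand; how the script reports it inside the `error` field is not specified here, and `guidance_for_status` is a hypothetical name.

```python
def guidance_for_status(status_code):
    """Map an HTTP status code to the user guidance listed above."""
    if status_code in (401, 403):
        return "API key invalid or expired: reconfigure ZHIPU_API_KEY."
    if status_code == 429:
        return "Rate limit reached or quota exhausted: wait before retrying."
    # Any other failure: show the error as-is and stop, with no fallback method.
    return "Unrecognized failure: show the error message and stop."
```

The final branch deliberately offers no recovery path, which matches the rule that a failed API call ends the attempt.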

Usage Guidance

This skill appears to do what it says: it sends image files (or file URLs) to ZhiPu's official GLM‑OCR API and returns extracted handwritten text. Before installing, decide whether you trust sending image contents (possibly sensitive) to that external service and avoid reusing a high‑privilege API key across untrusted skills. Note the small metadata mismatch: GLM_OCR_TIMEOUT is documented as optional in the script but listed as required in the registry — you do not need to set it to use the skill. Also verify the skill source (homepage repo) if you need higher assurance, and be aware the CLI requires the 'requests' Python package to be installed.

Capability Analysis

Type: OpenClaw Skill
Name: glmocr-handwriting
Version: 1.0.4

The skill is a legitimate tool for handwriting recognition using the ZhiPu GLM-OCR API. The Python script (scripts/glm_ocr_cli.py) correctly implements API communication with a hardcoded official endpoint and includes proper error handling, while the instructions in SKILL.md are focused on ensuring the AI agent uses the tool reliably without attempting unauthorized fallback methods.

Capability Tags

requires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

Name/description, SKILL.md, and the included script all implement handwriting OCR against the official ZhiPu GLM‑OCR endpoint. The primary credential requested (ZHIPU_API_KEY) is appropriate. Minor metadata inconsistency: registry lists GLM_OCR_TIMEOUT as required, but the script treats it as optional with a default.

ℹ Instruction Scope

SKILL.md explicitly instructs the agent to call the provided CLI script and only the GLM‑OCR API. The script reads local files (base64-encodes them) or accepts remote URLs and POSTs them to the fixed official endpoint — so user image data will be transmitted to ZhiPu. There are no instructions to read unrelated system files or to exfiltrate environment variables beyond the declared API key and timeout.
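To make the data flow concrete, the base64 step mentioned above can be sketched with the standard library. This is illustrative only: `encode_file_base64` is a hypothetical helper, and the request body that the script actually sends to the endpoint is not reproduced here because its field names are not given in this document.

```python
import base64
from pathlib import Path


def encode_file_base64(path):
    """Read a local file and return its contents as a base64 ASCII string."""
    data = Path(path).read_bytes()
    return base64.b64encode(data).decode("ascii")
```

Whatever the exact payload shape, the full file contents leave the machine in this encoded form, which is why the choice of images to submit matters.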

✓ Install Mechanism

No install spec — instruction-only plus a small CLI script. The script depends on the widely used 'requests' package; dependencies are not installed automatically but the script prints a clear error and pip suggestion. No downloads from arbitrary URLs or archive extraction are present.

ℹ Credentials

Only ZHIPU_API_KEY (primary credential) is required to use the API; GLM_OCR_TIMEOUT is read optionally with a default. No unrelated credentials or config paths are requested. The only minor mismatch: registry metadata marks GLM_OCR_TIMEOUT as required even though the code treats it as optional.

✓ Persistence & Privilege

Skill is not forced-always, does not request elevated or persistent privileges, and does not modify other skills or global config. Autonomous invocation is allowed (platform default) but not combined with other concerning indicators.

Version History

v1.0.4

- No code or configuration changes in this release.
- Version bump to 1.0.4; documentation and skill usage remain the same.
- All features, API requirements, and usage instructions are unchanged.

v1.0.3

- Clarified official status of the skill and updated the name to "Official skill for recognizing handwritten text..."
- Enhanced API key configuration instructions, adding a global environment variable method.
- Added details to improve security and reduce accidental key leakage, including notes about the fixed API endpoint.
- Documented new CLI option `--include-raw` to include the raw API response for debugging.
- Revised output and error handling rules for greater transparency and improved user guidance.
- Updated resource and API documentation links.

v1.0.2

- Updated skill name/branding from "GLM-OCR Handwriting" to "GLM-V Handwriting".
- Clarified that API key is shared across all ZhiPu/智谱 skills.
- Strengthened mandatory output rules: always display the full extracted handwriting result without summarization.
- Simplified and clarified API usage instructions and CLI reference.
- Updated response format to reflect current output structure.
- Removed support and references to the deprecated `--include-raw` option and trimmed advanced usage details.

v1.0.1

- Clarified environment variable usage, noting that API keys may (but don't have to) be reused across Zhipu skills.
- Added "Security & Transparency" section detailing fixed API endpoints, prohibiting custom overrides, and describing the optional inclusion of raw responses for debugging.
- Updated mandatory output display rules to allow summarization, but prohibit hiding extraction failures; raw response now shown only if requested or debugging.
- Introduced new CLI flag `--include-raw` to optionally include the raw upstream API result in outputs.
- Updated response format documentation to reflect the new `raw_result_included` indicator and changes regarding the raw response.
- Improved language conciseness and clarified error handling instructions.

v1.0.0

- Initial release of handwriting OCR skill using ZhiPu GLM-OCR API
- Recognizes handwritten text from images and PDFs, supporting both local files and remote URLs
- Capable of extracting mixed handwritten and printed content in Chinese, English, and mixed languages
- Full extracted text is always shown to the user, as returned by the API
- Requires ZHIPU_API_KEY environment variable for authentication; setup instructions provided
- Includes clear error handling and strict usage rules for API errors and response display

Metadata

Slug glmocr-handwriting

Version 1.0.4

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 5

Frequently Asked Questions

What is GLM-OCR-Handwriting?

Official skill for recognizing handwritten text from images using ZhiPu GLM-OCR API. Supports various handwriting styles, languages, and mixed handwritten/pr... It is an AI Agent Skill for Claude Code / OpenClaw, with 431 downloads so far.

How do I install GLM-OCR-Handwriting?

Run "/install glmocr-handwriting" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is GLM-OCR-Handwriting free?

Yes, GLM-OCR-Handwriting is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does GLM-OCR-Handwriting support?

GLM-OCR-Handwriting is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created GLM-OCR-Handwriting?

It is built and maintained by Jared Wen (@jaredforreal); the current version is v1.0.4.
