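Description

Baidu OCR text recognition. Supports mixed Chinese and English text, formulas, and tables, with an accuracy of 95%+. Uses the Baidu AI Open Platform API.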

Usage Instructions (SKILL.md)

Baidu OCR

Name: Baidu Ocr
Author: nidhov01

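High-accuracy text recognition using the Baidu AI Open Platform.

Features

✅ Mixed Chinese and English recognition
✅ Accuracy of 95%+
✅ Formula recognition
✅ Table recognition
✅ Free quota of 500 calls per day

Quick Start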

python3 {baseDir}/baidu_ocr.py /path/to/image.jpg

Usage

python3 {baseDir}/baidu_ocr.py <image_path> [output_format]

Parameters:

<image_path>: a local image file (jpg, png, bmp, etc.)
[output_format]: optional; text (default) or json

Examples

# Basic recognition
python3 {baseDir}/baidu_ocr.py image.jpg

# JSON output
python3 {baseDir}/baidu_ocr.py image.jpg json

# Batch processing
for file in *.jpg; do
    python3 {baseDir}/baidu_ocr.py "$file"
done
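
The shell loop above can also be driven from Python when per-file exit codes are needed. This is a minimal sketch, not part of the skill: `build_commands` and `run_batch` are hypothetical helper names, and it assumes the script accepts the explicit `text` or `json` format argument described under Parameters.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(script: Path, image_dir: Path, output_format: str = "text") -> list[list[str]]:
    """Build one command line per *.jpg file in image_dir, in sorted order."""
    return [
        [sys.executable, str(script), str(image), output_format]
        for image in sorted(image_dir.glob("*.jpg"))
    ]


def run_batch(script: Path, image_dir: Path, output_format: str = "text") -> dict[str, int]:
    """Run the OCR script once per image and record each exit code by file name."""
    results = {}
    for cmd in build_commands(script, image_dir, output_format):
        # check=False: one failing image should not abort the whole batch.
        completed = subprocess.run(cmd, check=False)
        results[Path(cmd[2]).name] = completed.returncode
    return results
```

Calling `run_batch(Path("baidu_ocr.py"), Path("."), "json")` would process every `.jpg` in the current directory; a non-zero value in the returned dict marks a file worth retrying.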

API Configuration

Configure in ~/.openclaw-env:

export BAIDU_API_KEY="your_api_key"
export BAIDU_SECRET_KEY="your_secret_key"

Alternatively, configure in ~/.openclaw/openclaw.json:

{
  skills: {
    "baidu-ocr": {
      apiKey: "YOUR_API_KEY",
      secretKey: "YOUR_SECRET_KEY"
    }
  }
}
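
For illustration, a script could read the `export NAME="value"` lines from `~/.openclaw-env` itself rather than relying on the shell to source the file. The sketch below is an assumption about how that might be done; `parse_env_exports` is a hypothetical helper and does not describe what the shipped `baidu_ocr.py` does.

```python
import shlex


def parse_env_exports(text: str) -> dict[str, str]:
    """Parse shell-style lines such as `export NAME="value"` into a dict."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # shlex handles the quoting, so NAME="a b" yields the token NAME=a b.
        tokens = shlex.split(line, comments=True)
        if tokens and tokens[0] == "export":
            tokens = tokens[1:]
        for token in tokens:
            name, sep, value = token.partition("=")
            if sep and name.isidentifier():
                values[name] = value
    return values
```

Only simple assignments are handled here; variable expansion and command substitution are deliberately left out.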

Supported Image Formats

JPG/JPEG
PNG
BMP
WEBP
GIF
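
A caller may want to reject unsupported files before spending quota on an upload. The following is a small sketch that checks only the file extension against the list above; `is_supported_image` is a hypothetical name, and an extension check says nothing about the actual file contents.

```python
from pathlib import Path

# Extensions taken from the supported-format list above.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp", ".gif"}


def is_supported_image(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES
```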

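Recognition Types

Type	Description	API
General text	Mixed Chinese and English recognition	`general_basic`
High accuracy	Includes position information	`general`
Table	Table structure recognition	`table`
Formula	Mathematical formula recognition	`formula`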
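
The API column lists short names, which a client has to turn into request URLs. The sketch below shows one way to do that. The base path `https://aip.baidubce.com/rest/2.0/ocr/v1` is an assumption drawn from Baidu's public OCR REST API and is not stated in this document, so verify it against the Baidu console before relying on it; `endpoint_for` is a hypothetical helper.

```python
# Assumed base path; only the host aip.baidubce.com is confirmed by this page.
OCR_BASE_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1"

# API names and descriptions from the recognition-type table.
RECOGNITION_TYPES = {
    "general_basic": "Mixed Chinese and English recognition",
    "general": "Includes position information",
    "table": "Table structure recognition",
    "formula": "Mathematical formula recognition",
}


def endpoint_for(api_name: str) -> str:
    """Return the assumed endpoint URL for one of the listed API names."""
    if api_name not in RECOGNITION_TYPES:
        raise ValueError(f"Unknown recognition type: {api_name!r}")
    return f"{OCR_BASE_URL}/{api_name}"
```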

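Free Quota

General text recognition: 500 calls per day
High-accuracy version: 50 calls per day
Table recognition: 500 calls per month
Formula recognition: 500 calls per month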

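Error Handling

Error code	Description	Resolution
110	Access token invalid	Obtain a new token
111	Access token expired	Obtain a new token
216100	Authentication failed	Check the API key
216101	Authorization failed	Check the secret key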
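
The table above translates naturally into a lookup that a client can consult after each call. This sketch assumes the API returns a JSON object carrying an `error_code` field on failure; that response shape is an assumption not described on this page, and `explain_error` and `should_refresh_token` are hypothetical helpers.

```python
from typing import Optional

# Codes and resolutions copied from the error table above.
ERROR_ADVICE = {
    110: "Access token invalid: obtain a new token",
    111: "Access token expired: obtain a new token",
    216100: "Authentication failed: check the API key",
    216101: "Authorization failed: check the secret key",
}

# Token problems can be retried after fetching a fresh token.
TOKEN_ERRORS = {110, 111}


def explain_error(response: dict) -> Optional[str]:
    """Return advice for a failed response, or None if no error_code is present."""
    code = response.get("error_code")
    if code is None:
        return None
    return ERROR_ADVICE.get(code, f"Unrecognized error code {code}")


def should_refresh_token(response: dict) -> bool:
    """True when the error indicates the access token must be re-requested."""
    return response.get("error_code") in TOKEN_ERRORS
```

Separating the two checks lets a caller retry once on 110/111 and surface the other codes to the user as configuration problems.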

Related Documentation

Version: 1.0.0 | Updated: 2026-03-07

Security Recommendations

Do not run this skill as-is on sensitive images. The included baidu_ocr.py contains hard-coded Baidu API credentials (visible in the file and referenced in the activation guide), while SKILL.md tells you to set your own credentials. This mismatch means your images will likely be uploaded using the publisher's Baidu account (potential privacy, quota, and billing concerns).

Before installing or running:

1) Inspect baidu_ocr.py and remove the embedded API_KEY/SECRET_KEY, or replace them with code that reads BAIDU_API_KEY and BAIDU_SECRET_KEY from the environment/config.
2) Confirm the code prioritizes environment variables over any hard-coded defaults.
3) Optionally, run the script in an isolated environment or sandbox and monitor network calls to verify destination endpoints.
4) If you cannot or will not modify the code, consider rejecting the skill or asking the publisher why their credentials are embedded and whether they intend to collect images.

If you previously used the embedded key unintentionally, consider contacting Baidu or rotating/revoking keys associated with your own account, and review any sensitive data that may have been sent.

Additional information that would change this assessment: if the maintainer publishes an updated version that removes the hard-coded credentials and clearly documents that only the user's API keys are used, the concern would be resolved.
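
The first recommendation, reading BAIDU_API_KEY and BAIDU_SECRET_KEY from the environment, could look roughly like the sketch below. It is an illustration only: `load_credentials` and `MissingCredentialsError` are hypothetical names, and the sketch deliberately has no hard-coded fallback, so a missing variable fails loudly instead of silently using someone else's account.

```python
import os
from typing import Mapping, Optional


class MissingCredentialsError(RuntimeError):
    """Raised when a required Baidu credential is not configured."""


def load_credentials(environ: Optional[Mapping[str, str]] = None) -> tuple[str, str]:
    """Return (api_key, secret_key) from the environment, with no embedded defaults."""
    env = os.environ if environ is None else environ
    api_key = env.get("BAIDU_API_KEY", "").strip()
    secret_key = env.get("BAIDU_SECRET_KEY", "").strip()
    missing = [
        name
        for name, value in (("BAIDU_API_KEY", api_key), ("BAIDU_SECRET_KEY", secret_key))
        if not value
    ]
    if missing:
        raise MissingCredentialsError("Missing environment variables: " + ", ".join(missing))
    return api_key, secret_key
```

Passing a mapping explicitly, as in `load_credentials({"BAIDU_API_KEY": "...", "BAIDU_SECRET_KEY": "..."})`, keeps the function easy to test without touching the real environment.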

Functional Analysis

Type: OpenClaw Skill
Name: baidu-ocr
Version: 1.0.0

Files reviewed:

- `_meta.json`: metadata about the skill (owner, slug, version). `ownerId`: `kn71m67yc942x7tzfpfrar2eqn826w3p`; `slug`: `baidu-ocr`.
- `SKILL.md`: instructions for the AI agent. Description: Baidu OCR text recognition. Requirements: `python3`, `BAIDU_API_KEY`, `BAIDU_SECRET_KEY`. Usage: `python3 {baseDir}/baidu_ocr.py <image_path> [format]`. No suspicious instructions or prompt injections found; it only describes how to use the tool.
- `baidu_ocr.py`: Python script that performs OCR using the Baidu AI API (details below).
- `metadata.json`: lists dependencies (`requests`, `Pillow`) and required env vars (`BAIDU_API_KEY`, `BAIDU_SECRET_KEY`).
- `ACTIVATE_GUIDE.md`: instructions for enabling OCR services in the Baidu console, with example commands. It mentions the same hard-coded API key as the script, lists "Already had permissions" and "Missing permissions" (which reads like a report from someone who tested the keys), and contains paths such as `/root/.openclaw/workspace/skills/baidu-ocr/baidu_ocr.py` and `/root/.openclaw/media/inbound/...`, which assume a Linux container running as root. No malicious instructions.
- `SETUP_GUIDE.md`: setup steps similar to `ACTIVATE_GUIDE.md`. No malicious instructions.

Behavior of `baidu_ocr.py`:

- Defines `API_KEY` and `SECRET_KEY` as hard-coded string literals at the top of the file. These look like real Baidu API keys.
- Does not read environment variables, `~/.openclaw-env`, or `~/.openclaw/openclaw.json`, although `SKILL.md` tells the user to put keys there.
- `get_access_token()` sends a POST request to `https://aip.baidubce.com/oauth/2.0/token` with the hard-coded keys. This is the correct Baidu endpoint, and the keys are not sent anywhere else.
- `image_to_base64(image_path)` reads a local file and encodes it.
- `ocr_general_basic`, `ocr_general`, `ocr_table`, and `ocr_formula` send POST requests to Baidu API endpoints with the image data and access token.
- `main()` handles command-line arguments, checks file existence, and calls the OCR functions. It takes `sys.argv[1]` as the path, checks it with `Path(image_path).exists()`, and opens it with `open(image_path, "rb")`, so no shell is involved and the path is not exposed to shell injection.
- Imports are limited to `requests`, `base64`, `json`, `sys`, and `pathlib`.

Risk assessment:

- Data exfiltration: none found. The script sends image data to Baidu's official API (`aip.baidubce.com`), which is the stated purpose of the skill. There is no evidence of environment variables, SSH keys, or other local secrets being sent to an unauthorized endpoint.
- Malicious execution: no `eval`, `exec`, or `os.system` calls with untrusted input.
- Persistence: none.
- Prompt injection: `SKILL.md` is clean.
- Obfuscation: none.
- Supply chain: no issues found.
- Hard-coded credentials: present in `baidu_ocr.py`. Anyone with the script can use the key owner's Baidu account, so this is primarily a risk to the owner of the keys. It may be a mistake or a convenience set of "starter" keys, and there is no clear evidence of intentional harm to the user. It is still a vulnerability, and under the review criteria vulnerabilities are classified as `suspicious` rather than `malicious` unless there is proof of intentional exploitation.

Classification: `suspicious`. The two main indicators are the hard-coded API credentials in `baidu_ocr.py` and the discrepancy between the documentation (`SKILL.md`, `metadata.json`) and the implementation regarding credential handling.

```json
{
  "classification": "suspicious",
  "summary": "The skill contains hardcoded Baidu API credentials (API_KEY and SECRET_KEY) within 'baidu_ocr.py', which is a significant security vulnerability. Additionally, there is a critical discrepancy between the documentation ('SKILL.md', 'metadata.json') and the implementation: the documentation instructs users to provide their own credentials via environment variables, but the script ignores these and uses the hardcoded ones instead. While no intentional malicious behavior like data exfiltration was found, the presence of leaked credentials and the failure to follow security best practices for secret management qualify it as suspicious."
}
```

Capability Assessment

⚠ Purpose & Capability

Name/description, required binaries (python3), and requested env vars (BAIDU_API_KEY, BAIDU_SECRET_KEY) match an OCR integration. However the shipped Python script embeds a different API_KEY/SECRET_KEY pair and does not read environment variables — the credential requirements in metadata/SKILL.md are thus inconsistent with the actual code.

⚠ Instruction Scope

SKILL.md instructs the agent/user to set BAIDU_API_KEY and BAIDU_SECRET_KEY and to run the bundled script. The script, however, ignores env/config and uses hard-coded credentials to call Baidu's OCR endpoints, meaning images provided to the skill will be uploaded under the included account rather than the user's. Instructions otherwise stay within OCR purpose and use legitimate Baidu endpoints.

✓ Install Mechanism

No external install/downloads or third-party URLs — the skill is instruction-only with a local Python script. This low install footprint reduces supply-chain risk. Metadata lists dependencies (requests, Pillow); the script imports requests but not Pillow (minor inconsistency).

⚠ Credentials

The skill declares BAIDU_API_KEY and BAIDU_SECRET_KEY as required—which is appropriate for an OCR integration—but the code bypasses these and uses embedded credentials. That is disproportionate and suspicious because it removes the need for the user's keys and routes data through the publisher's account.
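
A quick way to confirm this finding before running the skill is to scan the script for string literals assigned to the credential names. The sketch below is a rough heuristic under stated assumptions: it only catches top-of-line assignments to `API_KEY` or `SECRET_KEY` (the names reported in this review), and `find_hardcoded_keys` is a hypothetical helper, not a general secret scanner.

```python
import re

# Matches lines like: API_KEY = "literal"  (but not API_KEY = os.environ[...]).
HARDCODED_KEY_PATTERN = re.compile(
    r"""^\s*(API_KEY|SECRET_KEY)\s*=\s*["'][^"']+["']""",
    re.MULTILINE,
)


def find_hardcoded_keys(source: str) -> list[str]:
    """Return the credential variable names that are assigned string literals."""
    return [match.group(1) for match in HARDCODED_KEY_PATTERN.finditer(source)]
```

Running `find_hardcoded_keys(Path("baidu_ocr.py").read_text())` and getting a non-empty list would indicate that the embedded credentials are still present.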

✓ Persistence & Privilege

No 'always: true', no install-time changes to system or other skills, and the skill does not request elevated system privileges. It only requires running a Python script on demand.

Version History

v1.0.0

Manual publish - scheme 1

Metadata

Slug baidu-ocr

Version 1.0.0

License MIT-0

Total installs 7

Current installs 6

Historical versions 1

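FAQ

What is Baidu Ocr?

Baidu OCR text recognition. Supports mixed Chinese and English text, formulas, and tables, with an accuracy of 95%+. Uses the Baidu AI Open Platform API. It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 446 cumulative downloads to date.

How do I install Baidu Ocr?

Run the command "/install baidu-ocr" in an OpenClaw or Claude Code chat to install it in one step; no additional configuration is required.

Is Baidu Ocr free?

Yes. Baidu Ocr is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Baidu Ocr support?

Baidu Ocr is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed Baidu Ocr?

It is developed and maintained by nidhov01 (@nidhov01). The current version is v1.0.0.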
