Description

Read, write and manage Lark/Feishu Sheets (spreadsheets) and download Lark/Feishu cloud files via Lark OpenAPI. Reads Feishu app credentials (appId/appSecret...

Usage (SKILL.md)

Lark/Feishu Sheets & Cloud File Download (with PDF extraction)

Name: Lark/Feishu Sheets & Cloud File Download (with PDF extraction)
Author: mli-cj

Read, write and manage Lark/Feishu Sheets, and download Lark/Feishu cloud files, by calling the official OpenAPI from local scripts.

Prerequisites

python3 on PATH

Feishu/Lark app credentials configured in ~/.openclaw/openclaw.json under channels.feishu:

{
  "channels": {
    "feishu": {
      "appId": "cli_xxx",
      "appSecret": "xxx",
      "domain": "feishu"
    }
  }
}

The Feishu/Lark app must have Sheets read & write permissions and Drive file download permissions enabled in the developer console.
The target spreadsheet/file must be shared with the app/bot identity.
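
The credential lookup described above can be sketched as follows. This is a minimal illustration rather than the skill's actual code: the function name `load_feishu_credentials` is hypothetical, and treating `domain` as optional with a default of "feishu" is an assumption.

```python
import json
from pathlib import Path


def load_feishu_credentials(config_path="~/.openclaw/openclaw.json"):
    """Return (app_id, app_secret, domain) from the channels.feishu block."""
    path = Path(config_path).expanduser()
    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    try:
        feishu = config["channels"]["feishu"]
        app_id = feishu["appId"]
        app_secret = feishu["appSecret"]
    except KeyError as exc:
        raise KeyError(f"missing key {exc} in {path}") from exc
    # Assumption: "domain" may be omitted, in which case "feishu" is used.
    return app_id, app_secret, feishu.get("domain", "feishu")
```

Keep the returned secret in memory only; as noted under Notes / Gotchas, it should never be printed or pasted into chat.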

Quick Start

Get spreadsheet token from the URL

Example URL: https://.../sheets/YOUR_SPREADSHEET_TOKEN?sheet=SHEET_ID

spreadsheet_token = YOUR_SPREADSHEET_TOKEN
sheet query param (often a sheetId) = SHEET_ID
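
For reference, the URL-to-token mapping above can be reproduced in a few lines of Python. This is a sketch only: `parse_sheet_url` is a hypothetical helper, and the regex used by the skill's own scripts may differ.

```python
import re
from urllib.parse import parse_qs, urlparse


def parse_sheet_url(url):
    """Return (spreadsheet_token, sheet_id) from a Sheets URL.

    sheet_id is None when the URL has no sheet= query parameter.
    """
    parsed = urlparse(url)
    match = re.search(r"/sheets/([^/?#]+)", parsed.path)
    if match is None:
        raise ValueError(f"no /sheets/<token> segment in URL: {url}")
    sheet_values = parse_qs(parsed.query).get("sheet")
    sheet_id = sheet_values[0] if sheet_values else None
    return match.group(1), sheet_id
```

With the example URL shape above, the function returns the path segment after `/sheets/` as the token and the `sheet` query value as the sheet ID.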

Read / Export

# Export a single range to CSV
python3 {baseDir}/scripts/sheets_export.py \
  --token YOUR_SPREADSHEET_TOKEN \
  --range 'SHEET_ID!A1:Z200' \
  --csv /tmp/sheet.csv

# Or export to JSON (recommended for multi-range)
python3 {baseDir}/scripts/sheets_export.py \
  --url 'https://xxx.larksuite.com/sheets/YOUR_SPREADSHEET_TOKEN?sheet=SHEET_ID' \
  --range 'SHEET_ID!A1:Z200' \
  --json /tmp/sheet.json

Then load /tmp/sheet.csv or /tmp/sheet.json and continue with analysis/summarization.
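
As an illustration of the "load and continue" step, the sketch below reads the exported CSV with the standard library and counts non-empty cells per column. It assumes the export is UTF-8 and that the first row holds column headers; neither is guaranteed by the export script, and both function names are hypothetical.

```python
import csv


def load_rows(csv_path):
    """Read every row of the exported CSV into a list of lists."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return list(csv.reader(fh))


def count_filled(rows):
    """Treat the first row as headers and count non-empty cells per column."""
    if not rows:
        return {}
    header, body = rows[0], rows[1:]
    counts = {name: 0 for name in header}
    for row in body:
        # zip stops at the shorter sequence, so ragged rows are tolerated.
        for name, cell in zip(header, row):
            if cell.strip():
                counts[name] += 1
    return counts
```

A quick `count_filled(load_rows("/tmp/sheet.csv"))` shows which columns actually carry data before you widen the exported range.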

Write / Update

# List all sheet tabs
python3 {baseDir}/scripts/sheets_write.py \
  --token YOUR_SPREADSHEET_TOKEN list-sheets

# Write values to a single range
python3 {baseDir}/scripts/sheets_write.py \
  --token YOUR_SPREADSHEET_TOKEN \
  write --range 'SheetId!A1:C2' --values '[["a","b","c"],["d","e","f"]]'

# Write values from a JSON file
python3 {baseDir}/scripts/sheets_write.py \
  --token YOUR_SPREADSHEET_TOKEN \
  write --range 'SheetId!A1:C2' --values-file /tmp/data.json

# Batch write multiple ranges at once
python3 {baseDir}/scripts/sheets_write.py \
  --token YOUR_SPREADSHEET_TOKEN \
  batch-write --batch '[{"range":"Sheet1!A1:B1","values":[["x","y"]]},{"range":"Sheet1!A2:B2","values":[["1","2"]]}]'

# Add a new sheet tab
python3 {baseDir}/scripts/sheets_write.py \
  --token YOUR_SPREADSHEET_TOKEN \
  add-sheet --title 'NewSheet'

# Clone an existing sheet's values into a new tab
python3 {baseDir}/scripts/sheets_write.py \
  --token YOUR_SPREADSHEET_TOKEN \
  clone-sheet --source-sheet-id abc123 --title 'ClonedSheet' --clone-range 'A1:Z200'
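
Hand-writing the `--batch` JSON gets error-prone beyond a couple of rows. The sketch below generates it from a list of rows, one range per row starting at column A. The helpers `column_letter` and `build_batch` are hypothetical and not part of the skill.

```python
import json


def column_letter(index):
    """Convert a 1-based column index to A1 letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def build_batch(sheet, rows, start_row=1):
    """Build one {"range", "values"} entry per row, starting at column A."""
    batch = []
    for offset, values in enumerate(rows):
        if not values:
            raise ValueError(f"row {offset} is empty")
        row_no = start_row + offset
        last_col = column_letter(len(values))
        batch.append({
            "range": f"{sheet}!A{row_no}:{last_col}{row_no}",
            "values": [list(values)],
        })
    return batch


payload = json.dumps(build_batch("Sheet1", [["x", "y"], ["1", "2"]]))
```

`payload` can then be passed as the `--batch` argument. Saving it to a file for `--batch-file` should also work, assuming that flag expects the same JSON structure.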

Using a URL instead of --token

Both scripts accept --url to auto-extract the spreadsheet token:

python3 {baseDir}/scripts/sheets_write.py \
  --url 'https://xxx.larksuite.com/sheets/YOUR_SPREADSHEET_TOKEN?sheet=SHEET_ID' \
  write --range 'SHEET_ID!A1:B1' --values '[["hello","world"]]'

File Download

Download cloud files (PDF, documents, etc.) from Lark/Feishu Drive.

Get file token from the URL

Example URL: https://.../file/YOUR_FILE_TOKEN

file_token = YOUR_FILE_TOKEN

Download a file

# Download by URL (PDF files auto-extract text to .txt)
python3 {baseDir}/scripts/file_download.py \
  --url "https://.../file/YOUR_FILE_TOKEN" \
  --out /tmp/report.pdf

# Download by file token directly
python3 {baseDir}/scripts/file_download.py \
  --file-token YOUR_FILE_TOKEN \
  --out /tmp/report.pdf

# Force text extraction for non-.pdf files
python3 {baseDir}/scripts/file_download.py \
  --file-token YOUR_FILE_TOKEN \
  --out /tmp/document.bin --extract-text

Reading downloaded PDF content

When --out ends with .pdf, the script automatically:

Extracts text to a .txt file (e.g. /tmp/report.pdf → /tmp/report.txt)
Extracts embedded images to a _images/ directory (e.g. /tmp/report_images/img-000.png, ...)
If text is garbled/unreadable, renders each page as a PNG image to _pages/ directory for visual reading

Text extraction priority: pdfplumber → pypdf → pdftotext (poppler). All Python packages are auto-installed via pip on first use. Includes garbled-text detection — if extracted text is unreadable (e.g. scanned PDF, special fonts), pages are rendered to images automatically.

Image extraction priority: pypdf → pdfimages (poppler).

Page rendering (garbled fallback): pymupdf → pdf2image.
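
The garbled-text check is not specified in detail here, so the following is only one plausible heuristic, not the skill's actual logic. It flags text that is too short or dominated by replacement, control, private-use, or unassigned characters. The function name and both thresholds are assumptions chosen for illustration.

```python
import unicodedata


def looks_garbled(text, min_chars=20, max_bad_ratio=0.3):
    """Return True if extracted text is probably unreadable.

    Hypothetical heuristic: too little text overall, or too large a share
    of characters that normally indicate a failed extraction.
    """
    stripped = "".join(text.split())  # drop all whitespace
    if len(stripped) < min_chars:
        return True
    bad = 0
    for ch in stripped:
        category = unicodedata.category(ch)
        # U+FFFD is the replacement character; Cc = control,
        # Co = private use, Cn = unassigned.
        if ch == "\ufffd" or category in ("Cc", "Co", "Cn"):
            bad += 1
    return bad / len(stripped) > max_bad_ratio
```

A scanned PDF typically yields an empty or near-empty string and is caught by the length check, while PDFs with unmapped fonts tend to produce private-use or replacement characters and are caught by the ratio check.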

The typical workflow is:

Run the download script
If text is readable → read /tmp/report.txt with the Read tool
If text is garbled → read page images in /tmp/report_pages/ with the Read tool (AI vision)
Read embedded images in /tmp/report_images/ for charts, diagrams, etc.
Summarize / analyze the content

For non-PDF files, use --extract-text to force extraction.
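
The workflow above can be sketched as a small helper that derives the output locations from the `--out` path and decides what to read. The directory naming follows the examples in this section; the page-image file names and both function names are assumptions, and the presence of rendered pages is taken to mean the text was judged garbled.

```python
from pathlib import Path


def pdf_outputs(out_path):
    """Map /tmp/report.pdf to its .txt, _images/ and _pages/ locations."""
    out = Path(out_path)
    return {
        "text": out.with_suffix(".txt"),
        "images": out.with_name(out.stem + "_images"),
        "pages": out.with_name(out.stem + "_pages"),
    }


def choose_sources(out_path):
    """Return files to read: page images if rendered, else the text file,
    followed by any embedded images."""
    paths = pdf_outputs(out_path)
    sources = []
    pages = sorted(paths["pages"].glob("*.png")) if paths["pages"].is_dir() else []
    if pages:
        # Rendered pages exist only when text extraction was unreadable.
        sources.extend(pages)
    elif paths["text"].is_file():
        sources.append(paths["text"])
    if paths["images"].is_dir():
        sources.extend(sorted(paths["images"].iterdir()))
    return sources
```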

Write Subcommands Reference

Subcommand	Description	Key flags
`list-sheets`	List all sheet tabs (id, title, index)	—
`write`	Write values to a single range	`--range`, `--values` or `--values-file`
`batch-write`	Write values to multiple ranges in one call	`--batch` or `--batch-file`
`add-sheet`	Create a new empty sheet tab	`--title`
`clone-sheet`	Clone values from an existing sheet to a new tab	`--source-sheet-id`, `--title`, `--clone-range`

All subcommands support --dry-run to preview without executing.

Notes / Gotchas

Range format: the API accepts "{sheetId}!A1:Z200" or "{sheetTitle}!A1:Z200".
- If you don't know the sheet ID or title, run list-sheets first, or start with the sheet= value from the URL.
Large sheets: export only the needed columns/rows first; widen the range iteratively.
Values format: must be a JSON array of arrays (rows x columns), e.g. [["a","b"],["c","d"]].
Secrets: the script reads appId/appSecret from ~/.openclaw/openclaw.json. Do not print or paste those credentials into chat.
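
Because a malformed `--values` string is an easy mistake, it can help to validate the JSON locally before calling the script. A minimal sketch, with hypothetical helper names:

```python
import json


def parse_values(raw):
    """Parse a --values style string and check it is an array of row arrays."""
    values = json.loads(raw)
    if not isinstance(values, list) or not values:
        raise ValueError("values must be a non-empty JSON array of rows")
    for i, row in enumerate(values):
        if not isinstance(row, list):
            raise ValueError(f"row {i} is not an array: {row!r}")
    return values


def shape(values):
    """Return (row_count, widest_row) to compare against the target range."""
    return len(values), max(len(row) for row in values)
```

For example, `shape(parse_values('[["a","b"],["c","d"]]'))` gives 2 rows by 2 columns, which should match a range such as `A1:B2`.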

Troubleshooting

403 / permission errors:
- Confirm the sheet/file has been shared with the app/bot identity.
- Confirm the Lark/Feishu app has the required permissions (Sheets read & write, Drive file download) enabled in the developer console.
values_batch_get failed / values_batch_update failed with non-zero code:
- Most often a bad range string. Try a smaller range or verify the sheetId/title via list-sheets.
addSheet failed:
- The title may already exist. Sheet titles must be unique within a spreadsheet.
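
The "non-zero code" wording refers to the response envelope returned by the OpenAPI, which carries `code`, `msg`, and `data` fields. The sketch below shows how such a response might be checked; the `LarkApiError` class and `unwrap` helper are hypothetical and the skill's own error handling may look different.

```python
class LarkApiError(RuntimeError):
    """Raised when the OpenAPI returns a non-zero code."""

    def __init__(self, action, code, msg):
        super().__init__(f"{action} failed: code={code} msg={msg}")
        self.code = code
        self.msg = msg


def unwrap(response, action):
    """Return response["data"] on success, raise LarkApiError otherwise."""
    code = response.get("code")
    if code != 0:
        raise LarkApiError(action, code, response.get("msg"))
    return response.get("data", {})
```

Surfacing both `code` and `msg` in the error makes it easier to tell a bad range string apart from a permission problem.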

External Endpoints

This skill calls only the following Lark/Feishu OpenAPI endpoints. Separately, the first-use auto-install of PDF libraries contacts PyPI via pip.

URL	Purpose
`https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal`	Obtain tenant access token
`https://open.feishu.cn/open-apis/sheets/v3/spreadsheets/*/sheets/query`	List sheet tabs
`https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/*/values_batch_get`	Read cell values
`https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/*/values_batch_update`	Write cell values
`https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/*/sheets_batch_update`	Add/manage sheet tabs
`https://open.feishu.cn/open-apis/drive/v1/files/*/download`	Download file content

For Lark (international) users, the base URL is https://open.larksuite.com instead.
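
To make the base-URL switch concrete, the sketch below builds endpoint URLs per domain and prepares (without sending) the tenant access token request. The `"lark"` key for the international domain is an assumption, as are the helper names; the request body uses the `app_id` and `app_secret` fields expected by the token endpoint.

```python
import json
import urllib.request

BASE_URLS = {
    "feishu": "https://open.feishu.cn",
    # Assumption: "lark" is the config value used for the international domain.
    "lark": "https://open.larksuite.com",
}


def api_url(domain, path):
    """Join the domain's base URL with an open-apis path."""
    return f"{BASE_URLS[domain]}/open-apis/{path.lstrip('/')}"


def token_request(domain, app_id, app_secret):
    """Build the POST request for a tenant access token (not sent here)."""
    body = json.dumps({"app_id": app_id, "app_secret": app_secret}).encode("utf-8")
    return urllib.request.Request(
        api_url(domain, "auth/v3/tenant_access_token/internal"),
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the exchange; it is left out here so the example runs without network access or real credentials.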

Security & Privacy

Credentials are local. appId/appSecret are read from ~/.openclaw/openclaw.json and only sent to the official Feishu/Lark OpenAPI for token exchange.
Sheet and file data is exchanged only with Feishu/Lark. All read/write operations go through the official Lark OpenAPI, and downloads are written only to the output paths you specify.
Scripts have a narrow footprint. They read the OpenClaw config file (or the path set in OPENCLAW_CONFIG), access the target spreadsheet or file, and write the requested output files and extraction directories.

Security Recommendations

This skill is internally coherent with its description, but note a few practical safety points before installing or using it:

(1) The code reads your Feishu/Lark appId and appSecret from ~/.openclaw/openclaw.json (or the OPENCLAW_CONFIG path). Those credentials will be used to fetch tenant tokens and call the official OpenAPI; only provide credentials you intend the skill to use.
(2) The scripts auto-install Python PDF libraries from PyPI on first use; if you prefer, run them inside a virtualenv or container to avoid modifying your system Python environment.
(3) The tool will download files to paths you supply and may create image/text output directories; review and choose output paths carefully.
(4) The skill source and homepage are not provided in registry metadata. If you need stronger provenance guarantees, review the included scripts yourself or only install from a trusted source.
(5) If you have sensitive credentials in your OpenClaw config that you do not want available to third-party code, consider creating a dedicated Feishu app with minimal scopes and rotating credentials after testing.

Functional Analysis

Type: OpenClaw Skill
Name: feishu-lark-sheets-edit
Version: 1.3.1

The skill provides legitimate tools for interacting with Lark/Feishu Sheets and Drive APIs. It handles authentication securely by reading credentials from a local configuration file (~/.openclaw/openclaw.json) and communicating only with official Lark/Feishu endpoints. The scripts include robust PDF processing features, such as text extraction and page rendering. They automatically install the necessary Python libraries (e.g., pdfplumber, pypdf) via pip, but this behavior is transparently documented and limited to the stated purpose. No evidence of malicious intent, data exfiltration, or prompt injection was found.

Capability Assessment

✓ Purpose & Capability

Name/description match the actual behavior: the scripts implement Sheets read/write and Drive file download using Feishu/Lark OpenAPI. The scripts expect appId/appSecret in ~/.openclaw/openclaw.json (or via OPENCLAW_CONFIG), which is a reasonable way to authenticate for these APIs.

✓ Instruction Scope

SKILL.md and the included scripts limit actions to reading the OpenClaw config, fetching tenant tokens, calling Feishu/Lark API endpoints, downloading files, extracting PDF text/images, and writing user-specified output files. I found no instructions to read unrelated system files or send data to external endpoints beyond the official OpenAPI hosts.

ℹ Install Mechanism

This is an instruction-only skill (no install spec). The scripts auto-install Python packages (pdfplumber, pypdf, pymupdf, etc.) using pip at runtime if missing. Auto-installing packages from PyPI is expected for PDF extraction but does modify the environment and requires network access to PyPI; consider running in an isolated virtualenv if undesirable.

✓ Credentials

The skill does not request unrelated credentials or environment variables. It reads appId/appSecret from the documented OpenClaw config path (or OPENCLAW_CONFIG). That access is required to obtain API tokens for the listed functionality and is proportionate to the stated purpose.

✓ Persistence & Privilege

The skill is user-invocable, not forced (always: false), and does not attempt to modify other skills or system-wide agent settings. It writes only to user-specified output paths (and creates support directories for extraction). Autonomous invocation is allowed by platform default but is not combined with elevated/hidden privileges here.

Version History

v1.3.1

Add required app permissions (API scopes) documentation to README, update skill title

v1.3.0

Robust PDF extraction: pdfplumber for text, garbled-text detection, auto page-to-image rendering for scanned/special-font PDFs

v1.2.1

Update README with PDF text/image extraction documentation

v1.2.0

PDF text/image extraction via pypdf (pure Python, auto-installed), no system dependencies required

v1.1.2

Add Drive API reference, Security & Credentials section in README, clean up token placeholders

v1.1.1

Declare credential file access (~/.openclaw/openclaw.json) in structured metadata

v1.1.0

Add file download support for Lark/Feishu cloud files (PDF, etc.)

v1.0.3

Fix: explicitly declare credential file access (~/.openclaw/openclaw.json) in registry description metadata

v1.0.2

Fix: sanitize all example tokens, fix docstring file name references (lark_sheets_* -> sheets_*), align regex pattern between export/write scripts, remove unused variables

v1.0.1

Fix security scan: replace hardcoded paths with ~ expansion, sanitize example tokens in docstrings

v1.0.0

Initial release: read/export and write/update Lark/Feishu Sheets via OpenAPI

Metadata

Slug feishu-lark-sheets-edit

Version 1.3.1

License MIT-0

Total installs 0

Current installs 0

Version count 11

FAQ