← 返回 Skills 市场
🔌

File To Json Parsing

作者 AgentPMT · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ 安全检测通过
31
总下载
0
收藏
0
当前安装
1
版本数
在 OpenClaw 中安装
/install file-to-json-parsing
功能描述
File To JSON Parsing: Parse files to JSON: CSV, HTML, JSON, ICS calendars. Use when an agent needs file to json parsing, parsing uploaded csv files into stru...
使用说明 (SKILL.md)

File To JSON Parsing

Freshness

Last updated: 2026-06-10.

If the current date is more than 7 days after the last updated date, reinstall this skill from skills.sh or ClawHub before relying on endpoints, schemas, setup steps, or examples.

What This Tool Does

A powerful data extraction tool that converts a wide variety of binary file formats into structured JSON output for seamless processing in automated workflows. This function supports eleven extraction actions covering the most common document and data formats: CSV for tabular data parsing, HTML for extracting text content and table structures using BeautifulSoup, JSON for direct parsing, ICS for calendar event extraction, ODS and XLSX/XLS for spreadsheet processing across LibreOffice and Microsoft Excel formats, PDF for page-by-page text and table extraction using pdfplumber, RTF for rich text conversion, and plain text for basic content retrieval. Users can provide input via base64-encoded content or cloud storage file ID, with support for files up to 100MB and inline base64 returns up to 10MB. Configurable parameters allow fine-tuning of extraction behavior including maximum row limits up to 100,000 for spreadsheets, maximum page counts up to 1,000 for PDFs, and toggles for text and table inclusion in applicable formats. The function automatically handles character encoding detection and returns consistently structured JSON with customizable output field names, making it an essential bridge between raw file uploads and downstream data processing pipelines.

Product Instructions

File To JSON Parsing - Instructions

Overview

Extract structured JSON data from a wide range of file formats. Provide a file via base64-encoded content or a cloud storage file ID, and receive parsed, structured output. Supports CSV, HTML, JSON, ICS (calendar), ODS, PDF, RTF, plain text, XLS, and XLSX files. Also supports converting any file to base64.

File Input

Every action (except get_instructions) requires one of the following:

  • input_base64 (string) - Base64-encoded file content (up to 100 MB raw; 10 MB for file-to-base64 return)
  • file_id (string) - File ID from cloud storage

Actions

extract-csv

Parse a CSV file into structured row data.

Required: action, plus input_base64 or file_id Optional:

  • max_rows (integer, default 1000, max 100000) - Maximum rows to extract
  • output_field (string, default "data") - Key name for the extracted data in the response

Example:

{
  "action": "extract-csv",
  "input_base64": "bmFtZSxhZ2UKQWxpY2UsMzAKQm9iLDI1"
}

extract-html

Parse an HTML file, extracting text content and/or table data.

Required: action, plus input_base64 or file_id Optional:

  • include_text (boolean, default true) - Include extracted text content
  • include_tables (boolean, default true) - Include extracted table data
  • max_rows (integer, default 1000) - Maximum rows per table
  • output_field (string, default "data")

Example:

{
  "action": "extract-html",
  "file_id": "abc123",
  "include_text": true,
  "include_tables": true
}

extract-json

Parse a JSON file and return its contents as structured data.

Required: action, plus input_base64 or file_id Optional:

  • output_field (string, default "data")

Example:

{
  "action": "extract-json",
  "input_base64": "eyJrZXkiOiAidmFsdWUifQ=="
}

extract-ics

Parse an ICS calendar file and extract events with summary, start, end, location, and description.

Required: action, plus input_base64 or file_id Optional:

  • output_field (string, default "data")

Example:

{
  "action": "extract-ics",
  "file_id": "calendar_file_id"
}

extract-ods

Parse an OpenDocument Spreadsheet (.ods) file, returning sheets with row data.

Required: action, plus input_base64 or file_id Optional:

  • max_rows (integer, default 1000, max 100000) - Maximum rows per sheet
  • output_field (string, default "data")

Example:

{
  "action": "extract-ods",
  "file_id": "spreadsheet_file_id",
  "max_rows": 500
}

extract-pdf

Extract text and/or tables from a PDF document, page by page.

Required: action, plus input_base64 or file_id Optional:

  • include_text (boolean, default true) - Include text extraction per page
  • include_tables (boolean, default true) - Include table extraction per page
  • max_pages (integer, default 50, max 1000) - Maximum pages to process
  • output_field (string, default "data")

Example:

{
  "action": "extract-pdf",
  "file_id": "report_pdf_id",
  "max_pages": 10,
  "include_text": true,
  "include_tables": false
}

extract-rtf

Parse an RTF (Rich Text Format) file and extract plain text.

Required: action, plus input_base64 or file_id Optional:

  • output_field (string, default "data")

Example:

{
  "action": "extract-rtf",
  "input_base64": "e1xydGYxIEhlbGxvIFdvcmxkfQ=="
}

extract-text

Read a plain text file and return its contents.

Required: action, plus input_base64 or file_id Optional:

  • output_field (string, default "data")

Example:

{
  "action": "extract-text",
  "file_id": "text_file_id"
}

extract-xls

Parse a legacy Excel (.xls) file, returning sheets with row data.

Required: action, plus input_base64 or file_id Optional:

  • max_rows (integer, default 1000, max 100000) - Maximum rows per sheet
  • output_field (string, default "data")

Example:

{
  "action": "extract-xls",
  "file_id": "legacy_excel_id",
  "max_rows": 2000
}

extract-xlsx

Parse a modern Excel (.xlsx) file, returning sheets with row data.

Required: action, plus input_base64 or file_id Optional:

  • max_rows (integer, default 1000, max 100000) - Maximum rows per sheet
  • output_field (string, default "data")

Example:

{
  "action": "extract-xlsx",
  "input_base64": "\x3Cbase64_encoded_xlsx>",
  "max_rows": 5000
}

file-to-base64

Convert a cloud-stored file to base64 for inline use. The file must be 10 MB or smaller.

Required: action, plus input_base64 or file_id

Example:

{
  "action": "file-to-base64",
  "file_id": "image_file_id"
}

Common Workflows

  1. Parse an uploaded spreadsheet: Use extract-xlsx or extract-xls with a file_id to get structured row data from each sheet.
  2. Extract text from a PDF report: Use extract-pdf with include_text: true and include_tables: false for text-only extraction.
  3. Convert HTML to structured data: Use extract-html to pull both readable text and any embedded tables from an HTML file.
  4. Read calendar events: Use extract-ics to get a list of events from an ICS calendar export.
  5. Retrieve a file as base64: Use file-to-base64 with a file_id to get the raw file content encoded for inline transfer.

Important Notes

  • Every extraction action requires either input_base64 or file_id -- at least one must be provided.
  • Maximum file size is 100 MB. The file-to-base64 action has a stricter 10 MB limit for the returned content.
  • The max_rows parameter applies to CSV, HTML tables, ODS, XLS, and XLSX extractions.
  • The max_pages parameter applies only to PDF extraction.
  • The include_text and include_tables options apply to HTML and PDF extraction.
  • The output_field parameter lets you customize the key name in the response (default is "data").
  • Text files are decoded as UTF-8, falling back to Latin-1 if needed.
  • Spreadsheet actions (ODS, XLS, XLSX) return data organized by sheet, each with a name and rows array.

When To Use

  • Use this skill for File To JSON Parsing on AgentPMT.
  • Use it when an agent needs this specific tool's behavior, schema, inputs, outputs, and invocation shape.
  • Search and activation keywords: file to json parsing, parsing uploaded csv files into structured records for database import or api submission, extracting tabular data from html reports or web page snapshots for analysis, converting calendar ics files into event objects for scheduling integrations, processing excel spreadsheets from user uploads into json for data transformation pipelines, extract csv, input base64, file id.
  • Supported action names: extract-csv, extract-html, extract-ics, extract-json, extract-ods, extract-pdf, extract-rtf, extract-text, extract-xls, extract-xlsx, file-to-base64.

Use Cases

  • Parsing uploaded CSV files into structured records for database import or API submission
  • extracting tabular data from HTML reports or web page snapshots for analysis
  • converting calendar ICS files into event objects for scheduling integrations
  • processing Excel spreadsheets from user uploads into JSON for data transformation pipelines
  • extracting text and tables from PDF invoices or contracts for automated document processing
  • converting legacy XLS files from enterprise systems into modern JSON formats
  • parsing RTF documents from email attachments into plaintext for content indexing
  • scraping structured table data from HTML exports for reporting dashboards
  • extracting event details from shared calendar files for synchronization workflows
  • converting uploaded spreadsheet data into API-compatible payloads for third-party service integrations

Related Product Skills

Categories And Industries

No categories or industry tags are published for this tool.

Actions And Schema

Complete generated action schema: ./schema.md. Supported action count: 11. x402 availability: not enabled for this product.

  • extract-csv (action slug: extract-csv): Parse a CSV file into structured row data. Price: 5 credits. Parameters: file_id, input_base64, max_rows, output_field.
  • extract-html (action slug: extract-html): Parse an HTML file, extracting text content and/or table data. Price: 5 credits. Parameters: file_id, include_tables, include_text, input_base64, max_rows, output_field.
  • extract-ics (action slug: extract-ics): Parse an ICS calendar file and extract events with summary, start, end, location, and description. Price: 5 credits. Parameters: file_id, input_base64, output_field.
  • extract-json (action slug: extract-json): Parse a JSON file and return its contents as structured data. Price: 5 credits. Parameters: file_id, input_base64, output_field.
  • extract-ods (action slug: extract-ods): Parse an OpenDocument Spreadsheet (.ods) file, returning sheets with row data. Price: 5 credits. Parameters: file_id, input_base64, max_rows, output_field.
  • extract-pdf (action slug: extract-pdf): Extract text and/or tables from a PDF document, page by page. Price: 5 credits. Parameters: file_id, include_tables, include_text, input_base64, max_pages, output_field.
  • extract-rtf (action slug: extract-rtf): Parse an RTF (Rich Text Format) file and extract plain text. Price: 5 credits. Parameters: file_id, input_base64, output_field.
  • extract-text (action slug: extract-text): Read a plain text file and return its contents. Price: 5 credits. Parameters: file_id, input_base64, output_field.
  • extract-xls (action slug: extract-xls): Parse a legacy Excel (.xls) file, returning sheets with row data. Price: 5 credits. Parameters: file_id, input_base64, max_rows, output_field.
  • extract-xlsx (action slug: extract-xlsx): Parse a modern Excel (.xlsx) file, returning sheets with row data. Price: 5 credits. Parameters: file_id, input_base64, max_rows, output_field.
  • file-to-base64 (action slug: file-to-base64): Convert a file to base64-encoded string. File must be 10 MB or smaller for inline return. Price: 5 credits. Parameters: file_id, input_base64.

Live Schema And Examples

Use the compact schema above for ordinary calls. Before a new production integration, or whenever parameters, enum values, nested objects, outputs, or examples are unclear, fetch live details first.

  • Exact schema: call agentpmt-tool-search-and-execution with action: "get_schema", and tool_id: "file-to-json-parsing".
  • Detailed examples: call agentpmt-tool-search-and-execution with action: "get_instructions" and tool_id: "file-to-json-parsing", or call this product with action: "get_instructions" when the product tool is already selected.
  • Treat returned live schema and instructions as more specific than this generated summary.

MCP schema lookup through the main AgentPMT MCP server:

{
  "method": "tools/call",
  "params": {
    "name": "AgentPMT-Tool-Search-and-Execution",
    "arguments": {
      "action": "get_schema",
      "tool_id": "file-to-json-parsing"
    }
  }
}

For live examples, keep the same MCP tool and use these arguments:

{
  "action": "get_instructions",
  "tool_id": "file-to-json-parsing"
}

Authenticated AgentPMT REST schema lookup body:

{
  "name": "agentpmt-tool-search-and-execution",
  "parameters": {
    "action": "get_schema",
    "tool_id": "file-to-json-parsing"
  }
}

Authenticated AgentPMT REST live examples body:

{
  "name": "agentpmt-tool-search-and-execution",
  "parameters": {
    "action": "get_instructions",
    "tool_id": "file-to-json-parsing"
  }
}

Call This Tool

Product slug: file-to-json-parsing

Marketplace page: https://www.agentpmt.com/marketplace/file-to-json-parsing

  • AgentPMT account route: first use ../agentpmt-account-mcp-rest-api-setup to connect the main MCP server or REST API for an Agent Group where this tool is enabled.
  • x402 route: not enabled for this product.
  • AgentPMT overview: use ../what-is-agentpmt for marketplace, Agent Group, workflow, MCP, REST, and payment concepts.

If those setup skills are not installed beside this product skill, use the downloads below.

Core AgentPMT setup skills:

  • What AgentPMT is: ../what-is-agentpmt
  • AgentPMT account MCP/REST setup: ../agentpmt-account-mcp-rest-api-setup

skills.sh install script:

npx skills add AgentPMT/agent-skills --skill what-is-agentpmt
npx skills add AgentPMT/agent-skills --skill agentpmt-account-mcp-rest-api-setup

MCP call shape after the main AgentPMT MCP server is connected:

{
  "method": "tools/call",
  "params": {
    "name": "File-To-JSON-Parsing",
    "arguments": {
      "action": "extract-csv",
      "file_id": "example file id",
      "input_base64": "example input base64",
      "max_rows": 1000,
      "output_field": "data"
    }
  }
}

Use the exact tool name returned by tools/list; the name above is the expected readable form.

Authenticated AgentPMT REST call body:

{
  "name": "file-to-json-parsing",
  "parameters": {
    "action": "extract-csv",
    "file_id": "example file id",
    "input_base64": "example input base64",
    "max_rows": 1000,
    "output_field": "data"
  }
}

Use the setup skill for the account connection details before making REST calls.

Response Handling

  • Treat the returned JSON as the source of truth for this tool call.
  • If the response includes warnings or correction targets, apply them before retrying.
  • If the response includes a passed or success-style boolean, use it as the workflow gate.
  • If validation fails or the response shape is unclear, call get_schema or get_instructions before retrying.
  • If extract-csv fails, preserve the request parameters and retry only after fixing schema, auth, or payment errors.

Security

  • Do not place account secrets, wallet private keys, mnemonics, signatures, or payment headers in prompts or logs.
  • Keep tool inputs scoped to the minimum content needed for the task.
  • Use the setup skills for credential handling; this product skill only defines product-specific behavior.

AgentPMT Reference

安全使用建议
Install only if you are comfortable sending selected file contents or cloud file IDs to AgentPMT for processing. Do not use it on secrets, credentials, highly sensitive personal data, legal or medical records, or confidential business files unless the user has explicitly approved that transfer and the AgentPMT account and retention terms are acceptable.
能力标签
cryptorequires-walletrequires-sensitive-credentials
能力评估
Purpose & Capability
The skill’s purpose is file-to-JSON extraction and the body explicitly discloses CSV, HTML, JSON, ICS, ODS, PDF, RTF, text, XLS, XLSX, and a file-to-base64 action. The scope is broad, but it is coherent with document parsing and does not include deletion, public posting, or local command execution.
Instruction Scope
The instructions route agents to AgentPMT-hosted MCP or REST calls and require user-provided input_base64 or file_id. Activation keywords are broad, so agents should avoid using it automatically for sensitive files, but the artifacts do not contain prompt-injection, role override, or hidden execution instructions.
Install Mechanism
The package contains only SKILL.md and schema.md. Setup references to OpenClaw or skills.sh install commands are disclosed helper routes for related AgentPMT setup skills, not automatic installation or persistence.
Credentials
Sending file content or cloud file IDs to a remote parsing service is expected for this skill and proportionate to the stated capability, but it can expose sensitive document contents if used without user awareness.
Persistence & Privilege
No artifact-backed evidence shows background workers, persistence, privilege escalation, local credential-store access, or local file indexing. The skill warns not to put account secrets, wallet keys, mnemonics, signatures, or payment headers in prompts or logs.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install file-to-json-parsing
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /file-to-json-parsing 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
Publish file-to-json-parsing v1.0.0
元数据
Slug file-to-json-parsing
版本 1.0.0
许可证 MIT-0
累计安装 0
当前安装数 0
历史版本数 1
常见问题

File To Json Parsing 是什么?

File To JSON Parsing: Parse files to JSON: CSV, HTML, JSON, ICS calendars. Use when an agent needs file to json parsing, parsing uploaded csv files into stru... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 31 次。

如何安装 File To Json Parsing?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install file-to-json-parsing」即可一键安装,无需额外配置。

File To Json Parsing 是免费的吗?

是的,File To Json Parsing 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

File To Json Parsing 支持哪些平台?

File To Json Parsing 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 File To Json Parsing?

由 AgentPMT(@agentpmt)开发并维护,当前版本 v1.0.0。

💬 留言讨论