← 返回 Skills 市场
upstage-deployment

Upstage Document Classification

作者 Upstage Deployment · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ 安全检测通过
29
总下载
0
收藏
1
当前安装
1
版本数
在 OpenClaw 中安装
/install upstage-document-classification
功能描述
Classify documents into user-defined categories using Upstage Document Classification API. Also supports document splitting for multi-document PDFs. Use when...
使用说明 (SKILL.md)

Upstage Document Classification

Classify documents into user-defined categories with confidence scores. Also supports Document Split for separating multi-document PDFs into individual documents.

Quick Start

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["UPSTAGE_API_KEY"],
    base_url="https://api.upstage.ai/v1/document-classification"
)

response = client.chat.completions.create(
    model="document-classify",
    messages=[{
        "role": "user",
        "content": [{"type": "image_url", "image_url": {"url": "https://example.com/document.pdf"}}]
    }],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "document-classify",
            "schema": {
                "type": "string",
                "oneOf": [
                    {"const": "invoice", "description": "Commercial invoice with itemized charges"},
                    {"const": "receipt", "description": "Payment receipt"},
                    {"const": "contract", "description": "Legal contract or agreement"},
                    {"const": "resume", "description": "Personal resume or CV"}
                ]
            }
        }
    }
)
print(response.choices[0].message.content)

API Key: Always use os.environ["UPSTAGE_API_KEY"]. Get your key at console.upstage.ai.

Endpoint

POST https://api.upstage.ai/v1/document-classification

OpenAI SDK compatible — set base_url to https://api.upstage.ai/v1/document-classification.

Parameters

Parameter Type Required Description
model string Yes document-classify or document-classify-nightly
messages array Yes Single user message with image_url
response_format object Yes JSON Schema defining classification categories
split boolean No Enable multi-document splitting (default: false)
split_criteria array No Additional splitting criteria (used with split=true)

Schema Definition (Important)

Categories are defined using oneOf with const values. The root schema type must be "string", not "object".

{
  "type": "json_schema",
  "json_schema": {
    "name": "document-classify",
    "schema": {
      "type": "string",
      "oneOf": [
        {"const": "invoice", "description": "Commercial invoice"},
        {"const": "receipt", "description": "Payment receipt"},
        {"const": "other", "description": "Other document type"}
      ]
    }
  }
}

Important: Using enum or object-based schemas will return a 400 error. The Classification API requires oneOf with const/description pairs.

Response Structure

{
  "choices": [{
    "message": {
      "content": "invoice",
      "tool_calls": [{
        "function": {
          "arguments": {
            "document_type": {"_value": "invoice", "confidence_score": 0.99},
            "pages": [1, 2]
          }
        }
      }]
    }
  }]
}
  • content: classified category name
  • tool_calls.function.arguments.document_type._value: classified value
  • tool_calls.function.arguments.document_type.confidence_score: 0.0–1.0
  • tool_calls.function.arguments.pages: page range (most useful with split mode)

Document Split (Multi-Document PDFs)

For PDFs containing multiple document types, set extra_body={"split": True} to separate them into groups. Each group is returned as a separate choices entry. See references/document-split.md for the full split workflow with optional split_criteria.

Output Files

  • Default (classify only): \x3Csystem-temp>/\x3Cinput-stem>.classified.json (e.g., /tmp/contract.classified.json).
  • Default (split mode): directory \x3Csystem-temp>/\x3Cinput-stem>.split/ with one file per detected document (e.g., page-001.invoice.pdf).
  • Override: if the user specifies an output path, use it.
  • Always print the resolved absolute path(s) in your response so the user can locate the file(s).

Tips

  • Include "other" in your categories to handle unclassified documents.
  • split is useful as the first step in a document processing pipeline for scanned mixed-document files.
  • A common pattern: classify first → apply category-specific schemas with upstage-information-extraction.
  • Use confidence_score to flag low-confidence documents for manual review.

Detailed References

File Content
references/document-split.md Document split (basic + with criteria), curl example
安全使用建议
This skill appears safe to review as a normal external API integration. Before installing or using it, make sure you are comfortable providing an Upstage API key and sending the documents you classify to Upstage. Avoid using it for highly sensitive documents unless your organization approves Upstage's privacy, retention, and billing terms.
功能分析
Type: OpenClaw Skill Name: upstage-document-classification Version: 1.0.0 The skill bundle provides documentation and code examples for integrating the Upstage Document Classification API into the OpenClaw agent. It uses standard environment variables for authentication (UPSTAGE_API_KEY) and communicates with the legitimate Upstage API endpoint (api.upstage.ai). The instructions in SKILL.md and references/document-split.md are consistent with the stated purpose of document classification and splitting, with no evidence of malicious intent, data exfiltration, or unauthorized command execution.
能力标签
cryptocan-make-purchasesrequires-sensitive-credentials
能力评估
Purpose & Capability
The skill's stated purpose, API endpoint, examples, and split workflow are coherent: it classifies and optionally splits documents through the Upstage Document Classification API. The main user-visible implication is that document data is handled by an external provider.
Instruction Scope
The instructions are API usage documentation and examples. They do not direct hidden actions, override user intent, run automatically, or encourage destructive/bulk operations.
Install Mechanism
There is no install spec and no code files; the skill is instruction-only, so there is no reviewed artifact evidence of package installation, downloaded executables, or local helper execution.
Credentials
The skill requires an UPSTAGE_API_KEY in the environment for a purpose-aligned external API, but the registry metadata does not declare required environment variables or a primary credential.
Persistence & Privilege
The skill describes writing classification results or split PDFs to system temporary paths or a user-specified output path. This is disclosed and purpose-aligned, with no evidence of background persistence.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install upstage-document-classification
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /upstage-document-classification 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.0
- Initial release of upstage-document-classification. - Classifies documents into user-defined categories using the Upstage Document Classification API. - Supports splitting multi-document PDFs into individual documents. - Provides category confidence scores and outputs classified files or split documents. - Compatible with OpenAI SDK via base_url configuration. - Includes support for category definition via JSON Schema (must use "string" type with "oneOf" and "const" pairs).
元数据
Slug upstage-document-classification
版本 1.0.0
许可证 MIT-0
累计安装 1
当前安装数 1
历史版本数 1
常见问题

Upstage Document Classification 是什么?

Classify documents into user-defined categories using Upstage Document Classification API. Also supports document splitting for multi-document PDFs. Use when... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 29 次。

如何安装 Upstage Document Classification?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install upstage-document-classification」即可一键安装,无需额外配置。

Upstage Document Classification 是免费的吗?

是的,Upstage Document Classification 完全免费,采用 MIT-0 许可证,可自由下载、安装和使用。

Upstage Document Classification 支持哪些平台?

Upstage Document Classification 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(cross-platform)。

谁开发了 Upstage Document Classification?

由 Upstage Deployment(@upstage-deployment)开发并维护,当前版本 v1.0.0。

💬 留言讨论