upstage-deployment

Upstage Document Classification

by Upstage Deployment · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
29 downloads · 0 stars · 1 active install · 1 version
Install in OpenClaw
/install upstage-document-classification
Description
Classify documents into user-defined categories using Upstage Document Classification API. Also supports document splitting for multi-document PDFs. Use when...
README (SKILL.md)

Upstage Document Classification

Classify documents into user-defined categories with confidence scores. Also supports Document Split for separating multi-document PDFs into individual documents.

Quick Start

import os
from openai import OpenAI

# Read the key from the environment; never hard-code it.
client = OpenAI(
    api_key=os.environ["UPSTAGE_API_KEY"],
    base_url="https://api.upstage.ai/v1/document-classification"
)

response = client.chat.completions.create(
    model="document-classify",
    # A single user message carrying the document as an image_url part.
    messages=[{
        "role": "user",
        "content": [{"type": "image_url", "image_url": {"url": "https://example.com/document.pdf"}}]
    }],
    # Categories are declared as oneOf/const pairs under a root "string" schema.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "document-classify",
            "schema": {
                "type": "string",
                "oneOf": [
                    {"const": "invoice", "description": "Commercial invoice with itemized charges"},
                    {"const": "receipt", "description": "Payment receipt"},
                    {"const": "contract", "description": "Legal contract or agreement"},
                    {"const": "resume", "description": "Personal resume or CV"}
                ]
            }
        }
    }
)
# The message content is the predicted category name, e.g. "invoice".
print(response.choices[0].message.content)

API Key: Always read the key from os.environ["UPSTAGE_API_KEY"] rather than hard-coding it. Get your key at console.upstage.ai.

Endpoint

POST https://api.upstage.ai/v1/document-classification

The endpoint is compatible with the OpenAI SDK: set base_url to https://api.upstage.ai/v1/document-classification.

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | document-classify or document-classify-nightly |
| messages | array | Yes | Single user message with image_url |
| response_format | object | Yes | JSON Schema defining classification categories |
| split | boolean | No | Enable multi-document splitting (default: false) |
| split_criteria | array | No | Additional splitting criteria (used with split=true) |

Schema Definition (Important)

Categories are defined using oneOf with const values. The root schema type must be "string", not "object".

{
  "type": "json_schema",
  "json_schema": {
    "name": "document-classify",
    "schema": {
      "type": "string",
      "oneOf": [
        {"const": "invoice", "description": "Commercial invoice"},
        {"const": "receipt", "description": "Payment receipt"},
        {"const": "other", "description": "Other document type"}
      ]
    }
  }
}

Important: Using enum or object-based schemas will return a 400 error. The Classification API requires oneOf with const/description pairs.
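
Since the schema shape is strict, it can help to generate it from a plain mapping instead of writing it by hand. The following is a minimal sketch, not part of the skill itself: build_response_format is a hypothetical helper name, and it simply reproduces the oneOf/const layout shown above.

```python
from typing import Any


def build_response_format(categories: dict[str, str]) -> dict[str, Any]:
    """Build a response_format payload from {category: description} pairs.

    Hypothetical helper; it mirrors the schema layout documented above.
    """
    if not categories:
        raise ValueError("at least one category is required")
    return {
        "type": "json_schema",
        "json_schema": {
            "name": "document-classify",
            "schema": {
                # Root type must be "string"; categories go in oneOf, not enum.
                "type": "string",
                "oneOf": [
                    {"const": name, "description": description}
                    for name, description in categories.items()
                ],
            },
        },
    }


response_format = build_response_format({
    "invoice": "Commercial invoice",
    "receipt": "Payment receipt",
    "other": "Other document type",
})
```

The result can be passed directly as the response_format argument of client.chat.completions.create.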

Response Structure

{
  "choices": [{
    "message": {
      "content": "invoice",
      "tool_calls": [{
        "function": {
          "arguments": {
            "document_type": {"_value": "invoice", "confidence_score": 0.99},
            "pages": [1, 2]
          }
        }
      }]
    }
  }]
}

Fields are relative to each choices[i].message:
  • content: classified category name
  • tool_calls[0].function.arguments.document_type._value: classified value
  • tool_calls[0].function.arguments.document_type.confidence_score: confidence from 0.0 to 1.0
  • tool_calls[0].function.arguments.pages: page range (most useful with split mode)
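
The fields listed above can be pulled out with a small helper. This is a sketch under two assumptions: parse_classification is a hypothetical name, and arguments may arrive either as an object (as shown above) or as a JSON-encoded string (as OpenAI-style tool calls often do), so both cases are handled.

```python
import json
from typing import Any


def parse_classification(choice: dict[str, Any]) -> dict[str, Any]:
    """Extract label, confidence, and pages from one `choices` entry given as a dict."""
    message = choice["message"]
    result = {"label": message.get("content"), "confidence": None, "pages": []}
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        arguments = tool_calls[0]["function"]["arguments"]
        # Assumption: arguments may be a JSON string rather than an object.
        if isinstance(arguments, str):
            arguments = json.loads(arguments)
        document_type = arguments.get("document_type", {})
        result["label"] = document_type.get("_value", result["label"])
        result["confidence"] = document_type.get("confidence_score")
        result["pages"] = arguments.get("pages", [])
    return result


# Sample entry copied from the response structure above.
sample_choice = {
    "message": {
        "content": "invoice",
        "tool_calls": [{
            "function": {
                "arguments": {
                    "document_type": {"_value": "invoice", "confidence_score": 0.99},
                    "pages": [1, 2],
                }
            }
        }],
    }
}

parsed = parse_classification(sample_choice)
print(parsed)  # {'label': 'invoice', 'confidence': 0.99, 'pages': [1, 2]}
```

When working with the OpenAI SDK response object, response.model_dump()["choices"] should yield dictionaries of this shape.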

Document Split (Multi-Document PDFs)

For PDFs containing multiple document types, set extra_body={"split": True} to separate them into groups. Each group is returned as a separate choices entry. See references/document-split.md for the full split workflow with optional split_criteria.
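
A split request might be assembled as in the sketch below. build_split_request is a hypothetical helper, and placing split_criteria next to split inside extra_body is an assumption; the item format of split_criteria is left to references/document-split.md and is passed through unchanged here.

```python
from typing import Any, Optional


def build_split_request(
    document_url: str,
    response_format: dict[str, Any],
    split_criteria: Optional[list[Any]] = None,
) -> dict[str, Any]:
    """Assemble keyword arguments for client.chat.completions.create in split mode."""
    # Non-standard fields travel in extra_body so the OpenAI SDK forwards them as-is.
    extra_body: dict[str, Any] = {"split": True}
    if split_criteria:
        # Assumption: split_criteria sits alongside split in the request body.
        extra_body["split_criteria"] = split_criteria
    return {
        "model": "document-classify",
        "messages": [{
            "role": "user",
            "content": [{"type": "image_url", "image_url": {"url": document_url}}],
        }],
        "response_format": response_format,
        "extra_body": extra_body,
    }


request = build_split_request(
    "https://example.com/document.pdf",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "document-classify",
            "schema": {
                "type": "string",
                "oneOf": [
                    {"const": "invoice", "description": "Commercial invoice"},
                    {"const": "other", "description": "Other document type"},
                ],
            },
        },
    },
)

# With a configured client (see Quick Start), each detected document is one choice:
# response = client.chat.completions.create(**request)
# for choice in response.choices:
#     print(choice.message.content)
```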

Output Files

  • Default (classify only): <system-temp>/<input-stem>.classified.json (e.g., /tmp/contract.classified.json).
  • Default (split mode): directory <system-temp>/<input-stem>.split/ with one file per detected document (e.g., page-001.invoice.pdf).
  • Override: if the user specifies an output path, use it.
  • Always print the resolved absolute path(s) in your response so the user can locate the file(s).
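
The path rules above could be implemented roughly as follows. This is only a sketch: resolve_output_path is a hypothetical helper, and it assumes the system temp directory is whatever tempfile.gettempdir() reports on the current platform.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_output_path(
    input_path: str,
    split: bool = False,
    override: Optional[str] = None,
) -> Path:
    """Return the absolute output location for a classification or split run."""
    if override:
        # A user-specified path always wins over the defaults.
        return Path(override).expanduser().resolve()
    stem = Path(input_path).stem
    # Split mode targets a directory; classify-only targets a JSON file.
    suffix = ".split" if split else ".classified.json"
    return (Path(tempfile.gettempdir()) / f"{stem}{suffix}").resolve()


classify_target = resolve_output_path("scans/contract.pdf")
split_target = resolve_output_path("scans/contract.pdf", split=True)

# Print the resolved absolute paths so the user can locate the output.
print(classify_target)
print(split_target)
```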

Tips

  • Include "other" in your categories to handle unclassified documents.
  • split is useful as the first step in a document processing pipeline for scanned mixed-document files.
  • A common pattern: classify first → apply category-specific schemas with upstage-information-extraction.
  • Use confidence_score to flag low-confidence documents for manual review.
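
The review tip can be expressed as a simple threshold check. The sketch below is illustrative only: needs_manual_review and the 0.8 cutoff are hypothetical, and a suitable threshold depends on your own documents.

```python
from typing import Optional

REVIEW_THRESHOLD = 0.8  # hypothetical cutoff; tune it against your own data


def needs_manual_review(
    label: str,
    confidence: Optional[float],
    threshold: float = REVIEW_THRESHOLD,
) -> bool:
    """Flag documents that fell into "other", lack a score, or scored below the threshold."""
    if confidence is None:
        return True
    return label == "other" or confidence < threshold


results = [("invoice", 0.99), ("receipt", 0.55), ("other", 0.97)]
flagged = [label for label, score in results if needs_manual_review(label, score)]
print(flagged)  # ['receipt', 'other']
```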

Detailed References

| File | Content |
| --- | --- |
| references/document-split.md | Document split (basic + with criteria), curl example |
Usage Guidance
This skill appears safe to review as a normal external API integration. Before installing or using it, make sure you are comfortable providing an Upstage API key and sending the documents you classify to Upstage. Avoid using it for highly sensitive documents unless your organization approves Upstage's privacy, retention, and billing terms.
Capability Analysis
Type: OpenClaw Skill
Name: upstage-document-classification
Version: 1.0.0

The skill bundle provides documentation and code examples for integrating the Upstage Document Classification API into the OpenClaw agent. It uses standard environment variables for authentication (UPSTAGE_API_KEY) and communicates with the legitimate Upstage API endpoint (api.upstage.ai). The instructions in SKILL.md and references/document-split.md are consistent with the stated purpose of document classification and splitting, with no evidence of malicious intent, data exfiltration, or unauthorized command execution.
Capability Tags
crypto, can-make-purchases, requires-sensitive-credentials
Capability Assessment
Purpose & Capability
The skill's stated purpose, API endpoint, examples, and split workflow are coherent: it classifies and optionally splits documents through the Upstage Document Classification API. The main user-visible implication is that document data is handled by an external provider.
Instruction Scope
The instructions are API usage documentation and examples. They do not direct hidden actions, override user intent, run automatically, or encourage destructive/bulk operations.
Install Mechanism
There is no install spec and no code files; the skill is instruction-only, so there is no reviewed artifact evidence of package installation, downloaded executables, or local helper execution.
Credentials
The skill requires an UPSTAGE_API_KEY in the environment for a purpose-aligned external API, but the registry metadata does not declare required environment variables or a primary credential.
Persistence & Privilege
The skill describes writing classification results or split PDFs to system temporary paths or a user-specified output path. This is disclosed and purpose-aligned, with no evidence of background persistence.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install upstage-document-classification
  3. After installation, invoke the skill by name or use /upstage-document-classification
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
  • Initial release of upstage-document-classification.
  • Classifies documents into user-defined categories using the Upstage Document Classification API.
  • Supports splitting multi-document PDFs into individual documents.
  • Provides category confidence scores and outputs classified files or split documents.
  • Compatible with OpenAI SDK via base_url configuration.
  • Includes support for category definition via JSON Schema (must use "string" type with "oneOf" and "const" pairs).
Metadata
Slug upstage-document-classification
Version 1.0.0
License MIT-0
All-time Installs 1
Active Installs 1
Total Versions 1
Frequently Asked Questions

What is Upstage Document Classification?

Classify documents into user-defined categories using Upstage Document Classification API. Also supports document splitting for multi-document PDFs. Use when... It is an AI Agent Skill for Claude Code / OpenClaw, with 29 downloads so far.

How do I install Upstage Document Classification?

Run "/install upstage-document-classification" in the OpenClaw or Claude Code chat to install it in one step. The skill itself needs no further installation, but API calls require UPSTAGE_API_KEY to be set in the environment.

Is Upstage Document Classification free?

Yes, the skill itself is free and licensed under MIT-0, so you can download, install, and use it at no cost. Calls to the Upstage API are made with your own API key and are subject to Upstage's billing terms.

Which platforms does Upstage Document Classification support?

Upstage Document Classification is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Upstage Document Classification?

It is built and maintained by Upstage Deployment (@upstage-deployment); the current version is v1.0.0.
