← Back to Skills Marketplace
zwcih

Azure Content Understanding Layout

by zwcih · GitHub ↗ · v1.3.0 · MIT-0
cross-platform ✓ Security Clean
174
Downloads
0
Stars
0
Active Installs
5
Versions
Install in OpenClaw
/install azure-content-layout
Description
Extract document structure, text, tables, and figures from documents using Azure Content Understanding prebuilt-layout analyzer. Converts PDF, images, Office...
README (SKILL.md)

Azure Content Understanding — Layout Analyzer

Extract structured content from documents using Azure's prebuilt-layout analyzer. Outputs Markdown and structured JSON with text, tables, figures, and document hierarchy.

Setup

Set environment variables:

export AZURE_CU_ENDPOINT="https://YOUR_RESOURCE.services.ai.azure.com/"
export AZURE_CU_API_KEY="YOUR_KEY_HERE"

Optional: set API version (defaults to 2025-05-01-preview):

export AZURE_CU_API_VERSION="2025-11-01"

Quick Usage

Analyze a URL and print Markdown

node scripts/analyze.mjs --url "https://example.com/document.pdf"

Analyze a local file (pipe via stdin)

cat invoice.pdf | node scripts/analyze.mjs --stdin --markdown output.md --output result.json

Save both Markdown and full JSON

node scripts/analyze.mjs --url "https://example.com/report.pdf" \
  --markdown report.md \
  --output report.json

Direct API Call

When the script isn't available, use curl:

# Submit analysis (preview API)
curl -s -X POST "$AZURE_CU_ENDPOINT/contentunderstanding/analyzers/prebuilt-layout:analyze?api-version=2025-05-01-preview" \
  -H "Ocp-Apim-Subscription-Key: $AZURE_CU_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com/doc.pdf"}'

# Response includes Operation-Location header — poll that URL for results

For GA API (2025-11-01), the body format changes:

{"inputs": [{"url": "https://example.com/doc.pdf"}]}

Output

Markdown

The analyzer produces GitHub Flavored Markdown preserving:

  • Headings (h1–h6)
  • Tables (as HTML \x3Ctable> blocks)
  • Selection marks (☒ checked, ☐ unchecked)
  • Figures (with references)
  • Paragraphs with reading order

Structured JSON

The full result includes detailed per-element data:

  • pages — dimensions, word/line counts per page
  • paragraphs — text blocks with bounding regions and semantic roles
  • tables — cells with row/column spans
  • figures — detected images/charts with bounding regions
  • sections — hierarchical document structure

Supported Formats

PDF, JPEG, PNG, BMP, TIFF, HEIF, DOCX, XLSX, PPTX, HTML

Best Practices

  • Async operation — the API returns 202; poll Operation-Location for results
  • Poll interval — 3 seconds is reasonable; results typically arrive in 5–60 seconds
  • Large documents — up to 2,000 pages supported; processing time scales linearly
  • File upload — use Content-Type: application/octet-stream with binary body
  • Tables — rendered as HTML in markdown for complex layouts (merged cells, etc.)

API Reference

See references/api.md for full request/response details.

Usage Guidance
This appears to be a straightforward Azure document-layout helper. Before installing: 1) Only provide AZURE_CU_API_KEY and AZURE_CU_ENDPOINT that you trust — the key grants access to your Azure resource. Prefer a least-privilege or short-lived key if available. 2) Be aware that sending a document or a URL sends data to Azure (URL mode causes Azure to fetch the URL). Do not send sensitive documents unless you accept that. 3) The script runs locally with Node; review or run it on a trusted host. 4) The metadata marks AZURE_CU_API_VERSION as required but the script will default it — you can safely omit it. 5) If you need higher assurance, call the Azure API directly (curl or official SDK) or inspect/execute the included analyze.mjs locally rather than granting broad agent-level access.
Capability Analysis
Type: OpenClaw Skill Name: azure-content-layout Version: 1.3.0 The skill provides a legitimate interface for Azure Content Understanding's Layout Analyzer, allowing users to extract structured content from documents. The Node.js script (scripts/analyze.mjs) correctly implements the asynchronous polling pattern required by the Azure API and handles both URL-based and local file inputs without any evidence of malicious behavior, data exfiltration, or prompt injection.
Capability Assessment
Purpose & Capability
Name/description, SKILL.md, and scripts/analyze.mjs all implement calls to Azure Content Understanding (prebuilt-layout) and only request the Azure endpoint, API key, and an API version. No unrelated services, binaries, or credentials are required.
Instruction Scope
Runtime instructions and the script only read stdin or a user-supplied URL and call the Azure analyzer. Important: when you supply a URL, the Azure service will fetch that URL (so the remote host and Azure will see it). The SKILL.md metadata lists AZURE_CU_API_VERSION as required but the code provides a default value — minor inconsistency.
Install Mechanism
No install spec; this is an instruction-only skill plus a single Node script (analyze.mjs). Nothing is downloaded from external sites and no installers run — low install risk. The script requires a Node runtime to execute.
Credentials
Requiring AZURE_CU_ENDPOINT and AZURE_CU_API_KEY is proportional and expected. AZURE_CU_API_VERSION is declared required in metadata but the script defaults it if unset — the extra 'required' declaration is unnecessary but not harmful. PrimaryEnv correctly set to the API key.
Persistence & Privilege
always:false and the skill does not request persistent system modifications or modify other skills' configs. Autonomous invocation is allowed but is the platform default and not in itself a concern here.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install azure-content-layout
  3. After installation, invoke the skill by name or use /azure-content-layout
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.3.0
Remove direct file read from script; use stdin pipe instead to avoid file-read+network-send pattern flagged by static analysis
v1.2.0
Declare AZURE_CU_API_VERSION in requires.env for full credential transparency
v1.1.0
Add metadata.openclaw.requires.env and primaryEnv declarations for AZURE_CU_ENDPOINT and AZURE_CU_API_KEY
v1.0.1
Fix: remove patterns that triggered false-positive suspicious content flag
v1.0.0
Initial release: document layout analysis with markdown and JSON output via Azure Content Understanding prebuilt-layout analyzer
Metadata
Slug azure-content-layout
Version 1.3.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 5
Frequently Asked Questions

What is Azure Content Understanding Layout?

Extract document structure, text, tables, and figures from documents using Azure Content Understanding prebuilt-layout analyzer. Converts PDF, images, Office... It is an AI Agent Skill for Claude Code / OpenClaw, with 174 downloads so far.

How do I install Azure Content Understanding Layout?

Run "/install azure-content-layout" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Azure Content Understanding Layout free?

Yes, Azure Content Understanding Layout is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Azure Content Understanding Layout support?

Azure Content Understanding Layout is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Azure Content Understanding Layout?

It is built and maintained by zwcih (@zwcih); the current version is v1.3.0.

💬 Comments