/install information-extraction
Information Extraction
Extract entity, relation, attribute, and event information from text into a normalized intermediate structure, then export triples in JSON, JSONL, or TSV.
Core workflow
- Define extraction scope and output granularity.
- Segment input text into sentences and paragraphs.
- Extract entities with evidence.
- Extract relations, attributes, and events.
- Normalize aliases, predicates, and duplicated records.
- Export triples. Default output is JSON.
- Review ambiguities before treating output as final.
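The segmentation step of the workflow above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the function name `segment` and the regex-based sentence rule are assumptions, and the rule will mis-split abbreviations such as "e.g.".

```python
import re


def segment(text: str) -> list[list[str]]:
    """Split text into paragraphs, then split each paragraph into sentences."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Naive rule (assumption): a sentence ends at '.', '!' or '?' followed by whitespace.
    return [
        [s.strip() for s in re.split(r"(?<=[.!?])\s+", p) if s.strip()]
        for p in paragraphs
    ]
```

Keeping the paragraph grouping makes it easier to attach an evidence span to the sentence it came from later in the pipeline.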
Input scope
Prefer this skill for:
- Plain text strings
- Markdown text
- Text copied from webpages, notes, reports, transcripts, or documents
If the user provides a file in another format, convert it to text first, then use this skill.
Output contract
Default output should contain:
{
"triples": [],
"entities": [],
"attributes": [],
"events": [],
"ambiguities": []
}
Supported export formats:
- JSON (default)
- JSONL
- TSV
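To make the three export formats concrete, here is a hedged sketch of serializing the same triples each way. The function `export_triples` below is hypothetical and is not the bundled export_triples.py script; the TSV column order and the `{"triples": [...]}` JSON wrapper are assumptions based on the output contract above.

```python
import csv
import io
import json


def export_triples(triples: list[dict], fmt: str = "json") -> str:
    """Serialize subject/predicate/object dicts as JSON, JSONL, or TSV text."""
    if fmt == "json":
        # One document with a top-level "triples" key (assumed layout).
        return json.dumps({"triples": triples}, ensure_ascii=False, indent=2)
    if fmt == "jsonl":
        # One JSON object per line.
        return "\n".join(json.dumps(t, ensure_ascii=False) for t in triples)
    if fmt == "tsv":
        buf = io.StringIO()
        writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
        writer.writerow(["subject", "predicate", "object"])  # header row (assumed)
        for t in triples:
            writer.writerow([t["subject"], t["predicate"], t["object"]])
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")
```

JSONL suits streaming and large outputs, while TSV drops evidence and confidence unless extra columns are added.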
Extraction principles
- Extract explicit facts before inference.
- Preserve evidence spans for important records.
- Prefer controlled predicates from references/relation-taxonomy.md.
- Keep attributes and events separate internally, even when final output is triples.
- Do not flatten complex events too early.
- Normalize before exporting.
- Record unresolved ambiguity instead of pretending certainty.
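The "normalize before exporting" principle might look like the sketch below, which maps predicate variants onto a controlled form and merges duplicate relations. The alias table and the keep-highest-confidence rule are assumptions for illustration; the real controlled vocabulary lives in references/relation-taxonomy.md.

```python
# Hypothetical alias table; not taken from references/relation-taxonomy.md.
PREDICATE_ALIASES = {"released": "published", "publishes": "published"}


def normalize_relations(relations: list[dict]) -> list[dict]:
    """Canonicalize predicates and drop duplicates, keeping the most confident record."""
    best: dict[tuple, dict] = {}
    for rel in relations:
        predicate = rel["predicate"].strip().lower()
        predicate = PREDICATE_ALIASES.get(predicate, predicate)
        candidate = {**rel, "predicate": predicate}
        key = (rel["subject"], predicate, rel["object"])
        # Assumption: on a duplicate, keep the record with the higher confidence.
        if key not in best or candidate["confidence"] > best[key]["confidence"]:
            best[key] = candidate
    return list(best.values())
```

Each surviving record keeps its own evidence string, so the span backing the merged fact is still available for review.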
Minimal internal schema
Use these record shapes during extraction.
Entity
{
"id": "ent_001",
"mention": "OpenAI",
"canonical_name": "OpenAI",
"type": "Organization",
"evidence": "OpenAI published the GPT-4 Technical Report.",
"confidence": 0.95
}
Relation
{
"subject": "ent_001",
"predicate": "published",
"object": "ent_002",
"evidence": "OpenAI published the GPT-4 Technical Report.",
"confidence": 0.93
}
Attribute
{
"entity_id": "ent_002",
"attribute": "year",
"value": "2023",
"evidence": "The report was released in 2023.",
"confidence": 0.87
}
Event
{
"id": "ev_001",
"type": "Publication",
"trigger": "published",
"participants": {
"agent": "ent_001",
"object": "ent_002"
},
"time": "2023",
"location": null,
"evidence": "OpenAI published the GPT-4 Technical Report in 2023.",
"confidence": 0.92
}
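One way to flatten an Event record into triples only at export time is to treat the event id as the subject of each statement. This is a sketch of a common reification pattern, not the mapping defined in references/triple-mapping.md; the function name `event_to_triples` and the choice of role names as predicates are assumptions.

```python
def event_to_triples(event: dict) -> list[tuple[str, str, str]]:
    """Flatten one Event record into (subject, predicate, object) triples."""
    triples = [(event["id"], "type", event["type"])]
    # Each participant role becomes a predicate pointing at an entity id.
    for role, entity_id in event["participants"].items():
        triples.append((event["id"], role, entity_id))
    # Optional fields are emitted only when present and not null.
    for field in ("time", "location"):
        if event.get(field) is not None:
            triples.append((event["id"], field, event[field]))
    return triples
```

For the Event example above, this yields four triples with subject ev_001 (type, agent, object, time) and omits the null location.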
How to use references
- Read references/pipeline.md for the end-to-end procedure.
- Read references/schema.md for types and intermediate record structure.
- Read references/relation-taxonomy.md before inventing new predicates.
- Read references/triple-mapping.md when exporting final triples.
- Read references/event-modeling.md when text describes complex events.
- Read references/quality-checklist.md before final delivery.
Scripts
Extract
python3 skills/information-extraction/scripts/extract.py --text "OpenAI published GPT-4." --output out.json
Or read from stdin:
echo "OpenAI published GPT-4." | python3 skills/information-extraction/scripts/extract.py --stdin --output out.json
Normalize
python3 skills/information-extraction/scripts/normalize.py --input out.json --output normalized.json
Export triples
python3 skills/information-extraction/scripts/export_triples.py --input normalized.json --format json --output triples.json
python3 skills/information-extraction/scripts/export_triples.py --input normalized.json --format jsonl --output triples.jsonl
python3 skills/information-extraction/scripts/export_triples.py --input normalized.json --format tsv --output triples.tsv
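The three script calls above can be chained from Python. The sketch below only reuses the flags shown in the commands above (--text, --input, --output, --format); the helper names `pipeline_commands` and `run_pipeline` are hypothetical, and the script behavior itself is not verified here.

```python
import subprocess
from pathlib import Path

# Script directory as it appears in the commands above.
SCRIPTS = Path("skills/information-extraction/scripts")


def pipeline_commands(text: str, fmt: str = "json", workdir: str = ".") -> list[list[str]]:
    """Build the extract -> normalize -> export command lines without running them."""
    work = Path(workdir)
    raw = str(work / "out.json")
    normalized = str(work / "normalized.json")
    final = str(work / f"triples.{fmt}")
    return [
        ["python3", str(SCRIPTS / "extract.py"), "--text", text, "--output", raw],
        ["python3", str(SCRIPTS / "normalize.py"), "--input", normalized.replace("normalized", "out"), "--output", normalized],
        ["python3", str(SCRIPTS / "export_triples.py"), "--input", normalized, "--format", fmt, "--output", final],
    ]


def run_pipeline(text: str, fmt: str = "json", workdir: str = ".") -> None:
    """Run each stage in order; check=True stops at the first failing script."""
    for cmd in pipeline_commands(text, fmt, workdir):
        subprocess.run(cmd, check=True)
```

Passing arguments as a list avoids shell quoting problems when the input text contains quotes or spaces.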
Notes on automation
This is a semi-automatic pipeline, not a claim of perfect extraction. The scripts provide scaffolding, normalization, and export. For high-stakes outputs, keep evidence and perform manual review.
- Make sure OpenClaw is installed (local or Docker).
- Run the install command in chat: /install information-extraction
- After installation, invoke the skill by name or use /information-extraction
- Provide the required inputs per the skill's parameter spec and get structured output.
What is Information Extraction?
Information Extraction extracts structured information from unstructured text through a semi-automatic pipeline. It supports entity extraction, relation extraction, attribute extraction... It is an AI Agent Skill for Claude Code / OpenClaw, with 139 downloads so far.
How do I install Information Extraction?
Run "/install information-extraction" in the OpenClaw or Claude Code chat to install it in one step; no extra setup is required.
Is Information Extraction free?
Yes, Information Extraction is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Information Extraction support?
Information Extraction is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Information Extraction?
It is built and maintained by quqxui (@quqxui); the current version is v1.0.0.