← 返回 Skills 市场

Knowledge Graph - Csv Graph Loader Generator

Name: Knowledge Graph - Csv Graph Loader Generator
Author: fisa712

作者 Muhammad Asif · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ 安全检测通过

总下载

当前安装

版本数

在 OpenClaw 中安装

/install csv-graph-loader-generator

功能描述

Generate graph database loaders and triple mappings from CSV datasets. Converts tabular CSV data into graph-ready nodes, edges, and triples for graph databas...

使用说明 (SKILL.md)

CSV Graph Loader Generator

Convert CSV datasets into graph-ready structures for knowledge graph construction.

This skill transforms tabular CSV data into graph database ingestion formats such as nodes, edges, or triples. It generates mappings and loader scripts for graph systems like Neo4j, RDF triple stores, property graphs, and knowledge graphs.

Quick Start

Use When

Converting CSV datasets to graph structures
Preparing data for graph database import
Generating Neo4j LOAD CSV scripts
Creating RDF triple mappings
Building property graphs from tables
Constructing knowledge graphs from structured data
Automating CSV → graph ETL workflows

Inputs

CSV file or data
Column definitions and data types
Entity type specifications
Relationship definitions
Target graph format
Custom mapping rules

Outputs

Node definitions (labels, properties)
Relationship mappings (types, directions)
Neo4j Cypher import scripts
RDF triple definitions
Property graph JSON
Graph schema definitions
Mapping configuration files

Example

Input CSV:

person_id,name,company_name,company_industry,job_title
1,Alice Johnson,Acme Corp,Technology,Software Engineer
2,Bob Smith,Acme Corp,Technology,Product Manager
3,Carol Davis,TechStart Inc,Technology,CTO

Generated Output (Nodes & Edges):

{
  "nodes": [
    {"id": "p1", "type": "Person", "properties": {"name": "Alice Johnson", "job": "Software Engineer"}},
    {"id": "p2", "type": "Person", "properties": {"name": "Bob Smith", "job": "Product Manager"}},
    {"id": "c1", "type": "Company", "properties": {"name": "Acme Corp", "industry": "Technology"}},
    {"id": "c2", "type": "Company", "properties": {"name": "TechStart Inc", "industry": "Technology"}}
  ],
  "edges": [
    {"source": "p1", "target": "c1", "type": "WORKS_AT"},
    {"source": "p2", "target": "c1", "type": "WORKS_AT"},
    {"source": "p3", "target": "c2", "type": "WORKS_AT"}
  ]
}

CSV-to-Graph Transformation Strategy

1. Entity Detection

Automatically identify entity columns:

ID columns - Unique identifiers (person_id, company_id)
Name columns - Entity names (name, title, company_name)
Category columns - Entity types (type, category, industry)

2. Entity Classification

Create node types from detected entities:

person_id → Person node
company_name → Company node
department → Department node

3. Relationship Inference

Map column associations to relationships:

person_id + company_name → WORKS_AT
employee_id + manager_id → REPORTS_TO
product_id + category_id → BELONGS_TO

4. Property Assignment

Remaining columns become node properties:

name → Person.name
salary → Person.salary
industry → Company.industry

5. Output Generation

Create loader scripts for target system:

Neo4j: LOAD CSV WITH HEADERS...
RDF: :Alice rdf:type :Person
JSON: {"nodes": [...], "edges": [...]}

Supported Entity Patterns

ID-Based Entities

pattern: column_name contains "id" or "identifier"
example: person_id, user_id, company_id

Name-Based Entities

pattern: column_name contains "name" or "title"
example: person_name, company_name, job_title

Category-Based Entities

pattern: column_name contains "type" or "category"
example: person_type, company_category, product_category

Implicit Entities

pattern: column values represent entities
example: department column with values like "Sales", "Engineering"

Output Formats

Neo4j Cypher Script

LOAD CSV WITH HEADERS FROM 'file:///employees.csv' AS row
MERGE (p:Person {id: row.person_id})
SET p.name = row.name, p.job_title = row.job_title
MERGE (c:Company {name: row.company_name})
SET c.industry = row.company_industry
MERGE (p)-[:WORKS_AT]->(c)

RDF Triples (Turtle)

@prefix ex: \x3Chttp://example.org/> .

ex:person1 a ex:Person ;
  ex:name "Alice Johnson" ;
  ex:jobTitle "Software Engineer" ;
  ex:worksAt ex:acme_corp .

ex:acme_corp a ex:Company ;
  ex:name "Acme Corp" ;
  ex:industry "Technology" .

Property Graph JSON

{
  "nodes": [
    {"id": "p1", "type": "Person", "properties": {"name": "Alice", "job": "Engineer"}},
    {"id": "c1", "type": "Company", "properties": {"name": "Acme", "industry": "Tech"}}
  ],
  "edges": [
    {"source": "p1", "target": "c1", "type": "WORKS_AT"}
  ]
}

CSV to Node/Edge Tables

# nodes.csv
id,type,name,job_title
p1,Person,Alice,Software Engineer
p2,Person,Bob,Product Manager
c1,Company,Acme Corp,

# edges.csv
source,target,type
p1,c1,WORKS_AT
p2,c1,WORKS_AT

Mapping Strategies

Automatic Detection

pros: No manual configuration needed
cons: May infer wrong relationships
use: Quick prototyping, simple datasets

Semi-Automated with Hints

pros: Balance between automation and control
cons: Requires some input
use: Most common production use case

Explicit Mapping

pros: Full control, exact desired output
cons: Requires complete configuration
use: Complex schemas, strict requirements

Data Type Inference

The loader automatically infers:

String - Text columns
Integer - Numeric whole numbers
Float - Decimal numbers
Boolean - True/false values
DateTime - Date and time formats
Reference - Columns pointing to other entities

Duplicate Handling

Merge Strategy

Identical entities across rows are merged
example: Two rows with "Acme Corp" → one Company node

Deduplication

Removes duplicate edges from same source/target
example: Multiple WORKS_AT edges → single edge

ID-Based Deduplication

Uses unique identifiers to prevent duplicates
example: person_id as stable key for Person nodes

Execution Steps

Parse CSV – Read file and validate structure
Analyze Schema – Detect data types and patterns
Detect Entities – Identify entity columns
Infer Relationships – Map column associations
Create Schema – Define node types and relationship types
Generate Mappings – Create column-to-property mappings
Validate Data – Check for data quality issues
Generate Loaders – Output scripts for target system

Recommended Libraries

CSV Processing: pandas, csv, polars
Graph Generation: neo4j, rdflib, networkx
Data Validation: pydantic, jsonschema
RDF/OWL: rdflib, owlready2, sparql-client
JSON Transformation: jsonschema, jinja2

Best Practices

✓ Use stable, meaningful identifiers
✓ Normalize entity names to prevent duplicates
✓ Define explicit entity types rather than guessing
✓ Validate data before loading to graph database
✓ Document custom mapping rules
✓ Test with sample data first
✓ Monitor for duplicate nodes or relationships
✓ Keep mapping configurations version-controlled
✓ Handle missing values explicitly
✓ Implement referential integrity checks

Integration with Downstream Skills

The generated graph data feeds into:

Graph Query Optimization – Optimize generated Cypher queries
Schema Validation – Validate against graph schema
Graph Constraint Generator – Define constraints for loaded data
Knowledge Graph Construction – Build KGs from CSV sources
ETL Pipeline Generator – Orchestrate full CSV → Graph workflows

References

See loader-patterns.md for detailed CSV loader patterns and example-loaders.md for complete domain-specific examples.

Version: 1.0.0

安全使用建议

Review generated Neo4j loaders, RDF mappings, and import configs before running them, especially against production databases. Use least-privilege database credentials, confirm target paths and database names, and prefer a test database or backup when importing data.

能力评估

✓ Purpose & Capability

The noted capability to generate Neo4j import scripts, RDF mappings, and loader outputs fits a graph-data or knowledge-graph workflow. The supplied evidence does not show the skill secretly executing those outputs, exfiltrating data, or doing unrelated actions.

ℹ Instruction Scope

The scanner-supported concern is that the skill should more clearly warn users that generated loaders and mappings may write to graph databases, create files, or alter schemas when users run them.

✓ Install Mechanism

No concerning install behavior was supplied or found in the available workspace context, and VirusTotal engine telemetry was clean.

✓ Credentials

Generating graph import artifacts is proportionate to the stated graph-system purpose. Any database mutation appears to depend on the user later executing generated scripts against their own target systems.

✓ Persistence & Privilege

No evidence was provided of background persistence, privilege escalation, credential harvesting, automatic database execution, or hidden long-running behavior.

如何使用

确保已安装 OpenClaw（本地或 Docker 部署）
在对话框中输入安装命令：/install csv-graph-loader-generator
安装完成后，直接呼叫该 Skill 的名称或使用 /csv-graph-loader-generator 触发
根据 Skill 的参数说明提供必要输入，即可获得结构化输出

版本历史

v1.0.0

- Initial release of CSV Graph Loader Generator. - Converts tabular CSV datasets into graph-ready nodes, edges, and triples for graph and knowledge databases. - Generates loader scripts and mapping files for systems like Neo4j (Cypher), RDF triple stores, and property graphs (JSON). - Supports automatic, semi-automated, and explicit mapping strategies for flexible entity and relationship extraction. - Handles data type inference, duplicate merging, and configurable mapping rules. - Provides best practices and recommendations for robust CSV-to-graph workflows.

元数据

Slug csv-graph-loader-generator

版本 1.0.0

许可证 MIT-0

累计安装 1

当前安装数 1

历史版本数 1

常见问题