
Claw SQLite Knowledge

Author: ernestyu · v1.0.2 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 366

Install in OpenClaw

/install clawsqlite-knowledge

Description

Knowledge base skill that wraps the clawsqlite knowledge CLI for ingest/search/show.

Usage (SKILL.md)

clawsqlite-knowledge (OpenClaw Skill)

clawsqlite-knowledge is a knowledge base Skill built around the PyPI package clawsqlite.

It is a thin wrapper:

- it does not vendor the source code and does not git clone any repository;
- during installation, it installs clawsqlite>=1.0.2 (with a workspace-prefix fallback when the runtime env is not writable);
- during runtime, it operates the knowledge base only through the clawsqlite knowledge ... CLI.

Its main capabilities are grouped into three areas:

- Ingestion
  - ingest from a URL (together with an existing fetch tool such as clawfetch);
  - ingest from a piece of text, an idea, or an excerpt (marked as a local source).
- Retrieval
  - hybrid retrieval (LLM-aware query_refine/query_tags + FTS/vec with automatic downgrade);
  - show a full record by id (including full content).
- Reporting (via underlying CLI)
  - build interest clusters (summary/tag embeddings → interest topics);
  - generate periodic interest reports (Markdown + PNG, optional HTML/PDF) for the current knowledge base, based on previously built interest clusters.

Installation (performed by ClawHub / OpenClaw)

This skill is meant to be installed and run inside an OpenClaw/ClawHub runtime. It assumes:

- Python 3.10+ is available in the Skill runtime environment;
- the environment can access PyPI to install the clawsqlite package;
- the runtime model has access to a workspace directory where this skill lives under skills/clawsqlite-knowledge.

Stage 1 — install the skill shell

Use the OpenClaw CLI to install the skill into your active workspace:

openclaw skills install clawsqlite-knowledge

This step downloads the skill package from ClawHub into:

~/.openclaw/workspace/skills/clawsqlite-knowledge

At this point the directory only contains:

- SKILL.md
- manifest.yaml
- bootstrap_deps.py
- run_clawknowledge.py
- README.md / README_zh.md

The clawsqlite PyPI package itself is not yet guaranteed to be installed.

Stage 2 — install or upgrade `clawsqlite` (PyPI, >=1.0.2)

The second stage is handled by the bootstrap script declared in manifest.yaml:

install:
  - id: clawsqlite_knowledge_bootstrap
    kind: python
    label: Install clawsqlite from PyPI
    script: bootstrap_deps.py

bootstrap_deps.py is intentionally small and auditable. In simplified form:

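import subprocess
import sys

# _workspace_prefix() is defined in bootstrap_deps.py and returns the
# workspace-local fallback prefix (.clawsqlite-venv).
requirement = "clawsqlite>=1.0.2"

# First attempt: install into the interpreter's own environment.
cmd = [sys.executable, "-m", "pip", "install", requirement]
proc = subprocess.run(cmd)
if proc.returncode != 0:
    # Fallback: install under the workspace-local prefix instead.
    prefix = _workspace_prefix()
    subprocess.run([
        sys.executable,
        "-m",
        "pip",
        "install",
        requirement,
        f"--prefix={prefix}",
    ])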

Semantics:

- First, it tries to install clawsqlite>=1.0.2 into the default venv used for the Skill runtime.
- If that fails (e.g. read-only venv), it falls back to a workspace-local prefix:
  ```
  <workspace>/skills/clawsqlite-knowledge/.clawsqlite-venv
  ```
- On success in the prefix, it prints a NEXT: hint describing:
  - where the package is installed; and
  - which site-packages path will be added to PYTHONPATH at runtime.

When you use the OpenClaw CLI, you typically do not need to call bootstrap_deps.py manually; openclaw skills install clawsqlite-knowledge (or a future openclaw skills update ...) will run the install hooks. If you want to force a re-install or upgrade of clawsqlite to 1.0.2+ inside the skill directory, you can run:

cd ~/.openclaw/workspace/skills/clawsqlite-knowledge
python bootstrap_deps.py

Where does the `clawsqlite` CLI live?

Depending on how pip is configured:

- If the first pip install succeeds in the base env, the clawsqlite command and clawsqlite_cli module live in that venv.
- If it falls back to the workspace prefix, clawsqlite is installed under .clawsqlite-venv and the Skill runtime adds its site-packages directory to PYTHONPATH before invoking run_clawknowledge.py.

For advanced users, this means you can also invoke the CLI manually from the same prefix, for example:

cd ~/.openclaw/workspace/skills/clawsqlite-knowledge
PYTHONPATH="$(python - << 'EOF'
from bootstrap_deps import _workspace_prefix, _site_packages
p = _workspace_prefix()
print(_site_packages(p))
EOF
)${PYTHONPATH:+:$PYTHONPATH}" \
  python -m clawsqlite_cli knowledge --help

In normal Skill usage (agents calling the JSON API), you do not need to manage this manually.

Runtime entry

The Skill runtime calls run_clawknowledge.py. This script:

- reads a JSON payload from stdin;
- routes by the action field to the matching handler;
- calls python -m clawsqlite_cli knowledge ... to perform the actual operation;
- writes the result JSON back to stdout.
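
The steps above can be pictured as a small dispatcher. The sketch below is illustrative only: `handle_show`, the `HANDLERS` table, and the `unknown_action` error kind are assumptions made for this example, not the actual contents of run_clawknowledge.py, and the handler here does not call the CLI.

```python
import io
import json
import sys


def handle_show(payload):
    # Hypothetical handler: the real one would call the clawsqlite CLI.
    return {"ok": True, "data": {"id": payload["id"]}}


# Hypothetical routing table keyed by the "action" field.
HANDLERS = {"show": handle_show}


def dispatch(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON payload, route it by action, write one JSON result."""
    payload = json.load(stdin)
    handler = HANDLERS.get(payload.get("action"))
    if handler is None:
        # "unknown_action" is an assumed error_kind value.
        result = {"ok": False, "error_kind": "unknown_action"}
    else:
        result = handler(payload)
    json.dump(result, stdout)
```

Passing `io.StringIO` objects for `stdin` and `stdout` makes the routing easy to exercise without a real process.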

All CLI calls are centralized in one function, which also injects the workspace-prefix site-packages path into PYTHONPATH when present so that the fallback installation works transparently.
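
One way such an injection could work is sketched below. The function name `build_cli_env` and its parameters are hypothetical; the real runtime may build its environment differently.

```python
import os


def build_cli_env(site_packages=None, base_env=None):
    """Return a copy of the environment with site_packages prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    if site_packages:
        existing = env.get("PYTHONPATH", "")
        # os.pathsep is ":" on POSIX and ";" on Windows.
        env["PYTHONPATH"] = (
            site_packages + os.pathsep + existing if existing else site_packages
        )
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` so that only the child process sees the extra path.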

If the underlying CLI emits NEXT: hints, this runtime surfaces them as a structured next array in the JSON response. On failure, it also includes an error_kind field for quick classification.
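
As a rough illustration of how NEXT: hints might be lifted into a `next` array, the sketch below assumes each hint occupies one line beginning with `NEXT:`. The function name is hypothetical, and which output stream the CLI uses for hints is not specified here.

```python
def split_next_hints(cli_output):
    """Separate lines starting with 'NEXT:' from the rest of the CLI output."""
    hints, other = [], []
    for line in cli_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("NEXT:"):
            # Keep only the hint text after the prefix.
            hints.append(stripped[len("NEXT:"):].strip())
        else:
            other.append(line)
    return hints, "\n".join(other)
```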

Supported actions

1. `ingest_url`

Ingest an article from a URL. The actual fetching logic is determined by the environment variable CLAWSQLITE_SCRAPE_CMD (recommended: the clawfetch CLI). This Skill does not fetch web pages directly.

Example payload:

{
  "action": "ingest_url",
  "url": "https://mp.weixin.qq.com/s/UzgKeQwWWoV4v884l_jcrg",
  "title": "WeChat article: Ground Station project",   // optional
  "category": "web",                                   // optional (default: web)
  "tags": "wechat,ground-station",                     // optional
  "gen_provider": "openclaw",                          // optional: openclaw|llm|off (default: openclaw)
  "root": "/data/clawsqlite-knowledge"                 // optional storage directory
}

Behavior:

- calls clawsqlite knowledge ingest --url ...;
- by default uses provider=openclaw:
  - generates a long summary with heuristics (first ~800 characters, cut by sentence/paragraph boundaries);
  - generates tags with jieba or a lightweight algorithm (in 0.1.4 these are backed by the new keyword/semantic pipelines);
  - if embedding configuration is complete, generates an embedding for the long summary and stores it in the vec table;
- filenames use pinyin plus an English slug for easier cross-platform storage;
- the database keeps the original title and source_url.

Returns:

{
  "ok": true,
  "data": { "id": 1, "title": "...", "local_file_path": "...", ... }
}
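
For callers driving the runtime from Python, a minimal client sketch could look like the following. It assumes only what is described above (JSON in on stdin, JSON out on stdout, with `ok`, `data`, and `error_kind` fields); the helper names and the default script path are hypothetical.

```python
import json
import subprocess
import sys


def build_payload(action, **fields):
    """Build a request payload, leaving out optional fields set to None."""
    payload = {"action": action}
    payload.update({k: v for k, v in fields.items() if v is not None})
    return payload


def parse_response(stdout_text):
    """Return the data field on success; raise with error_kind on failure."""
    response = json.loads(stdout_text)
    if not response.get("ok"):
        raise RuntimeError(response.get("error_kind", "unknown"))
    return response.get("data")


def call_skill(payload, script="run_clawknowledge.py"):
    """Send one payload to the runtime script over stdin and parse its reply."""
    proc = subprocess.run(
        [sys.executable, script],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
    )
    return parse_response(proc.stdout)
```

Dropping `None` values keeps optional fields such as `title` or `root` out of the payload, so the runtime's own defaults apply.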

2. `ingest_text`

Ingest a piece of text, an idea, or an excerpt, marked as a local source.

Example payload:

{
  "action": "ingest_text",
  "text": "Today I had an idea about a web scraping architecture...",
  "title": "Notes on web scraping architecture",   // optional; auto-generated if omitted
  "category": "idea",                              // optional (default: note)
  "tags": "crawler,architecture",                  // optional
  "gen_provider": "openclaw",                      // optional
  "root": "/data/clawsqlite-knowledge"             // optional storage directory
}

Behavior:

- calls clawsqlite knowledge ingest --text ...;
- generates long summary, tags, and embedding the same way as in the URL case, depending on configuration;
- source_url will be Local;
- filenames use pinyin / English slug for easier cross-platform handling.

3. `search`

Search the knowledge base using the full clawsqlite>=1.0.2 search pipeline (query_refine/query_tags + FTS/vec hybrid), with automatic downgrade when embeddings or vec0 are not available.

Example payload:

{
  "action": "search",
  "query": "web scraping architecture",
  "mode": "hybrid",               // optional: hybrid|fts|vec (default: hybrid)
  "topk": 10,                     // optional
  "category": "idea",             // optional
  "tag": "crawler",               // optional
  "include_deleted": false,       // optional
  "root": "/data/clawsqlite-knowledge"   // optional storage directory
}

Behavior (high level):

- Calls clawsqlite knowledge search ... with --json and forwards filters.
- Uses the new four-mode capability model inside clawsqlite:
  - Mode1: LLM + Embedding → query_refine + query_tags from SMALL_LLM, content/tag vectors + FTS + lexical tags.
  - Mode2: LLM + no Embedding → LLM-based query_refine/query_tags + FTS + lexical tags.
  - Mode3: no LLM + Embedding → heuristic query_refine/query_tags + content vectors + tag vectors + FTS/lexical tags.
  - Mode4: no LLM + no Embedding → heuristic query_refine/query_tags + FTS + lexical tags only.
- In all modes, natural-language queries are converted into:
  - query_refine: a single, search-friendly sentence;
  - query_tags: a small set of keywords (length controlled by CLAWSQLITE_SEARCH_QUERY_TAG_MIN/MAX).
- When embeddings are available, the search scorer uses:
  - content vectors (summary-based) for semantic similarity;
  - tag vectors + lexical tag matches as a tag channel;
  - FTS rank for BM25-like keyword matching;
  - priority and recency as light biases.
- Tag scoring is split into semantic and lexical parts, controlled by CLAWSQLITE_TAG_VEC_FRACTION and CLAWSQLITE_TAG_FTS_LOG_ALPHA.
- Final scores are a weighted sum of these channels, with per-mode default weights tunable via CLAWSQLITE_SCORE_WEIGHTS_MODE1..4 (and legacy CLAWSQLITE_SCORE_WEIGHTS*).
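
To make the mode table and the weighted sum concrete, here is a small sketch. It is not clawsqlite's scorer: the function names, the channel names (`vec`, `fts`, `tag`), and the weights are invented for illustration, and the real defaults are not reproduced here.

```python
def select_mode(has_llm, has_embedding):
    """Map the two capabilities to the four modes described above."""
    if has_llm and has_embedding:
        return 1
    if has_llm:
        return 2
    if has_embedding:
        return 3
    return 4


def weighted_score(components, weights):
    """Weighted sum over channels; channels without a weight contribute 0."""
    return sum(weights.get(name, 0.0) * value for name, value in components.items())
```

For example, with hypothetical components `{"vec": 0.5, "fts": 1.0}` and weights `{"vec": 0.5, "fts": 0.25}`, the sum is 0.5 * 0.5 + 0.25 * 1.0 = 0.5.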

This skill does not re-implement scoring; it simply forwards the JSON result and lets agents inspect score, score_components, and any next hints surfaced by the underlying CLI.

Returns:

{
  "ok": true,
  "data": [
    {"id": 3, "title": "...", "category": "idea", "score": 0.92, ...},
    ...
  ]
}

4. `show`

Show one record from the knowledge base by id, optionally including full content.

Example payload:

{
  "action": "show",
  "id": 3,
  "full": true,                   // optional, default: true
  "root": "/data/clawsqlite-knowledge"   // optional storage directory
}

Behavior:

- calls clawsqlite knowledge show --id ... --full --json;
- returns full metadata and optional body content (the content field).

FTS/jieba fallback (CJK)

This Skill relies on the underlying clawsqlite CLI for FTS tokenization. When the CJK tokenizer extension libsimple cannot be loaded, clawsqlite can switch to a jieba-based pre-segmentation mode controlled by CLAWSQLITE_FTS_JIEBA=auto|on|off:

- auto (default): only enable when libsimple is unavailable and jieba is installed.
- on: force jieba pre-segmentation even if libsimple is available.
- off: disable jieba pre-segmentation.

In jieba mode, CJK text is segmented with jieba and joined with spaces before being written to the FTS index; queries apply the same normalization, so write/rebuild/query stay consistent.
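
The switch and the pre-segmentation step can be sketched as two small functions. This is an illustration under assumptions, not clawsqlite's implementation: the function names are hypothetical, the segmenter is passed in as `cut` so the example runs without jieba, and for `on` the sketch assumes the caller is responsible for jieba being importable.

```python
def should_use_jieba(setting, libsimple_available, jieba_installed):
    """Decide whether jieba pre-segmentation applies for auto|on|off."""
    if setting == "on":
        return True
    if setting == "off":
        return False
    # "auto": only when libsimple is missing and jieba can be imported.
    return (not libsimple_available) and jieba_installed


def presegment(text, cut):
    """Join the non-blank tokens produced by cut(text) with single spaces."""
    return " ".join(tok for tok in cut(text) if tok.strip())
```

With jieba installed, `cut` could be `jieba.lcut`. Applying the same `presegment` call to both indexed text and queries is what keeps the two sides consistent.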

If you change this setting on an existing DB, rebuild the FTS index:

clawsqlite knowledge reindex --rebuild --fts

Maintenance (CLI only)

This skill intentionally does not expose destructive maintenance actions via its JSON API. To clean up orphan files, old backups, or compact the knowledge database, use the clawsqlite CLI directly from a trusted administrative context, for example:

# Preview maintenance (no deletions)
clawsqlite knowledge maintenance prune \
  --root /data/clawsqlite-knowledge \
  --days 3 \
  --dry-run \
  --json

# Apply maintenance (delete orphans + old backups, then VACUUM)
clawsqlite knowledge maintenance prune \
  --root /data/clawsqlite-knowledge \
  --days 7 \
  --json

Only administrators or scheduled automation should run these commands. Agents using the clawsqlite-knowledge skill have access only to ingestion, retrieval, and show operations.
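
For scheduled automation, the two invocations above could be assembled by a helper like the one below. The function name is hypothetical; the flags are the ones shown in the commands above, and the sketch defaults to a dry run so that deletion has to be requested explicitly.

```python
def prune_command(root, days, dry_run=True):
    """Build the argv list for `clawsqlite knowledge maintenance prune`."""
    cmd = [
        "clawsqlite", "knowledge", "maintenance", "prune",
        "--root", root,
        "--days", str(days),
    ]
    if dry_run:
        # Preview only: no deletions are performed.
        cmd.append("--dry-run")
    cmd.append("--json")
    return cmd
```

The resulting list can be handed to `subprocess.run` without `shell=True`, which avoids quoting issues in the root path.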

Security and auditability

- The Skill depends only on the clawsqlite package from PyPI.
- It does not vendor source code, does not git clone, and does not download extra binaries.
- All knowledge base operations are performed through explicit clawsqlite knowledge ... CLI calls, and their stdout/stderr can be fully audited in logs.

Security recommendations

This skill is a thin wrapper around the public clawsqlite PyPI package and appears internally consistent. Before installing: (1) review and trust the upstream clawsqlite package on PyPI/GitHub; the bootstrap runs pip install, so a compromised package would be consequential; (2) run `clawsqlite knowledge doctor --json` (as recommended) to inspect required paths and embedding/LLM configuration; (3) be aware that embedding providers or scrapers (if you enable them) may require API keys and will contact network endpoints — review and control those env vars (EMBEDDING_*, CLAWSQLITE_SCRAPE_CMD) and provider settings before use; (4) if you prefer, run the bootstrap step manually to see what pip installs and whether the workspace prefix is used.

Functional analysis

Type: OpenClaw Skill
Name: clawsqlite-knowledge
Version: 1.0.2

The clawsqlite-knowledge skill is a legitimate wrapper for the 'clawsqlite' PyPI package, providing knowledge base functionalities such as document ingestion, hybrid search, and interest reporting. The installation process in bootstrap_deps.py and the runtime logic in run_clawknowledge.py follow standard practices for managing dependencies and executing CLI commands via subprocess without shell=True, minimizing injection risks. No evidence of data exfiltration, malicious persistence, or harmful prompt injection was found.

Capability assessment

✓ Purpose & Capability

The skill's name/description match what it installs and runs: it bootstraps the clawsqlite PyPI package and invokes `python -m clawsqlite_cli knowledge ...`. Required binary (python) and the pip install of clawsqlite are proportional to a CLI wrapper. Minor doc-version text differences (0.1.x vs 1.0.2) are present but not material.

✓ Instruction Scope

SKILL.md and run_clawknowledge.py instruct only to run the clawsqlite knowledge CLI and forward JSON payloads (ingest/search/show/report). The runtime does not read unrelated system files or environment secrets. It may use configured scraper or embedding providers (per clawsqlite settings) which is expected for this skill.

✓ Install Mechanism

Installation uses a small bootstrap script that runs pip to install `clawsqlite>=1.0.2`, with a workspace-prefix fallback. This is a standard PyPI install (moderate risk but expected); no downloads from arbitrary URLs or archive extraction are present.

ℹ Credentials

The skill itself declares no required env vars or credentials. It reads/processes the environment only to add a workspace site-packages path to PYTHONPATH. However, functionality (embeddings, scrapers, small LLMs) depends on the external clawsqlite configuration (EMBEDDING_*, CLAWSQLITE_SCRAPE_CMD, etc.) that may require API keys or network access — this is expected but you should review embedding/scraper configuration before enabling.

✓ Persistence & Privilege

The skill does not request always:true and will not auto-enable itself globally. It writes a workspace-local prefix (.clawsqlite-venv) if pip falls back, which is normal for a skill installing a Python package; it does not modify other skills or system-wide configs.

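How to use

Make sure OpenClaw is installed (locally or via Docker).
In the chat box, enter the install command: /install clawsqlite-knowledge
Once installation completes, invoke the Skill by name or trigger it with /clawsqlite-knowledge
Provide the required inputs according to the Skill's parameter descriptions to get structured output.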

Version history

v1.0.2

Version 1.0.2
- Updated dependency requirement to install clawsqlite>=1.0.2.
- Added a "first_run" section to metadata, instructing users to run `clawsqlite knowledge doctor --json` and check runtime configuration before use.
- Updated SKILL.md and documentation to reflect the new minimum dependency.
- Removed the unused ENV_EXAMPLE.md file.
- Other minor updates and clarifications in documentation files.

v1.0.0

clawsqlite-knowledge 1.0.0
- Major release with version bump to 1.0.0 and upgrade to require clawsqlite>=1.0.0.
- Improved retrieval: now supports LLM-aware query_refine/query_tags and enhanced hybrid fallback (FTS/vector/semantic).
- Adds support for building interest clusters and more advanced report generation.
- Updated installation scripts and documentation to reflect new minimum dependencies and features.
- README and SKILL.md updated for new usage patterns and action documentation.

v0.1.13

clawsqlite-knowledge 0.1.13
- Minimum `clawsqlite` dependency bumped to `>=0.1.8` (was `>=0.1.7`).
- Adds support in documentation and metadata for a new reporting capability: generate periodic interest reports (Markdown + PNG, optionally HTML/PDF) for current knowledge base.
- Updates description of main capabilities to include reporting.
- No API or CLI argument changes; all existing ingestion and retrieval actions remain the same.
- Documentation and install/upgrade instructions updated to match new dependency version.

v0.1.12

clawsqlite-knowledge 0.1.12
- Updated dependency: now requires and installs clawsqlite>=0.1.7.
- Documentation revised to reference the new minimum version.
- No functional changes to skill workflow or interface.

v0.1.11

- Bumped required clawsqlite version to >=0.1.6 for improved compatibility.
- Updated installation instructions and bootstrap script to reference the new version.
- Documentation refreshed in SKILL.md, README.md, and README_zh.md to clarify installation requirements and supported actions.

v0.1.10

clawsqlite-knowledge 0.1.10
- Updated minimum required `clawsqlite` version to 0.1.5 (was 0.1.4).
- Updated all documentation and install scripts to require and reference `clawsqlite>=0.1.5`.
- No functional changes to skill API or action handling.

v0.1.9

clawsqlite-knowledge 0.1.9
- Minimum required `clawsqlite` version increased to 0.1.4 (was 0.1.2).
- Updated installation process: documentation clarifies two-stage install (skill shell, then PyPI dependency), and fallback to workspace-local install is improved.
- Documentation updated for clarity on installation, CLI locations, and runtime behavior.
- README and skill docs revised to match new dependency/version requirements and workflows.
- No changes to skill API or runtime entrypoint; usage is unchanged for existing actions.

v0.1.8

- Bump required clawsqlite version to >=0.1.2 for improved dependency compatibility.
- Update documentation to reflect the new minimum version requirement.
- No runtime or API behavior changes.

v0.1.7

clawsqlite-knowledge 0.1.7
- Updated minimum required version of `clawsqlite` from 0.1.0 to 0.1.1.
- Added ENV_EXAMPLE.md with environment variable explanations.
- Clarified and expanded documentation for FTS/jieba fallback and environment configuration.
- Maintenance operations are no longer documented as available in the JSON API; users are now directed to use the CLI for destructive actions.
- General documentation and dependency updates across README and supporting files.

v0.1.6

clawsqlite-knowledge 0.1.6
- Added fallback installer: if pip install fails (e.g. unwritable env), auto-install clawsqlite with a workspace-local prefix.
- At runtime, auto-add the fallback prefix site-packages to PYTHONPATH when needed for CLI invocation.
- Runtime now forwards "NEXT:" hints from the CLI as a structured `next` array in the output.
- Failures now include an `error_kind` field for quick error classification.
- Documentation improvements and clarification of fallback/auto-prefix behavior in installation and runtime sections.

v0.1.4

clawsqlite-knowledge 0.1.4
- Maintenance actions (delete orphans, backups, VACUUM) are no longer exposed via the JSON API; users must run them manually via the CLI.
- The documentation (SKILL.md) was updated to clarify the scope: only ingestion, retrieval, and show operations are available through the skill API.
- Skill API is now safer: no destructive maintenance operations from agents; only trusted users/automation can run these via CLI.
- No changes to installation or ingestion/search behaviors.

v0.1.3

clawsqlite-knowledge 0.1.3
- Updated SKILL.md to clarify that all read/write file operations are performed by the underlying CLI under the specified storage root.
- Adjusted documentation examples and wording to use a dedicated knowledge directory (e.g., /data/clawsqlite-knowledge).
- Removed guarantee that the skill never writes outside the working directory; now specifies caller controls storage location via the payload.
- No changes to skill API or runtime logic.

v0.1.2

clawsqlite-knowledge 0.1.2
- SKILL.md has been fully translated from Chinese to English for broader accessibility.
- No changes to logic or interfaces; documentation only.
- Version updated in both SKILL.md and manifest.yaml.

v0.1.1

clawsqlite-knowledge v0.1.1
- Initial release as a thin knowledge base wrapper for the `clawsqlite` CLI.
- Supports URL and text ingestion, with automatic long summary, tagging, and optional vector embedding.
- Provides hybrid, FTS, or vector search with filtering by category and tag.
- Includes record inspection, preview and application of maintenance (GC/backup/VACUUM), and safe/auditable operation.
- Installs `clawsqlite` from PyPI; does not vendor or clone source; all actions routed via dedicated runtime script.

Metadata

Slug: clawsqlite-knowledge

Version: 1.0.2

License: MIT-0

Total installs: 0

Current installs: 0

Version count: 14
