Description

Long-term semantic memory layer for AI agents, built on the Datris Platform via MCP. Local markdown files (MEMORY.md, memory/*.md) are the source of truth; D...

Usage instructions (SKILL.md)

Datris memory layer

Name: Datris Memory
Author: datris

Datris is the long-term semantic memory layer. Local memory files are the source of truth. Datris is rebuilt from them — never the other way around.

Prerequisites

This skill assumes a local Datris install:

Datris Platform (includes the MCP server) — Docker-based, see docs.datris.ai/installation. The MCP server runs at http://localhost:3000/sse by default; override with MCP_SERVER_URL.
Datris CLI — brew tap datris/tap && brew install datris. See docs.datris.ai/cli for command reference.

Confirm both are working before relying on this skill: datris health pings every backend service the memory pipeline needs.
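As a minimal sketch of the endpoint override described above, the snippet below resolves the MCP URL from the environment and falls back to the documented default. It assumes the override is read from the process environment; `resolve_mcp_url` is a hypothetical helper name, not part of Datris.

```python
import os

# Default endpoint as stated in the prerequisites above.
DEFAULT_MCP_URL = "http://localhost:3000/sse"


def resolve_mcp_url(env=None):
    """Return MCP_SERVER_URL if it is set and non-empty, else the local default.

    `env` is any mapping; it defaults to os.environ. (Hypothetical helper.)
    """
    env = os.environ if env is None else env
    return env.get("MCP_SERVER_URL") or DEFAULT_MCP_URL
```

Logging the resolved URL before the first sync makes it easy to confirm that memory files are going to the server you expect.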

Rules

Prefer Datris MCP tools over the CLI for memory operations. The CLI is a fallback for the cases listed under "When to use the CLI" below.
One file in = one upload_data call. Never concatenate, bundle, or consolidate multiple memory files into a single corpus upload. Each file's name is its provenance — it must round-trip cleanly into Datris and back out in retrieval results. Bundling looks like an optimization and is not one: it permanently breaks provenance, blocks per-file incremental sync, and makes resets harder.
Upload each file as-is. Let Datris chunk server-side. Do not pre-chunk. Do not use a document tap for local files. The only time to split a single file is when that one file genuinely exceeds the upload limit; in that case, split it with explicit provenance markers in the chunk filenames (e.g. MEMORY.md.part1, MEMORY.md.part2).
Upload files in parallel. Each upload_data call returns a job token immediately — fire all uploads first, then poll. Do not wait for one job to finish before starting the next.
Poll job status to completion before claiming any ingestion succeeded. Polling is for verification, not for gating subsequent uploads.
Verify retrieval with a semantic search after every ingestion run.
Memory pipelines must target a vector destination (pgvector, Qdrant, Weaviate, Milvus, or Chroma).
The embedding model is pinned at pipeline-creation time. Vector dimensions cannot change after the fact. Confirm the embedder matches the pipeline before ingesting. Switching embedders means dropping and recreating the destination collection.
Place every created resource in the openclaw data catalog by default. Pipelines, taps, secrets, destination collections — anything the agent creates in Datris on OpenClaw's behalf goes in the catalog named openclaw unless the user explicitly directs otherwise. This keeps OpenClaw's footprint cleanly separated from other Datris workloads on the same instance, makes cleanup and auditing trivial, and lets multiple OpenClaw users share an instance without colliding. If an existing resource the agent wants to reuse lives in a different catalog, do not migrate it silently — surface the mismatch to the user and ask before continuing.
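The one exception in the rules above, splitting a single oversized file, could be sketched as follows. This is an illustration only: the actual upload limit is not stated here, so `max_bytes` is an assumed parameter, and `split_with_provenance` is a hypothetical helper. It splits on line boundaries and names the parts with the `.partN` suffix used in the example above.

```python
def split_with_provenance(filename, text, max_bytes):
    """Split one oversized file into parts named <filename>.partN.

    A file that already fits is returned unchanged under its real name.
    A single line longer than max_bytes is kept whole in its own part,
    so callers should still check part sizes before uploading.
    """
    if len(text.encode("utf-8")) <= max_bytes:
        return [(filename, text)]

    parts, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode("utf-8"))
        # Close the current part when adding this line would exceed the limit.
        if current and size + line_size > max_bytes:
            parts.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        parts.append("".join(current))

    return [(f"{filename}.part{i}", part) for i, part in enumerate(parts, start=1)]
```

Each returned pair still maps to exactly one upload_data call, so the one-file-per-upload rule holds for the parts as well.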

When to use the CLI

MCP tools are the default. Reach for the datris CLI in these specific situations:

Health checks during bootstrap or troubleshooting. datris health confirms every backend service is up. Use it as a more thorough cross-check when MCP check_service_health returns ambiguous results, or when diagnosing a stuck ingestion.
MCP server unavailable. If the MCP connection is down, datris ingest <file> --dest pgvector and datris search "<query>" --store pgvector --collection <name> are the equivalent fallbacks for ingestion and retrieval. Use these only to keep the user unblocked; restore MCP-based operation as soon as the server is reachable.
Pipeline status when MCP polling stalls. datris status <pipeline> is a clean way to read the latest job state if the MCP get_job_status loop has lost track of which token to follow.
Spot-checks the user runs themselves. When the user wants to verify an ingestion by hand, point them at datris search against the destination collection rather than walking them through MCP tool calls.

Log every CLI invocation in the audit log (memory/<today>.md) the same way an MCP call would be logged; the same provenance discipline applies.
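A small sketch of how an agent might assemble the fallback commands as argument lists (suitable for `subprocess.run` without shell quoting issues) and render them for the audit log. It uses only the subcommands and flags quoted above; the helper names and the idea of logging the rendered command line are assumptions for illustration.

```python
import shlex


def ingest_argv(path, dest="pgvector"):
    """Argument list for the CLI ingestion fallback."""
    return ["datris", "ingest", path, "--dest", dest]


def search_argv(query, collection, store="pgvector"):
    """Argument list for the CLI retrieval fallback."""
    return ["datris", "search", query, "--store", store, "--collection", collection]


def format_for_audit(argv):
    """Render an argument list as one shell-quoted line for the audit log."""
    return shlex.join(argv)
```

Passing the list form to `subprocess.run(argv, check=True)` keeps queries with spaces or quotes intact.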

First-run bootstrap

Read the Datris MCP resources and tool descriptions. Understand the pipeline, upload, job-status, and search workflows before acting.
Check service health via the MCP check_service_health tool. If it returns ambiguous or partial results, fall back to datris health for a more detailed per-service view.
Reuse an existing vector pipeline for memory in the openclaw catalog if one exists. Otherwise create one in the openclaw catalog — pgvector is fine. If a memory pipeline exists outside the openclaw catalog, surface that to the user before reusing or migrating it.
Ingest MEMORY.md and each memory/*.md file via its own upload_data call — one upload per file, no exceptions. Do not concatenate them into a single corpus document, even if the total set is small. Fire the uploads in parallel and collect all the job tokens before polling.
Poll all jobs concurrently until every one reports completion.
Verify with two or three representative semantic queries. Confirm both that (a) the expected content comes back, and (b) results show real source filenames (MEMORY.md, memory/2026-05-06.md, etc.) — not a consolidated corpus filename or any other synthetic name. The filename round-trip check is the early-warning signal that the one-file-per-upload rule is being followed; if results show a corpus filename, stop and apply the remediation workflow below.
Record the run in memory/<today>.md: pipeline used, files ingested, verification queries and results, any failures.
Propose an incremental sync strategy for future edits.
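The upload-then-poll steps of the bootstrap can be sketched with asyncio. This is not the Datris client API: `upload_data` and `get_job_status` are caller-supplied async callables standing in for the MCP tools of the same names, and the terminal status strings `"completed"` and `"failed"` are assumptions.

```python
import asyncio


async def ingest_all(paths, upload_data, get_job_status, poll_interval=1.0):
    """Upload every file on its own, then poll all jobs concurrently.

    upload_data(path) -> job token; get_job_status(token) -> status string.
    Both are stand-ins for the MCP tools, injected by the caller.
    """
    # Phase 1: one upload per file, all fired before any polling starts.
    tokens = await asyncio.gather(*(upload_data(p) for p in paths))

    async def wait_for(token):
        while True:
            status = await get_job_status(token)
            if status in ("completed", "failed"):  # assumed terminal states
                return status
            await asyncio.sleep(poll_interval)

    # Phase 2: poll every job concurrently until each reaches a terminal state.
    statuses = await asyncio.gather(*(wait_for(t) for t in tokens))
    return dict(zip(paths, statuses))
```

The returned mapping is keyed by source filename, which lines up with the per-file entries the run record needs.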

Ongoing sync

Memory files change continuously — the agent writes to them during sessions, the user edits them in their editor between sessions, and new dated files appear over time. Sync is incremental and runs lazily, in three modes:

When to sync

Periodic background sync. A timer-driven sweep runs every 30 minutes by default — diff memory files by mtime against the last sync record, upload anything stale. This runs out-of-band: it never blocks an agent response or a user query. Cadence is configurable: faster (5–10 min) for users actively editing memory between sessions, slower (hourly or more) for read-mostly use. The right value is whatever keeps the staleness window short enough that the user rarely needs to force a sync.
End of any agent response that wrote to memory. When the agent edits or creates a memory file during a turn, flush those uploads before the response is considered complete — including polling to completion. This puts the cost on the response that did the writing, not on later queries, and guarantees that a follow-up question in the same conversation can retrieve what the agent just wrote. Never let an agent-authored memory write wait for the next timer tick.
On explicit user request. Phrases like "sync memory," "save what we discussed," "update Datris with my recent notes" — full diff sweep across all memory files, immediate sync. This is the user's escape valve for the case where they just edited a file in their editor and want it queryable right now rather than waiting for the next timer tick.

Staleness window — and why it's acceptable

Memory edits made outside of an agent session — for example, the user editing MEMORY.md in their editor between conversations — may be up to one timer interval behind in retrieval results. That is the explicit trade. Query latency stays predictable, ingestion is invisible, and the user has a one-line escape hatch ("sync memory") when freshness matters. Do not try to close the staleness window by syncing on every retrieval; that path puts ingestion cost on the user's wait time and is the design this skill replaces.

Detecting what changed

Compare each memory file's filesystem mtime against the most recent sync timestamp recorded for that file in the memory/<date>.md audit logs. Three cases:

No sync record exists → treat as a new file, ingest it.
mtime newer than last-sync timestamp → re-upload.
mtime unchanged → skip. Do not re-ingest unchanged files; the point of incremental sync is to do less work.
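The three cases above reduce to a small comparison. In this sketch, `needs_sync` is a hypothetical helper; it assumes both values are comparable timestamps (for example seconds since the epoch from `os.path.getmtime`) and that a missing sync record is represented as `None`.

```python
def needs_sync(mtime, last_synced):
    """Classify one memory file as 'new', 'stale', or 'unchanged'.

    mtime: the file's modification time.
    last_synced: timestamp of the last recorded sync, or None if no record.
    """
    if last_synced is None:
        return "new"        # no sync record: ingest it
    if mtime > last_synced:
        return "stale"      # edited since the last sync: re-upload
    return "unchanged"      # skip; nothing to do
```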

A content hash is more reliable than mtime if the user touches files without editing them (some editors do this on save), but mtime is sufficient as a default. Switch to hashing only if redundant uploads start showing up in the audit log.
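If hashing does become necessary, a sketch along these lines would do. It assumes the digest from the previous sync is stored somewhere the agent can read back (for instance alongside the audit entry), which this skill does not prescribe; the function names are hypothetical.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def changed_since(data: bytes, last_digest):
    """True if there is no stored digest or the content no longer matches it."""
    return last_digest is None or content_digest(data) != last_digest
```

A file that was only touched keeps the same digest, so it is skipped even though its mtime moved.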

Sync workflow

Classify each changed file into one of four cases and act accordingly:

Edited file → re-upload via upload_data. Pipelines upsert on source filename, so this overwrites the file's existing chunks cleanly.
New file (no prior sync record) → upload via upload_data like any other.
Renamed file → treat as delete + add, never as update. Delete chunks for the old filename from the destination collection, then upload the new file. Otherwise the index keeps orphan chunks under the old name.
Deleted file → delete its chunks from the destination collection. Do not leave orphans.
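One way to derive the four cases is to compare the files on disk with the set of previously synced filenames. The sketch below is an assumption about how an agent might plan the work, not a Datris feature: `plan_sync` is hypothetical, and it treats a rename as it appears on disk, with the old name gone and a new name present.

```python
def plan_sync(current, synced):
    """Plan uploads and chunk deletions for one sync pass.

    current: {filename: mtime} for memory files on disk.
    synced:  {filename: last_sync_timestamp} from the audit logs.

    Returns (uploads, deletes). A rename appears as the old name in
    deletes and the new name in uploads, i.e. delete + add.
    """
    uploads = sorted(
        name for name, mtime in current.items()
        if name not in synced or mtime > synced[name]  # new or edited
    )
    deletes = sorted(name for name in synced if name not in current)
    return uploads, deletes
```

Running the deletes before the uploads keeps a renamed file from briefly appearing under both names in retrieval results.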

Then:

Fire all uploads in parallel, collect the job tokens, poll concurrently to completion.
Verify with one or two semantic queries that touch the changed content. Confirm filenames in results reflect the post-sync state — no stale entries from before a rename or delete, no consolidated-corpus names.
Append a per-file entry to today's audit log: filename, change type (edit / new / rename / delete), timestamp, verification result.
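A per-file audit entry could be rendered like this. The line layout is purely illustrative, since the skill names the fields (filename, change type, timestamp, verification result) but not a format; `audit_entry` is a hypothetical helper.

```python
from datetime import datetime, timezone

CHANGE_TYPES = {"edit", "new", "rename", "delete"}


def audit_entry(filename, change_type, verified, when=None):
    """Format one audit-log line for a synced file (illustrative layout)."""
    if change_type not in CHANGE_TYPES:
        raise ValueError(f"unknown change type: {change_type!r}")
    when = when or datetime.now(timezone.utc)
    result = "ok" if verified else "FAILED"
    return f"- {when.isoformat()} {change_type} {filename} verification={result}"
```

Keeping the timestamp in ISO 8601 form makes it straightforward to parse back when comparing against mtimes on the next sweep.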

Inheriting a consolidated-corpus pipeline

If the existing memory pipeline was bootstrapped with a single consolidated upload (a *-corpus-*.md source file, or any upload whose filename doesn't match a real memory file), the pipeline has broken provenance and cannot be incrementally synced cleanly. Do not patch around it. Reset and re-ingest:

Confirm with the user before resetting — destination collections may contain manual edits.
Drop the destination collection (or recreate the pipeline with the same name).
Re-ingest each canonical memory file individually per the bootstrap workflow above.
Verify with semantic queries that retrieval results now show real source filenames (MEMORY.md, memory/2026-05-06.md, etc.) rather than a corpus filename.
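The filename check in the last step can be automated. This sketch assumes the source filenames have already been extracted from the search results into a plain list (the result schema is not described here) and flags anything matching the `*-corpus-*.md` pattern mentioned above or not matching a known memory file; `suspicious_sources` is a hypothetical helper.

```python
import fnmatch


def suspicious_sources(result_filenames, known_files):
    """Return result filenames that do not round-trip to a real memory file.

    known_files should include any .partN names produced by splitting
    an oversized file, otherwise those parts will be flagged too.
    """
    known = set(known_files)
    flagged = []
    for name in result_filenames:
        is_corpus = fnmatch.fnmatchcase(name, "*-corpus-*.md")
        if is_corpus or name not in known:
            flagged.append(name)
    return flagged
```

An empty return value means every result traced back to a real source file; anything else is a reason to stop and investigate before syncing further.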

Retrieval

When the user asks a memory-shaped question, reach for vector_search (or ai_answer for synthesis) against the memory pipeline before grepping local files. Substring search on local markdown is a fallback, not the default. If the MCP layer is unreachable, datris search "<query>" --store pgvector --collection <name> is the equivalent fallback against the same destination.

Reporting

After any bootstrap or sync, report:

Pipeline used or created.
Files ingested.
Verification queries and whether they returned the expected content.
Anything that failed and why.

Security recommendations

Before installing, make sure you are comfortable with MEMORY.md and memory/*.md becoming long-term searchable agent memory, use a trusted local or known MCP server, and verify the Datris CLI/Docker installation source.

Functional analysis

Type: OpenClaw Skill
Name: datris-memory
Version: 1.0.0

The datris-memory skill (SKILL.md) provides a semantic memory layer by indexing local markdown files into a Datris Platform instance using MCP tools and the datris CLI. The skill's logic is focused on maintaining data provenance, incremental synchronization, and data isolation within a dedicated 'openclaw' catalog. No indicators of malicious intent, unauthorized data exfiltration, or suspicious execution patterns were found; the behavior is entirely consistent with its stated purpose of providing long-term memory for AI agents.

Capability assessment

ℹ Purpose & Capability

The stated purpose and capabilities align: it builds a semantic memory index from MEMORY.md and memory/*.md. Users should still notice that this creates persistent, searchable memory.

ℹ Instruction Scope

The skill is meant to trigger on memory-related requests even when the user does not explicitly say Datris. This is disclosed and purpose-aligned, but it broadens when the agent may use the memory layer.

ℹ Install Mechanism

The skill requires Docker and the Datris CLI installed from a Homebrew tap. This is expected for the Datris integration, but users should verify the external Datris installation source.

ℹ Credentials

The data flow is mostly local by default via http://localhost:3000/sse, but MCP_SERVER_URL can point elsewhere, so users should ensure it targets a trusted Datris MCP server before syncing memory files.

ℹ Persistence & Privilege

The skill describes long-term memory, background/post-write/on-demand sync, audit logs, and Datris resource creation in an openclaw catalog. This is disclosed and bounded to memory files, but it is persistent behavior.

Version history

v1.0.0

Initial release. Turns Datris into a queryable memory layer for agents with local markdown notes. One-file-per-upload provenance, incremental sync (timer + post-write + on-demand), edit/new/rename/delete handling, vector-store destinations (pgvector default, Qdrant/Weaviate/Milvus/Chroma supported).

Metadata

Slug datris-memory

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1

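FAQ

What is Datris Memory?

Long-term semantic memory layer for AI agents, built on the Datris Platform via MCP. Local markdown files (MEMORY.md, memory/*.md) are the source of truth; D... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 5 cumulative downloads to date.

How do I install Datris Memory?

Run the command "/install datris-memory" in the OpenClaw or Claude Code chat to install it in one step; no additional configuration is needed.

Is Datris Memory free?

Yes. Datris Memory is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does Datris Memory support?

Datris Memory runs cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Datris Memory?

It is developed and maintained by Datris.ai (@datris); the current version is v1.0.0.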
