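Description

Distills digital personas from family group-chat logs (WeChat/WhatsApp/other). It outputs soul.md (the collective persona) plus a persona file for each member, which can serve directly as the persona foundation of an AI agent. Keywords: group-chat analysis, family persona, soul, persona, digital persona, chat logs, WeChat export, persona extraction.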

Usage (SKILL.md)

SKILL: Soul Forge, family digital persona extraction

Name: Family Soul Analyzer
Author: zengury

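This skill turns a family group-chat log into persona files that an AI agent can use. It follows a digital ethnography methodology: AI carries out the whole flow from "fieldwork" to "persona synthesis".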

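Trigger conditions

This skill is triggered when:

The user says "help me analyze my chat logs", "generate a soul file", or "extract our family persona"
The user provides a .json chat export file
The user says "run soul-forge" or "start persona extraction"
The user asks "how do I generate a persona from chat logs"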

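Execution flow

Step 1: Confirm inputs

Ask the user for:

The path to the chat log file (JSON exported from WeChat via WeFlow is supported)
The output directory (default: ~/soul-forge-output/)
The family member role configuration (default: a three-person dad/mom/child structure)

Confirm that ANTHROPIC_API_KEY is set (the pipeline calls the Claude API).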
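The checks in this step can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the skill: the function name `check_inputs` and the shape of its return value are assumptions, while the defaults (`~/soul-forge-output/`, dad/mom/child) and the `ANTHROPIC_API_KEY` variable come from the text above.

```python
import os
from pathlib import Path

# Defaults taken from the step description above.
DEFAULT_OUTPUT_DIR = Path("~/soul-forge-output/").expanduser()
DEFAULT_ROLES = ("dad", "mom", "child")


def check_inputs(chat_file, output_dir=None, roles=None, env=None):
    """Hypothetical helper: return (settings, problems) for the step 1 inputs."""
    env = os.environ if env is None else env
    problems = []

    chat_path = Path(chat_file).expanduser()
    if not chat_path.is_file():
        problems.append(f"chat file not found: {chat_path}")

    # The pipeline calls the Claude API, so the key must be present.
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")

    settings = {
        "chat_file": chat_path,
        "output_dir": Path(output_dir).expanduser() if output_dir else DEFAULT_OUTPUT_DIR,
        "roles": tuple(roles) if roles else DEFAULT_ROLES,
    }
    return settings, problems
```

An agent could call `check_inputs(path)` before launching anything and relay each entry of `problems` to the user instead of starting a run that is bound to fail.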

Step 2: Run the pipeline in the background

Invoke:

python3 {SKILL_DIR}/scripts/run_forge.py --file {user-provided file path}

The pipeline has four stages, which the agent advances through in order:

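Stage	Script	Description	Estimated time
1	`01_parse.py`	Parse the raw chat JSON into normalized messages	30 seconds
2	`02_denoise.py`	Denoise and chunk by time	1 minute
3	`03_extract.py`	Claude Haiku extracts behavior patterns in bulk (Batches API)	10-30 minutes
4	`04_synthesize.py`	Claude Opus synthesizes soul.md + persona files	5-15 minutes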

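Note on stage 3: it uses the Batches API for asynchronous processing, which keeps costs low, and it caches progress automatically. If it is interrupted, resume with --resume; work that has already completed is not billed again.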
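One common way to get "resume without paying twice" is an append-only cache keyed by chunk. The sketch below illustrates that pattern only; it is not the skill's actual `--resume` implementation, and the `chunk_id` field, the JSONL record layout, and all function names are assumptions made for the example.

```python
import json
from pathlib import Path


def load_done_ids(cache_path):
    """Collect the ids of chunks that already have a cached result."""
    path = Path(cache_path)
    if not path.exists():
        return set()
    done = set()
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                done.add(json.loads(line)["chunk_id"])
    return done


def pending_chunks(chunks, cache_path):
    """Return only the chunks that still need an API call."""
    done = load_done_ids(cache_path)
    return [c for c in chunks if c["chunk_id"] not in done]


def record_result(cache_path, chunk_id, result):
    """Append one finished chunk so that a later run can skip it."""
    record = {"chunk_id": chunk_id, "result": result}
    with Path(cache_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Because each result is appended as soon as it arrives, an interruption loses at most the request in flight, and a resumed run submits only what `pending_chunks` returns.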

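Step 3: Report progress

Parse the marker output of run_forge.py:

[STAGE:N:START] → tell the user "stage N is in progress"
[STAGE:N:DONE] → tell the user "stage N is complete"
[PROGRESS:N/M] → show a progress bar
[OUTPUT:path] → list the generated file
[ERROR:msg] → report the error and suggest how the user can handle it
[DONE] → announce completion and show all output files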
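The marker protocol above is simple enough to parse with regular expressions. The following is a minimal sketch, assuming each marker appears on a single line exactly in the forms listed; the names `parse_marker` and `progress_bar` are illustrative and not part of the skill.

```python
import re

# One pattern per marker form listed above.
MARKER_PATTERNS = [
    ("stage_start", re.compile(r"\[STAGE:(\d+):START\]")),
    ("stage_done", re.compile(r"\[STAGE:(\d+):DONE\]")),
    ("progress", re.compile(r"\[PROGRESS:(\d+)/(\d+)\]")),
    ("output", re.compile(r"\[OUTPUT:(.+?)\]")),
    ("error", re.compile(r"\[ERROR:(.+?)\]")),
    ("done", re.compile(r"\[DONE\]")),
]


def parse_marker(line):
    """Return (kind, captured_groups) for the first marker found, else None."""
    for kind, pattern in MARKER_PATTERNS:
        match = pattern.search(line)
        if match:
            return kind, match.groups()
    return None


def progress_bar(current, total, width=20):
    """Render a text progress bar for a [PROGRESS:N/M] marker."""
    filled = width * current // total if total > 0 else 0
    filled = max(0, min(width, filled))
    return "[" + "#" * filled + "-" * (width - filled) + f"] {current}/{total}"


print(parse_marker("[PROGRESS:3/10]"))  # ('progress', ('3', '10'))
print(progress_bar(5, 10, width=10))    # [#####-----] 5/10
```

One limitation of this sketch: an error message or path that itself contains `]` would be cut off at the first bracket, so a real parser might need a stricter line format.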

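Step 4: After completion

Output files:

soul-forge-output/
├── soul.md          ← collective persona, usable directly as an AI agent's SOUL.md
├── persona_dad.md   ← dad's individual persona
├── persona_mom.md   ← mom's individual persona
└── persona_child.md ← the child's persona

Ask the user whether to:

Install soul.md as the current agent's SOUL.md
Create a separate agent for each persona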

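Advanced usage

Update only the soul, without regenerating personas

Tell the agent: "soul-forge update soul only, skip persona"

Internally: python3 run_forge.py --file {path} --soul-only

Regenerate only the personas (soul already exists)

Tell the agent: "soul-forge refresh persona only"

Internally: python3 run_forge.py --file {path} --persona-only

Resume after an interruption

Tell the agent: "soul-forge continue the previous task"

Internally: python3 run_forge.py --resume

Check current progress

Tell the agent: "soul-forge status"

Internally: python3 run_forge.py --status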
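The four modes above differ only in the flags passed to run_forge.py, so an agent could build the command from a small lookup table. This is a sketch under assumptions: the mode names and `build_command` are invented for illustration, and it uses `sys.executable` rather than a literal `python3`; only the flags themselves come from the text above.

```python
import sys

# Flags as documented above; the mode names are hypothetical labels.
MODE_FLAGS = {
    "full": [],
    "soul_only": ["--soul-only"],
    "persona_only": ["--persona-only"],
    "resume": ["--resume"],
    "status": ["--status"],
}
# Per the examples above, these modes are shown with --file.
NEEDS_FILE = {"full", "soul_only", "persona_only"}


def build_command(script, mode="full", chat_file=None):
    """Return the argv list for one run_forge.py invocation."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode}")
    cmd = [sys.executable, script]
    if mode in NEEDS_FILE:
        if not chat_file:
            raise ValueError(f"mode {mode!r} requires a chat file")
        cmd += ["--file", chat_file]
    return cmd + MODE_FLAGS[mode]
```

The resulting list can be handed to `subprocess.run` or `subprocess.Popen` directly; passing a list rather than a shell string avoids quoting problems with paths that contain spaces.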

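Supported input formats

Format	Source	Notes
WeChat WeFlow JSON	Exported by the WeFlow tool	Fully supported
Standard CSV	Custom export	Must include sender/timestamp/content columns

To export from WeChat: in WeFlow (Mac), select the group chat and export it as JSON.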
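For the CSV route, the column requirement in the table above can be checked before any API call is made. The snippet below is a minimal sketch that uses only the standard library; `missing_columns` is a hypothetical helper, and it assumes the header row uses exactly the names sender, timestamp and content.

```python
import csv
import io

REQUIRED_COLUMNS = {"sender", "timestamp", "content"}


def missing_columns(csv_text):
    """Return the required column names absent from the CSV header, sorted."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return sorted(REQUIRED_COLUMNS - header)


print(missing_columns("sender,timestamp,content\nmom,2024-01-01 08:00,hello\n"))  # []
print(missing_columns("sender,content\nmom,hello\n"))                             # ['timestamp']
```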

Cost estimate

For 2-3 years of family group chat (about 500 conversation chunks):

Stage 3 (Haiku Batches): about $0.5-1.0
Stage 4 (Opus): about $2-5
Total: about $3-6, one-off
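The arithmetic behind the total is just the sum of the two per-stage ranges. In the sketch below, the stage figures are the ones quoted above; the linear scaling by chunk count in `scaled_range` is purely an assumption for rough planning, not something the skill documents.

```python
# Per-stage (low, high) estimates in USD, as quoted above for ~500 chunks.
STAGE_COST_USD = {
    "stage 3 (Haiku Batches)": (0.5, 1.0),
    "stage 4 (Opus)": (2.0, 5.0),
}


def total_cost_range(stage_costs):
    """Sum the low ends and the high ends separately."""
    low = sum(lo for lo, _hi in stage_costs.values())
    high = sum(hi for _lo, hi in stage_costs.values())
    return low, high


def scaled_range(cost_range, chunks, reference_chunks=500):
    """Assumption: cost grows linearly with the number of chunks."""
    factor = chunks / reference_chunks
    return tuple(round(value * factor, 2) for value in cost_range)


print(total_cost_range(STAGE_COST_USD))  # (2.5, 6.0)
```

The straight sum is $2.5 to $6.0, in line with the rounded total quoted above.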

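FAQ

Q: What if stage 3 is slow? A: The Batches API usually takes 10-30 minutes, which is normal. The agent keeps polling the status, so no manual intervention is needed.

Q: What if the run is interrupted? A: Say "soul-forge continue". The script resumes from the checkpoint and does not rerun completed stages.

Q: Where do I set the API key? A: Run export ANTHROPIC_API_KEY='sk-ant-...', or configure it in OpenClaw's environment variable settings.

Q: How many people can the group chat have? A: The default is three (dad/mom/child); the role configuration can be changed in pipeline/config.py.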
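The polling mentioned in the stage 3 answer can be pictured as a loop with an interval and a timeout. This sketch is not the skill's code: `wait_until_ended` and its parameters are assumptions, and the status source is injected as a plain callable so the loop can be tried without network access. With the Anthropic Python SDK, `get_status` would presumably wrap a call such as `client.messages.batches.retrieve(batch_id).processing_status`, which reports "ended" once a batch has finished.

```python
import time


def wait_until_ended(get_status, interval_s=60.0, timeout_s=3600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "ended"; return the number of polls."""
    deadline = clock() + timeout_s
    polls = 0
    while True:
        status = get_status()
        polls += 1
        if status == "ended":
            return polls
        if clock() >= deadline:
            raise TimeoutError(f"batch still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


# Dry run with a fake status source and no real sleeping.
fake_statuses = iter(["in_progress", "in_progress", "ended"])
print(wait_until_ended(lambda: next(fake_statuses), sleep=lambda s: None))  # 3
```

A one-minute interval is a reasonable default for a job expected to take 10-30 minutes, since polling more often does not make the batch finish sooner.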

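Methodology background

Based on digital ethnography:

Stages 1-2: organizing the field records (denoising, structuring)
Stage 3: systematic observation (Haiku extracts behavior patterns along five dimensions)
Stage 4: ethnographic analysis (Opus synthesizes a "thick description")

Clifford Geertz's distinction, paraphrased: thin description records behavior; thick description interprets meaning.

soul.md is the product of thick description: not a list of behaviors, but the interpretive framework needed to understand this family.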

Safety recommendations

Key things to consider before installing or running this skill:

Privacy: the skill uploads chat content to external LLM APIs. Only run it on data you own or have explicit consent to process. The package itself includes an example exported chat JSON (data/raw/...), which may contain real people's messages; remove or inspect it before use.
Credentials: SKILL.md and the scripts expect ANTHROPIC_API_KEY and optionally KIMI/MOONSHOT keys, but the registry entry lists none. Do not use any hard-coded API key found in the code. Replace or remove hard-coded keys and set your own keys as environment variables.
Hard-coded secrets: the repo contains embedded API key-like strings. Treat them as compromised/unauthorized; remove them and audit where keys are used. Do not rely on those keys for production.
Data persistence: the pipeline caches raw API responses (raw_cache.jsonl) and writes outputs to the skill directory. If you run it, run in an isolated directory or container and clean caches after use if you do not want local persistence.
Run safely: review requirements.txt and the Python scripts before execution. If possible, run first on synthetic/dummy data to verify behavior and network calls. Consider running in a sandboxed environment (container) and monitor outbound network requests.
Consent and legality: extracting 'personas' from family chat may implicate privacy laws or consent obligations; ensure you have permission from chat participants.

Functional analysis

Type: OpenClaw Skill Name: family-soul-analyzer Version: 0.1.0 The skill bundle implements a multi-stage pipeline to analyze private chat logs and generate AI personas using external LLM APIs. While the code logic is consistent with its stated purpose, several files (pipeline/03_extract_kimi_openclaw.py, pipeline/03_extract_kimi_v2.py, pipeline/03_extract_simple.py, and pipeline/04_synthesize_simple.py) contain a hardcoded Kimi/Moonshot API key (sk-kimi-Sgsy7YYJPrkwJbUu0EvyGCIOZLbkdvNDcGzN0GrxknUNmXwfLlxlzcyG3Ufs3xAI). Hardcoding credentials is a significant security vulnerability. Furthermore, the tool's core function involves sending sensitive personal communication data to third-party AI providers, which presents a high privacy risk, although this behavior is documented.

Capability assessment

ℹ Purpose & Capability

Overall the code and SKILL.md align with the described purpose (WeChat/CSV parsing → denoise → LLM extraction → synthesis). However the repo includes multiple alternative LLM backends (Anthropic/Claude, Kimi/Moonshot/OpenClaw) beyond the single provider named in SKILL.md, which is plausible but expands the skill's network footprint. Also the package bundles an actual exported chat JSON in data/raw/, which is unexpected for a reusable skill and raises privacy concerns.

⚠ Instruction Scope

SKILL.md instructs the agent to run the included pipeline scripts and to confirm ANTHROPIC_API_KEY; the scripts will read user-provided chat files and then transmit chunked chat text to third‑party LLM APIs. This is consistent with its purpose but the instructions (and code) will upload sensitive family chat content to external services — an explicit privacy/data-exfiltration risk the user must accept. The skill's runtime also writes cached API responses to disk (raw_cache.jsonl), increasing persistence of derived sensitive data.

ℹ Install Mechanism

There is no install spec beyond a requirements.txt and SKILL.md instructions — the skill is instruction-plus-code. That is lower risk than arbitrary remote downloads, but the presence of runnable Python scripts (and a requirements file) means the agent will execute code from this bundle. No external install URL was used, which avoids direct supply-chain download risks, but users should still vet requirements and run in an isolated environment.

⚠ Credentials

Registry metadata lists no required env vars, but the SKILL.md and multiple scripts expect ANTHROPIC_API_KEY (Claude) and optionally KIMI_API_KEY / MOONSHOT_API_KEY. Worse: several scripts contain a hard-coded API key string (e.g. 'sk-kimi-Sgsy7YYJPrk...') and hard-coded base URLs. Hard-coded credentials in published code are a major red flag (they may be leaked/stale/unauthorized) and the mismatch between declared requirements and actual env requirements is an incoherence to surface to users.

⚠ Persistence & Privilege

The skill writes outputs and caches to disk (soul.md, persona_*.md, data/observations/raw_cache.jsonl, observations.jsonl). That behavior is expected for a pipeline but means sensitive inputs and raw LLM responses are stored locally in the skill directory by default. The skill does not request elevated system privileges or set always:true, but the combination of autonomous invocation (normal default) plus persistent caches increases the blast radius if misconfigured.

Version history

v0.1.0

WeChat chat analysis pipeline to extract behavioral patterns and group dynamics

Metadata

Slug family-soul-analyzer

Version 0.1.0

License MIT-0

Total installs 0

Current installs 0

Number of past versions 1

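FAQ

What is Family Soul Analyzer?

It distills digital personas from family group-chat logs (WeChat/WhatsApp/other). It outputs soul.md (the collective persona) plus a persona file for each member, which can serve directly as the persona foundation of an AI agent. Keywords: group-chat analysis, family persona, soul, persona, digital persona, chat logs, WeChat export, persona extraction. It is an AI agent skill plugin for Claude Code / OpenClaw and has been downloaded 117 times to date.

How do I install Family Soul Analyzer?

Run the command "/install family-soul-analyzer" in an OpenClaw or Claude Code chat to install it in one step. Installation itself needs no extra configuration, but running the skill requires ANTHROPIC_API_KEY to be set.

Is Family Soul Analyzer free?

Yes. The skill itself is free under the MIT-0 license and can be downloaded, installed, and used freely. Running it still incurs usage charges for the LLM API calls it makes.

Which platforms does Family Soul Analyzer support?

Family Soul Analyzer is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed Family Soul Analyzer?

It is developed and maintained by zengury (@zengury); the current version is v0.1.0.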
