volcengine-skills

Byted Bytehouse Mcp

Author: volcengine-skills · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
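Total downloads: 96 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install byted-bytehouse-mcp
Description
A skill that starts the ByteHouse MCP Server locally and calls its tools. It is used to connect to a ByteHouse database and query data, to interact with ByteHouse over the MCP protocol, and to generate data asset catalogs and lineage analysis. When the user needs to connect to a ByteHouse database and query data, interact with ByteHouse over the MCP protocol, or generate a data asset catalog and lineage analysis, use this Sk...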
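Usage instructions (SKILL.md)

ByteHouse MCP Server Skill

🔵 ByteHouse brand identity

"ByteHouse": Volcengine's cloud-native data warehouse, fast, stable, secure, and easy to use

This Skill is built on the official ByteHouse MCP Server and provides full data access to ByteHouse.


Description

A skill that starts the ByteHouse MCP Server locally and calls its tools.

Use this Skill when: (1) you need to connect to a ByteHouse database and query data; (2) you need to interact with ByteHouse over the MCP protocol; (3) the user mentions "ByteHouse", "MCP", "query the database" (查询数据库), or "look at tables" (看表); (4) you need to generate a data asset catalog and lineage analysis.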

📁 Files

  • SKILL.md - this file, the main skill documentation
  • mcp_client.py - MCP client module for calling the MCP Server programmatically
  • test_mcp_server.py - MCP Server test script
  • example_mcp_usage.py - MCP usage example
  • query_top10_tables_mcp.py - queries the top 10 largest tables through MCP
  • test_list_tables.py - tests the list_tables tool
  • data_asset_analyzer.py - data asset and lineage analysis tool (new)
  • start_mcp_service.sh - starts the long-running MCP Server service
  • stop_mcp_service.sh - stops the MCP Server service
  • status_mcp_service.sh - shows the MCP Server status
  • restart_mcp_service.sh - restarts the MCP Server service

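Prerequisites

  • Python 3.8+
  • uv (installed at /root/.local/bin/uv; the SDK examples in this document invoke uvx at /root/.local/bin/uvx)
  • ByteHouse connection details (you must configure these yourself)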

Configuration

ByteHouse connection settings

{
  "host": "<ByteHouse-host>",
  "port": "<ByteHouse-port>",
  "user": "<ByteHouse-user>",
  "password": "<ByteHouse-password>",
  "secure": true,
  "verify": true
}

Environment variables

Set the following environment variables before use:

export BYTEHOUSE_HOST="<ByteHouse-host>"
export BYTEHOUSE_PORT="<ByteHouse-port>"
export BYTEHOUSE_USER="<ByteHouse-user>"
export BYTEHOUSE_PASSWORD="<ByteHouse-password>"
export BYTEHOUSE_SECURE="true"
export BYTEHOUSE_VERIFY="true"
export BYTEHOUSE_CONNECT_TIMEOUT="30"
export BYTEHOUSE_SEND_RECEIVE_TIMEOUT="30"
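
Before launching any of the scripts, it can help to confirm that the connection variables are actually set and are no longer the angle-bracket placeholders. The following is a minimal sketch, not part of the skill: the helper name `missing_bytehouse_vars` is hypothetical, and treating the first four variables as mandatory is an assumption.

```python
import os

# Variable names are taken from the list above. Treating these four as
# mandatory is an assumption of this sketch, not something the skill states.
REQUIRED_VARS = (
    "BYTEHOUSE_HOST",
    "BYTEHOUSE_PORT",
    "BYTEHOUSE_USER",
    "BYTEHOUSE_PASSWORD",
)


def missing_bytehouse_vars(environ=None):
    """Return the required variables that are unset, empty, or still placeholders."""
    environ = os.environ if environ is None else environ
    missing = []
    for name in REQUIRED_VARS:
        value = environ.get(name, "").strip()
        # A value such as "<ByteHouse-host>" means the template was never filled in.
        if not value or value.startswith("<"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_bytehouse_vars()
    if missing:
        print("Missing or placeholder values:", ", ".join(missing))
    else:
        print("All required ByteHouse variables are set.")
```

Running this in the same shell where the `export` lines were executed gives a quick check before any server process is started.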

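🎯 ByteHouse MCP Server Tools

No.  Tool name                        Description
1    list_databases                   Lists all databases
2    list_tables                      Lists all tables in the specified database
3    run_select_query                 Runs a SELECT query
4    run_dml_ddl_query                Runs a DML/DDL query
5    get_bytehouse_table_engine_doc   Retrieves the ByteHouse table engine documentation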
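
Because reads and writes go through two different tools, a caller has to decide which tool a given statement belongs to. The sketch below shows one way to make that choice from the statement's leading keyword. The helper `pick_query_tool` is hypothetical, and the rule that only `SELECT` and `WITH` statements go to `run_select_query` is an assumption; the server may accept or reject other statement types.

```python
import re

# Tool names come from the table above. Which leading keywords count as
# read-only is an assumption of this sketch.
READ_ONLY_KEYWORDS = {"SELECT", "WITH"}


def pick_query_tool(sql: str) -> str:
    """Return the MCP tool name to use for a SQL statement (hypothetical helper)."""
    match = re.match(r"\s*([A-Za-z]+)", sql)
    if match is None:
        raise ValueError("could not find a leading SQL keyword")
    if match.group(1).upper() in READ_ONLY_KEYWORDS:
        return "run_select_query"
    return "run_dml_ddl_query"


print(pick_query_tool("SELECT 1"))
print(pick_query_tool("CREATE TABLE t (x Int32) ENGINE = Memory"))
```

The check is deliberately simple: it does not skip leading SQL comments and does not parse the statement, so it is best treated as a guard rather than a safety boundary.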

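🚀 Quick start

Method 1: Test the MCP Server (recommended as a first step)

cd /root/.openclaw/workspace/skills/bytehouse-mcp
# Set the environment variables first, then run
uv run test_mcp_server.py

This will:

  1. Automatically install the ByteHouse MCP Server
  2. Start the MCP Server
  3. List all available tools
  4. Try calling the first tool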

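Method 2: List the tables in a database

cd /root/.openclaw/workspace/skills/bytehouse-mcp
# Set the environment variables first, then run
uv run test_list_tables.py

Method 3: Query the top 10 largest tables through MCP

cd /root/.openclaw/workspace/skills/bytehouse-mcp
# Set the environment variables first, then run
uv run query_top10_tables_mcp.py

Method 4: Generate the data asset catalog and lineage analysis (new)

cd /root/.openclaw/workspace/skills/bytehouse-mcp
# Set the environment variables first, then run
uv run data_asset_analyzer.py

This will:

  1. Fetch the full schema of the database
  2. Generate the data asset catalog
  3. Generate the lineage analysis report
  4. Save JSON files to the output/ directory

The output includes:

  • Database schema (all tables and columns)
  • Data asset catalog (table statistics, tags, engine distribution)
  • Lineage analysis (table relationships, column similarity)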

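Method 5: Start the long-running MCP Server service

cd /root/.openclaw/workspace/skills/bytehouse-mcp
# Configure the environment variables in the script first, then run
./start_mcp_service.sh

This will:

  1. Start the MCP Server in the background
  2. Save the PID to mcp_server.pid
  3. Write logs to logs/mcp_server_*.log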
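
Since the start script records the server's PID in mcp_server.pid, other tooling can check whether that process is still alive. The sketch below is one way to do this from Python on a POSIX system. It is not a description of what status_mcp_service.sh does, and the helper names `read_pid` and `is_running` are hypothetical.

```python
import os
from pathlib import Path


def read_pid(pid_file="mcp_server.pid"):
    """Return the PID stored in the file, or None if it is missing or malformed."""
    try:
        return int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def is_running(pid):
    """Check whether a process with this PID exists (POSIX only)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only performs error checking; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    pid = read_pid()
    print("MCP Server running:", is_running(pid))
```

A stale PID file can point at an unrelated process that reused the number, so a positive result only shows that some process with that PID exists.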

Method 6: Check the MCP Server status

./status_mcp_service.sh

Method 7: Stop the MCP Server

./stop_mcp_service.sh

Method 8: Restart the MCP Server

./restart_mcp_service.sh

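💻 Data asset and lineage analysis (new)

Features

data_asset_analyzer.py provides the following:

  1. Full schema retrieval

    • Gets all tables in the specified database
    • Gets all columns of each table
    • Extracts metadata such as table engine and comments
  2. Data asset catalog generation

    • Table statistics (total tables, total columns)
    • Engine distribution statistics
    • Automatic tag generation
    • Per-table asset details
  3. Lineage analysis

    • Table relationship detection (Distributed → Local)
    • Column similarity analysis
    • Relationship visualization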
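
The document does not say how column similarity is computed. As an illustration only, one common choice is the Jaccard index over column names: the number of shared names divided by the number of distinct names across both tables. The function `column_similarity` and the sample column lists below are hypothetical and are not taken from data_asset_analyzer.py.

```python
def column_similarity(columns_a, columns_b):
    """Jaccard similarity of two column-name collections, compared case-insensitively."""
    a = {name.lower() for name in columns_a}
    b = {name.lower() for name in columns_b}
    if not a and not b:
        return 0.0  # two empty tables are treated as unrelated
    return len(a & b) / len(a | b)


# Made-up column lists, for illustration only.
orders = ["id", "user_id", "amount", "created_at"]
refunds = ["id", "user_id", "reason", "created_at"]

# 3 shared names out of 5 distinct names
print(column_similarity(orders, refunds))  # 0.6
```

A name-based measure like this ignores column types and comments, so it can flag tables that merely share generic names such as `id`.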

Usage example

#!/usr/bin/env python3
# /// script
# dependencies = [
#   "mcp>=1.0.0",
# ]
# ///

import asyncio
from data_asset_analyzer import DataAssetAnalyzer

async def main():
    analyzer = DataAssetAnalyzer()
    await analyzer.connect()
    
    # Analyze the database
    result = await analyzer.analyze_database("default")
    
    # result contains:
    # - schema: the full database schema
    # - catalog: the data asset catalog
    # - lineage: the lineage analysis
    # - files: paths of the generated files

asyncio.run(main())

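Output files

When the analysis finishes, the following files are generated in the output/ directory:

  1. schema_{database}_{timestamp}.json - the full database schema
  2. catalog_{database}_{timestamp}.json - the data asset catalog
  3. lineage_{database}_{timestamp}.json - the lineage analysis report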
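
Because each run writes new timestamped files, a consumer usually wants the most recent report of a given kind. The sketch below picks the newest file by modification time, which avoids assuming a timestamp format the document does not specify. The helpers `latest_report` and `load_report` are hypothetical, and the JSON structure inside the files is not assumed.

```python
import json
from pathlib import Path


def latest_report(kind, database, output_dir="output"):
    """Return the newest `{kind}_{database}_*.json` file by modification time, or None."""
    candidates = sorted(
        Path(output_dir).glob(f"{kind}_{database}_*.json"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


def load_report(kind, database, output_dir="output"):
    """Load the newest report of the given kind as parsed JSON, or None if absent."""
    path = latest_report(kind, database, output_dir)
    if path is None:
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # kind is one of "schema", "catalog", or "lineage", matching the file names above.
    print(latest_report("lineage", "default"))
```

One caveat: the glob also matches databases whose names extend the given one (for example `default_v2` when asking for `default`), so stricter matching may be needed where such names exist.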

💻 Using the MCP Client programmatically

Using the mcp_client.py module

#!/usr/bin/env python3
# /// script
# dependencies = [
#   "mcp>=1.0.0",
# ]
# ///

import asyncio
from mcp_client import ByteHouseMCPClient

async def main():
    async with ByteHouseMCPClient() as client:
        await client.connect()
        
        # 1. List all tools
        tools = await client.list_tools()
        print("Available tools:", [t['name'] for t in tools])
        
        # 2. Call a tool
        # result = await client.call_tool("tool_name", {"param": "value"})
        # print(result)

asyncio.run(main())

Using the MCP SDK directly

#!/usr/bin/env python3
# /// script
# dependencies = [
#   "mcp>=1.0.0",
# ]
# ///

import asyncio
import os
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Set the environment variables (replace the placeholders with your own values)
    env = os.environ.copy()
    env.update({
        'BYTEHOUSE_HOST': '<ByteHouse-host>',
        'BYTEHOUSE_PORT': '<ByteHouse-port>',
        'BYTEHOUSE_USER': '<ByteHouse-user>',
        'BYTEHOUSE_PASSWORD': '<ByteHouse-password>',
        'BYTEHOUSE_SECURE': 'true',
        'BYTEHOUSE_VERIFY': 'true',
    })
    
    # MCP Server parameters
    server_params = StdioServerParameters(
        command='/root/.local/bin/uvx',
        args=[
            '--from',
            'git+https://github.com/volcengine/mcp-server@main#subdirectory=server/mcp_server_bytehouse',
            'mcp_bytehouse',
            '-t',
            'stdio'
        ],
        env=env
    )
    
    # Start the MCP Server and connect to it
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            
            # List the tools
            result = await session.list_tools()
            print("Tools:", [t.name for t in result.tools])
            
            # Call a tool
            # call_result = await session.call_tool("tool_name", {"param": "value"})

asyncio.run(main())

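🔧 Service management

Start the service

# Configure the environment variables first, then run
./start_mcp_service.sh

Check the status

./status_mcp_service.sh

View the logs

# Follow the MCP Server logs (the glob matches every log file in logs/)
tail -f logs/mcp_server_*.log

# Follow one specific log file
tail -f logs/mcp_server_20260312_184500.log

Stop the service

./stop_mcp_service.sh

Restart the service

./restart_mcp_service.sh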

💻 MCP Tools usage examples

Example 1: List all databases

import asyncio
import os
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Set the environment variables (replace the placeholders with your own values)
    env = os.environ.copy()
    env.update({
        'BYTEHOUSE_HOST': '<ByteHouse-host>',
        'BYTEHOUSE_PORT': '<ByteHouse-port>',
        'BYTEHOUSE_USER': '<ByteHouse-user>',
        'BYTEHOUSE_PASSWORD': '<ByteHouse-password>',
        'BYTEHOUSE_SECURE': 'true',
        'BYTEHOUSE_VERIFY': 'true',
    })
    
    server_params = StdioServerParameters(
        command='/root/.local/bin/uvx',
        args=[
            '--from',
            'git+https://github.com/volcengine/mcp-server@main#subdirectory=server/mcp_server_bytehouse',
            'mcp_bytehouse',
            '-t',
            'stdio'
        ],
        env=env
    )
    
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            
            # Call list_databases
            result = await session.call_tool("list_databases", {})
            for content in result.content:
                if content.type == 'text':
                    print(content.text)

asyncio.run(main())
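
The loop over `result.content` recurs for every tool call, so it can be factored into a small helper. This sketch relies only on the attributes the example itself uses (`content`, `type`, `text`); the function name `collect_text` is hypothetical and is not part of the MCP SDK. The stand-in result object at the end exists only so the snippet runs without a server.

```python
from types import SimpleNamespace


def collect_text(result):
    """Join the text of every content item whose type is 'text'."""
    return "\n".join(
        item.text
        for item in result.content
        if getattr(item, "type", None) == "text"
    )


# Stand-in for a tool result, shaped like the object used in Example 1.
fake_result = SimpleNamespace(
    content=[
        SimpleNamespace(type="text", text="default"),
        SimpleNamespace(type="image", data="..."),
        SimpleNamespace(type="text", text="system"),
    ]
)
print(collect_text(fake_result))
```

Inside a session, the same helper could be applied as `collect_text(await session.call_tool("list_databases", {}))`.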

Example 2: List the tables in a database

# Call list_tables (inside the ClientSession block from Example 1)
result = await session.call_tool("list_tables", {"database": "default"})

Example 3: Run a SELECT query

# Call run_select_query (inside the ClientSession block from Example 1)
sql = "SELECT * FROM default.conversation_feedback LIMIT 10"
result = await session.call_tool("run_select_query", {"query": sql})

Example 4: Query the top 10 largest tables

# Rank tables by total size of their active parts
# (inside the ClientSession block from Example 1)
sql = """
    SELECT 
        database,
        table,
        sum(bytes) as total_bytes,
        sum(rows) as total_rows
    FROM system.parts 
    WHERE active = 1
    GROUP BY database, table
    ORDER BY total_bytes DESC
    LIMIT 10
"""
result = await session.call_tool("run_select_query", {"query": sql})
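
The `total_bytes` column in the query above is a raw byte count, which is hard to read for large tables. The helper below converts a byte count to binary units for display. It is a hypothetical presentation aid and says nothing about how the tool formats its own output.

```python
def human_bytes(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit this helper knows about.
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


print(human_bytes(1536))       # 1.5 KiB
print(human_bytes(1024 ** 3))  # 1.0 GiB
```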

Last updated: 2026-03-12

Security recommendations
This skill appears to implement what it says (running a local ByteHouse MCP Server and calling its tools), but you should be cautious before running it in a production or privileged environment. Recommended actions before installing or executing:

  • Inspect and verify the remote repository referenced (https://github.com/volcengine/mcp-server) so you know what code will be fetched and executed.
  • Confirm which binary will be used on your system: the docs reference both 'uv' and '/root/.local/bin/uvx'. Adjust scripts to point to a trusted runtime binary, or install uv/uvx from a trusted source.
  • Avoid running as root. Run inside an isolated environment (container, VM) or as a restricted user to limit the blast radius.
  • Limit environment exposure: do not run these scripts in an environment containing unrelated secrets. Consider clearing or whitelisting environment variables before launching the MCP Server so that only ByteHouse credentials are forwarded.
  • If you need to keep a persistent service, verify the start/stop scripts and the log/PID locations; consider running it under a process supervisor you control.
  • If you are not comfortable auditing the remote code or controlling environment leakage, consider asking for an official packaged release or a version pinned to a known-good commit instead of fetching 'main'.
Functional analysis
Type: OpenClaw Skill
Name: byted-bytehouse-mcp
Version: 1.0.0

The skill bundle provides a legitimate Model Context Protocol (MCP) interface for interacting with ByteHouse, a cloud data warehouse by Volcengine. The included shell scripts (start_mcp_service.sh, stop_mcp_service.sh) and Python modules (mcp_client.py, test_mcp_server.py) are designed to manage a local MCP server instance and execute database queries. While the system uses 'uvx' to fetch and execute code from a remote GitHub repository (volcengine/mcp-server), this is a standard deployment pattern for MCP servers and aligns with the stated purpose. No evidence of data exfiltration, credential theft, or malicious prompt injection was found.
Capability assessment
Purpose & Capability
The skill's files (mcp_client.py, start/stop scripts, analyzer) align with the stated purpose of running an MCP server and querying ByteHouse. However there are minor inconsistencies: SKILL.md / README mention 'uv' installed at /root/.local/bin/uv while scripts and Python code invoke '/root/.local/bin/uvx' (uvx). The registry metadata did not declare required binaries even though the runtime assumes specific binaries at absolute paths.
Instruction Scope
Runtime instructions start a background MCP server that is installed/started by invoking a git+ URL (via uvx) and then call arbitrary MCP tools. The Python code passes os.environ.copy() into the launched MCP Server process, which will forward all environment variables (not only ByteHouse credentials) to that external code. That behavior can leak unrelated secrets in the agent environment to the external server process. The instructions also assume the ability to run background processes and write PID/log files in the skill folder.
Install Mechanism
There is no formal install spec, but the runtime relies on executing a command that fetches and installs code from a GitHub repo (git+https://github.com/volcengine/mcp-server@main#subdirectory=...). Fetching and executing remote code at runtime is expected for this use-case but increases risk. The reliance on an absolute binary path (/root/.local/bin/uvx) is brittle and mismatches other documentation references to 'uv'.
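
Both concerns raised here, fetching `main` and hard-coding an absolute binary path, can be softened when building the launch parameters. The sketch below builds the same `uvx` argument list as the examples but with a configurable Git ref, and looks up `uvx` on `PATH` before falling back to the documented location. The helper names are hypothetical, and `<commit-sha>` is a placeholder for a commit you have reviewed, not a real revision.

```python
import shutil

# Repository URL, subdirectory, and entry point are copied from the examples
# in this document.
REPO_URL = "https://github.com/volcengine/mcp-server"
SUBDIRECTORY = "server/mcp_server_bytehouse"
DOCUMENTED_UVX = "/root/.local/bin/uvx"


def build_uvx_args(ref="main"):
    """Build the uvx arguments, installing from the given Git ref (branch, tag, or commit)."""
    source = f"git+{REPO_URL}@{ref}#subdirectory={SUBDIRECTORY}"
    return ["--from", source, "mcp_bytehouse", "-t", "stdio"]


def resolve_uvx():
    """Prefer uvx found on PATH; fall back to the path used in the documentation."""
    return shutil.which("uvx") or DOCUMENTED_UVX


if __name__ == "__main__":
    # "<commit-sha>" is a placeholder; substitute a commit you have audited.
    print(resolve_uvx(), " ".join(build_uvx_args("<commit-sha>")))
```

The returned values slot into `StdioServerParameters(command=resolve_uvx(), args=build_uvx_args(...), env=...)` in place of the literals used in the examples.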
Credentials
The skill legitimately needs ByteHouse connection info (host/user/password), and the SKILL.md asks the user to set ByteHouse-related env vars. But the code sets the child process env to a copy of the entire os.environ, which will expose any other environment secrets (cloud credentials, tokens, API keys) to the MCP Server process. The skill metadata did not declare these potential exposures.
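
One way to reduce the exposure described above is to build the child process environment from an allowlist instead of copying `os.environ` wholesale. The sketch below keeps only variables with the `BYTEHOUSE_` prefix plus a small passthrough set. The helper `minimal_env` is hypothetical, and passing `PATH` and `HOME` through is an assumption about what `uvx` needs on a given system.

```python
import os

BYTEHOUSE_PREFIX = "BYTEHOUSE_"
# Assumption: uvx needs PATH to locate binaries and HOME for its cache.
# Adjust this tuple for your system.
PASSTHROUGH = ("PATH", "HOME")


def minimal_env(environ=None):
    """Return only the ByteHouse variables plus the passthrough set."""
    environ = os.environ if environ is None else environ
    return {
        name: value
        for name, value in environ.items()
        if name.startswith(BYTEHOUSE_PREFIX) or name in PASSTHROUGH
    }


if __name__ == "__main__":
    # Print names only, so no secret values end up in the terminal.
    print(sorted(minimal_env()))
```

The result can be passed as `env=minimal_env()` wherever the examples pass a copy of the full environment.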
Persistence & Privilege
The skill starts a persistent background service (writes PID and logs) and provides start/stop/status scripts; this is consistent with running a local server. always:false and normal autonomy settings are used. Starting a background process is expected, but you should be aware it will run arbitrary remote-installed code until stopped.
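How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install byted-bytehouse-mcp
  3. After installation, invoke the Skill by name or trigger it with /byted-bytehouse-mcp
  4. Provide the inputs described in the Skill's parameter documentation to get structured output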
Version history
v1.0.0
byted-bytehouse-mcp v1.0.0: Initial Release

  • Provides tools for local setup and management of ByteHouse MCP Server, supporting connection, query, and data asset analysis for ByteHouse.
  • Includes new support for automated data asset catalog and lineage analysis via `data_asset_analyzer.py`.
  • Offers tools to list databases and tables, run SELECT/DML/DDL queries, and retrieve ByteHouse table engine docs.
  • Supplies bash scripts for starting, stopping, checking status, and restarting the MCP Server.
  • Example scripts provided for common database operations and MCP client usage.
  • Requires pre-configured environment variables and Python 3.8+.
Metadata
Slug: byted-bytehouse-mcp
Version: 1.0.0
License: MIT-0
Total installs: 0
Current installs: 0
Versions: 1
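FAQ

What is Byted Bytehouse Mcp?

A skill that starts the ByteHouse MCP Server locally and calls its tools. It is used to connect to a ByteHouse database and query data, to interact with ByteHouse over the MCP protocol, and to generate data asset catalogs and lineage analysis. When the user needs to connect to a ByteHouse database and query data, interact with ByteHouse over the MCP protocol, or generate a data asset catalog and lineage analysis, use this Sk... It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 96 times so far.

How do I install Byted Bytehouse Mcp?

Run the command "/install byted-bytehouse-mcp" in the OpenClaw or Claude Code chat box to install it in one step. Installation itself needs no extra configuration, but the ByteHouse connection environment variables described in the usage instructions must be set before use.

Is Byted Bytehouse Mcp free?

Yes. Byted Bytehouse Mcp is free, is released under the MIT-0 license, and can be downloaded, installed, and used freely.

Which platforms does Byted Bytehouse Mcp support?

Byted Bytehouse Mcp is listed as cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Byted Bytehouse Mcp?

It is developed and maintained by volcengine-skills (@volcengine-skills). The current version is v1.0.0.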
