tenequm

lance-format

Author: Misha Kolesnik · GitHub · v0.1.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 88
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install lance-format
Description
Reference for Lance v7 - the open columnar lakehouse format for multimodal AI - and its Rust crate workspace (`lance`, `lance-table`, `lance-file`, `lance-en...
Usage instructions (SKILL.md)

Lance v7 reference

Lance is an open columnar format for multimodal AI - "a columnar data format that is 100x faster than Parquet for random access." It is not one format but a stack of interoperating specs: a file format, a table format, index formats, catalog specs, and a namespace client spec. The Rust workspace at lance-format/lance implements all of them plus Python (pylance) and Java bindings.

This skill tracks v7.0.0-beta.16 (the lance-format/lance git tag). Pin against tags, not main - Lance ships beta tags every few days and next-format encodings can change.

The deep reference is references/lance-reference.md. Load it for any concrete schema, parameter, proto, or constraint. This file is the orientation: read it first, then jump into the reference section you need.

Lance vs LanceDB

These are two different things and conflating them produces wrong answers.

  • Lance - the format and engine. The lance-format/lance repo; the lance / lance-* Rust crates; pylance. It gives you datasets, the file/table format, indexes, commits, scans. Consumed directly by DuckDB, Polars, Ray, Spark, PyTorch, DataFusion, or your own Rust/Python code. This skill is about Lance.
  • LanceDB - a separate database product (lancedb/lancedb) built on top of Lance. It adds a query-builder API, an embedding registry, rerankers-as-API, multi-language SDK parity, and managed Cloud / Enterprise tiers. Not covered here.

If you are linking the lance crate in Cargo.toml, you are using Lance directly - use this skill. If a question is about LanceDB internals, the storage layer underneath it is still Lance, so this skill remains the authority for the format itself.

The crate workspace

The workspace has 23 crate directories under rust/. lance is the public entry point; the rest are layers beneath it. The full table, with descriptions and citations, is in references/lance-reference.md section 2.

| Crate | Role |
| --- | --- |
| lance | Public entry point - Dataset, scanner, indexes, commits |
| lance-table | Table format - manifest, feature flags, commit handlers, row IDs |
| lance-file | File format - file reader/writer |
| lance-encoding | Structural encodings, compression (internal, not for external use) |
| lance-index | Scalar / vector / FTS / system indexes |
| lance-io | Object store, I/O schedulers |
| lance-core | Shared Error/Result, cache, datatypes |
| lance-datafusion | DataFusion glue (exec, expr, planner, UDFs) |
| lance-linalg | SIMD L2 / dot / cosine / hamming kernels |
| lance-tokenizer | FTS tokenizer stack (simple, ngram, jieba, lindera, stemmers) |
| lance-geo | Geospatial UDFs (feature-gated geo) |
| lance-namespace / -impls / -datafusion | Namespace trait, Directory/REST impls, DataFusion catalog bridge |
| lance-arrow, lance-core, lance-tools, fsst, lance-bitpacking, ... | Arrow extensions, CLI, compression sub-crates |

All share version = "7.0.0-beta.16" except lance-arrow-scalar (pinned 58.0.0, tracks Arrow) and lance-namespace-datafusion (pinned 7.0.0-beta.9). Workspace: edition 2024, rust-version = 1.91.0, resolver = "3".
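When re-verifying the workspace, the version rule above can be checked mechanically. The following is a minimal sketch, not part of Lance: the helper names and the `observed` mapping are hypothetical, and it assumes you have already collected each crate's version from its Cargo.toml by some other means.

```python
# Versions stated in this skill for the v7.0.0-beta.16 tag.
WORKSPACE_VERSION = "7.0.0-beta.16"

# Crates that intentionally diverge from the shared workspace version.
PINNED_EXCEPTIONS = {
    "lance-arrow-scalar": "58.0.0",                # tracks Arrow
    "lance-namespace-datafusion": "7.0.0-beta.9",
}


def expected_version(crate: str) -> str:
    """Return the version this skill expects for a given crate."""
    return PINNED_EXCEPTIONS.get(crate, WORKSPACE_VERSION)


def find_mismatches(observed: dict) -> dict:
    """Map crate -> (observed, expected) for every crate that disagrees."""
    return {
        crate: (version, expected_version(crate))
        for crate, version in observed.items()
        if version != expected_version(crate)
    }


# Hypothetical input: versions gathered from each crate's Cargo.toml.
observed = {
    "lance": "7.0.0-beta.16",
    "lance-arrow-scalar": "58.0.0",
    "lance-namespace-datafusion": "7.0.0-beta.16",
}
mismatches = find_mismatches(observed)
print(mismatches)
```

An empty result means the workspace still matches what this skill documents; any entry is a prompt to update the text above.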

File format versions

The file format carries a single major.minor version. It is selected per dataset at creation via data_storage_version and is fixed once the dataset exists; to change it, rewrite the dataset.

| Version | Status | Notes |
| --- | --- | --- |
| 0.1 (legacy) | read-only | Original format; no longer writable |
| 2.0 | stable | Removed row groups; null support for lists/FSL/primitives |
| 2.1 | current default (stable) | Adaptive structural encodings; better integer/string compression; nulls in struct fields; better nested random access. Default since Lance 5.0.0 |
| 2.2 | next (unstable) | Map type, Blob v2, VariablePackedStruct, larger mini-blocks. Required for Map and Blob v2; encodings may still change |

stable and next are aliases resolved by the running Lance release - pin an explicit number for deterministic behavior.
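One way to follow the "pin an explicit number" advice is to resolve the alias yourself before passing a value as data_storage_version. The sketch below is illustrative only: the function names and feature labels are hypothetical, and the alias targets (stable as 2.1, next as 2.2) are read from the table above for this release and may differ in other Lance releases.

```python
# Alias targets as read from the version table for this release (an
# assumption: other Lance releases may resolve them differently).
ALIASES = {"stable": "2.1", "next": "2.2"}
WRITABLE_VERSIONS = {"2.0", "2.1", "2.2"}  # 0.1 (legacy) is read-only

# Hypothetical feature labels; Map and Blob v2 are the 2.2-only features.
FEATURES_REQUIRING_2_2 = {"map", "blob_v2"}


def resolve_storage_version(requested: str) -> str:
    """Replace an alias with an explicit number; reject unwritable versions."""
    explicit = ALIASES.get(requested, requested)
    if explicit not in WRITABLE_VERSIONS:
        raise ValueError(f"cannot create a dataset with file format {requested!r}")
    return explicit


def pick_storage_version(features: set) -> str:
    """Use the 2.1 default unless a 2.2-only feature is requested."""
    return "2.2" if features & FEATURES_REQUIRING_2_2 else "2.1"


print(resolve_storage_version("stable"))
print(pick_storage_version({"map"}))
print(pick_storage_version(set()))
```

Because the version is fixed once the dataset exists, it is worth making this decision explicitly at creation time rather than relying on whatever the alias resolves to later.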

What's new in v7

The v6 -> v7 boundary is one breaking change: feat!: make dataset object store access base-aware (#6647) - object-store access is now scoped to a dataset base rather than a flat global path, which underpins multi-base storage (hot/cold tiering, shallow clones).

The dominant theme across the v7 betas is MemWAL - an experimental LSM / write-ahead-log architecture for high-throughput streaming writes (WAL appender/tailer primitives, shard writers, a Lance-native in-memory HNSW index, the shared-memory:// object-store scheme). Also landing in the v7 era: branches (Git-like, alongside tags), segmented and distributed index builds (FTS, bitmap, btree), newer scalar indexes (zonemap, bloom filter, ngram), the geo / RTree index and lance-geo crate, manifest version hints for fast latest-version lookup, and a formal split of the catalog / namespace / table / index specifications. Details in references/lance-reference.md section 14.

Navigating the reference

references/lance-reference.md is the full v7 reference, regrounded against the v7.0.0-beta.16 source. Load the section for your task:

  1. What Lance is - the lakehouse spec stack
  2. Crate workspace - all 23 crates, what each does, the public entry point
  3. File format - versions, container layout, structural encoding (mini-block / full-zip / constant / blob page types), compression schemes, blob encoding
  4. Data types - Arrow type coverage, FixedSizeList for vectors, JSON (JSONB), blob, ML extension arrays (bfloat16, image types)
  5. Table format - dataset directory layout, manifest contents, fragments, deletion files, base paths
  6. Schema evolution - field IDs, zero-copy column add/drop/alter, why old rows read NULL
  7. Versioning, tags, branches - manifest versions, time travel, tag pinning, branches
  8. Row IDs - row address vs stable row ID, lineage, change-data-feed columns
  9. Transactions and concurrency - the 15 transaction ops, OCC retry/rebase, commit handlers (conditional-put, DynamoDB), conflict resolution matrix
  10. MemWAL - shards, MemTable/WAL/flush, the appender/tailer/flusher model, fencing
  11. Indexes - vector (IVF/HNSW/PQ/SQ/RQ), scalar (btree/bitmap/bloom/labellist/ngram/zonemap), full-text (BM25, tokenizers), geo/RTree
  12. Distributed write and indexing - two-phase commits, segmented index builds
  13. Object store - URI schemes, storage options, per-backend config
  14. What changed in v7 - the full v7 delta
  15. Capability matrix - what Lance can and cannot do
  16. Source map - where each spec and proto lives in the repo

Maintenance

Citations in references/lance-reference.md are path:line relative to the lance-format/lance repo; build a permalink as https://github.com/lance-format/lance/blob/v7.0.0-beta.16/<path>.
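The permalink rule above is simple enough to script. This is a small sketch under stated assumptions: the `permalink` helper is hypothetical, the example path is illustrative rather than a real citation, and the `#L<line>` suffix is GitHub's standard line anchor rather than something the reference file itself specifies.

```python
REPO = "https://github.com/lance-format/lance"
TAG = "v7.0.0-beta.16"


def permalink(citation: str, tag: str = TAG) -> str:
    """Turn a 'path:line' (or bare 'path') citation into a GitHub blob URL."""
    path, sep, line = citation.rpartition(":")
    if not sep or not line.isdigit():
        # No trailing ':<digits>', so the whole string is the path.
        return f"{REPO}/blob/{tag}/{citation}"
    # Append GitHub's line anchor so the link lands on the cited line.
    return f"{REPO}/blob/{tag}/{path}#L{line}"


# Illustrative path only; substitute a real citation from the reference.
print(permalink("path/to/file.rs:42"))
print(permalink("path/to/file.rs"))
```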

To refresh: git -C ~/pjv/lance-format/lance fetch --tags, check out the newest v7* tag, re-read the format spec under docs/src/format/ and the user guide under docs/src/guide/, re-verify the crate workspace, and bump metadata.upstream plus every v7.0.0-beta.16 reference. Line numbers in citations drift between tags - treat them as approximate.
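Two steps of the refresh are easy to get wrong by hand: picking the newest tag (a plain string sort puts beta.9 after beta.16) and bumping every reference without touching longer tag names. The sketch below shows one way to do both. The helper names are hypothetical, and it assumes tags have the shape vMAJOR.MINOR.PATCH with an optional -beta.N suffix, as in the tag this skill tracks; other tag shapes are ignored.

```python
import re

# Assumed tag shape: vMAJOR.MINOR.PATCH with an optional -beta.N suffix.
TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-beta\.(\d+))?$")


def tag_key(tag: str):
    """Return a sortable key for a tag, or None if it does not match."""
    m = TAG_RE.match(tag)
    if m is None:
        return None
    major, minor, patch, beta = m.groups()
    # A tag without a beta suffix is a final release and sorts after all betas.
    beta_rank = float("inf") if beta is None else int(beta)
    return (int(major), int(minor), int(patch), beta_rank)


def newest_v7_tag(tags):
    """Pick the numerically newest v7 tag, or None if there is none."""
    keyed = [(tag_key(t), t) for t in tags]
    keyed = [(k, t) for k, t in keyed if k is not None and k[0] == 7]
    return max(keyed)[1] if keyed else None


def bump_references(text: str, old: str, new: str) -> str:
    """Replace every occurrence of the old tag with the new one."""
    # The lookahead keeps 'v7.0.0-beta.1' from matching inside 'v7.0.0-beta.16'.
    pattern = re.compile(re.escape(old) + r"(?!\d)")
    return pattern.sub(lambda _: new, text)


tags = ["v6.0.0", "v7.0.0-beta.9", "v7.0.0-beta.16", "v7.0.0-beta.2", "main"]
print(newest_v7_tag(tags))
print(bump_references("tracks v7.0.0-beta.16", "v7.0.0-beta.16", "v7.0.0-beta.17"))
```

The tag list would typically come from `git tag --list 'v7*'` in the checked-out repo; the version number v7.0.0-beta.17 above is only a placeholder for whatever tag is actually newest.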

Security usage advice
Treat this as an incomplete low-confidence review because the workspace artifact files could not be inspected; review SKILL.md, metadata, install steps, and bundled files before installing.
Capability tags
crypto, can-make-purchases
Capability assessment
Purpose & Capability
No SKILL.md or artifact content was accessible to substantiate a purpose or capability concern.
Instruction Scope
No artifact-backed instruction-scope issue was available for review.
Install Mechanism
No install artifact content was accessible to substantiate an install-mechanism concern.
Credentials
No artifact-backed evidence showed disproportionate environment access.
Persistence & Privilege
No artifact-backed evidence showed persistence or privilege abuse.
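How to use
  1. Make sure OpenClaw is installed (local or Docker deployment)
  2. Enter the install command in the chat box: /install lance-format
  3. Once installed, invoke the Skill by name or trigger it with /lance-format
  4. Provide the required inputs as described in the Skill's parameter documentation to get structured output
Version history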
v0.1.0
Initial publish of lance-format 0.1.0. Changes: - added `LICENSE.txt` - added `SKILL.md` - added `references/lance-reference.md`
Metadata
Slug lance-format
Version 0.1.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
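FAQ

What is lance-format?

Reference for Lance v7 - the open columnar lakehouse format for multimodal AI - and its Rust crate workspace (`lance`, `lance-table`, `lance-file`, `lance-en... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 88 total downloads so far.

How do I install lance-format?

Run the command "/install lance-format" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.

Is lance-format free?

Yes. lance-format is completely free under the MIT-0 license and can be freely downloaded, installed, and used.

Which platforms does lance-format support?

lance-format is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who developed lance-format?

It is developed and maintained by Misha Kolesnik (@tenequm); the current version is v0.1.0.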
