← Back to Skills Marketplace
simonpierreboucher02

hugging-face-api

by Simon-Pierrre Boucher · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
52
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install hugging-face-api
Description
Search and discover Hugging Face open-source models and datasets, then run OpenAI-compatible chat or embedding inference securely with cost control.
README (SKILL.md)

Hugging Face Agent Skill

A playbook for agents that use the Hugging Face MCP server. Follow these steps in order. Discover for free first; run billed inference only against confirmed-supported models.


1. Name

Hugging Face — open-source model and dataset discovery plus OpenAI-compatible inference (chat and embeddings) across inference providers, via 7 MCP tools.

2. Purpose

Use this skill to find open-source models and datasets on the Hugging Face Hub, confirm which models are runnable through the Inference router, and run chat completions and embeddings — while controlling cost, respecting licenses, and keeping the access token secret.

3. When to use Hugging Face

Use it when the task involves:

  • Open-source models (Llama, Qwen, Mistral, BGE, sentence-transformers, etc.).
  • Model or dataset discovery — search/inspect the Hub catalog.
  • OpenAI-compatible inference across providers — one interface, many providers.
  • Embeddings — vectors for semantic search, RAG, clustering.

4. When NOT to use it

  • If you need a specific closed/proprietary model (e.g. a vendor's flagship), call that vendor's provider directly.
  • If the task needs no model at all (pure local computation), skip inference.
  • If a cheaper or already-integrated tool already solves the task, use it.

5. Environment

Set one secret:

Variable Required Notes
HF_TOKEN Yes hf_.... Get it at https://huggingface.co/settings/tokens. Never expose it.

Optional: HF_HUB_BASE_URL, HF_ROUTER_BASE_URL, HF_TIMEOUT_MS, HF_MAX_RETRIES, LOG_LEVEL.

6. Operations (the 7 tools)

Tool Use it to Cost
hf_search_models Search Hub models Free
hf_model_info Inspect one model (license, task) Free
hf_search_datasets Search Hub datasets Free
hf_list_inference_models List models runnable via router Free
hf_chat OpenAI-style chat completion Billed
hf_embeddings Embedding vectors Billed
hf_request Reach any other Hub/router endpoint Depends

7. Discovery workflow (FREE)

Do this first; it costs nothing.

  1. hf_search_models — find candidates by task/author/popularity.
  2. hf_model_info — check pipeline_tag and cardData.license.
  3. hf_search_datasets — find data if needed.
  4. hf_list_inference_models — confirm the chosen model is actually runnable.

8. Inference workflow (BILLED)

  1. Choose a model that appears in hf_list_inference_models.
  2. For chat: call hf_chat with OpenAI-style messages and a bounded max_tokens.
  3. For vectors: call hf_embeddings with a batch of inputs (default model sentence-transformers/all-MiniLM-L6-v2).
  4. Report the model id and the returned usage.

9. Cost control

  • Hub discovery is free — use it liberally.
  • Inference is billed per provider — always:
    • Set max_tokens on hf_chat.
    • Prefer smaller models when quality allows.
    • Batch embeddings (array inputs) instead of per-item calls.
    • Cache embeddings and deterministic completions.

10. Error handling

Error Reaction
model_not_supported (402/403) Call hf_list_inference_models, pick a listed model, retry.
401 invalid token Stop. Fix HF_TOKEN. Do not retry blindly.
402 credits Stop. Add credits or use a cheaper/free model.
429 rate limit Back off (server retries); slow down, batch, cache.

11. Security

  • Never print, log, or echo the hf_ token. The server redacts it; do not undo that.
  • Use a least-privilege token (read for discovery; inference only where needed).
  • Use placeholders (your_hf_token) in any shared config.

12. Reproducibility / model pinning

  • Use exact model ids (and a revision/commit if available) so runs are repeatable.
  • Use the same embedding model for indexing and querying in RAG.

13. Licensing

  • Before downstream use, check the model card's license (hf_model_infocardData.license).
  • Respect usage restrictions (commercial use, redistribution, gated access).

14. Agent checklist

  • Confirmed Hugging Face is the right tool (open-source / discovery / embeddings).
  • Discovered model via hf_search_models / hf_model_info (free).
  • Confirmed it is runnable via hf_list_inference_models.
  • Checked the license.
  • Set max_tokens (chat) / batched inputs (embeddings).
  • Did not expose the token.
  • Cited the exact model id and reported usage.

15. Example workflows

  • Find a model → run chat: hf_search_modelshf_model_infohf_list_inference_modelshf_chat. See recipes/find-and-run-model.md.
  • Build embeddings for RAG: hf_embeddings (batch) → store → query. See recipes/build-embeddings.md.
  • Dataset lookup: hf_search_datasetshf_request for details. See recipes/dataset-discovery.md.

16. Common mistakes

  • Calling hf_chat before confirming the model is supported (causes model_not_supported).
  • One embedding call per item instead of a batch (slow and costly).
  • Skipping the license check.
  • Exposing the token in logs or output.
  • Omitting max_tokens, leading to runaway generation cost.

17. Maintenance

  • The runnable model list changes — re-run hf_list_inference_models rather than hardcoding ids.
  • Re-check licenses when adopting a new model.
  • Rotate HF_TOKEN periodically.
  • Confirm endpoint/provider details against https://huggingface.co/docs when behavior changes.
Usage Guidance
Install only if you intend your agent to use Hugging Face services. Use a least-privilege HF token, expect prompts/document chunks sent to Hugging Face inference providers for chat or embeddings, avoid confidential or regulated data unless approved for that provider, and set spending limits or require confirmation before billed calls.
Capability Tags
requires-oauth-tokenrequires-sensitive-credentials
Capability Assessment
Purpose & Capability
The stated purpose matches the artifact contents: use Hugging Face MCP tools to search models/datasets, confirm runnable models, and run chat or embeddings with cost and license checks.
Instruction Scope
The recipes call external Hugging Face chat and embedding tools and one recipe has a broad open-ended-use trigger, but the main skill also tells agents to use Hugging Face only for open-source/discovery/embedding tasks and to skip unnecessary inference.
Install Mechanism
The package contains markdown playbooks and references only; no executable scripts, package install hooks, or automatic runtime behavior were found.
Credentials
Requiring HF_TOKEN and using billed router endpoints is proportionate to Hugging Face inference, and the skill repeatedly warns not to expose the token and to use least privilege.
Persistence & Privilege
The RAG recipe suggests storing embeddings and source metadata in a vector store, which is expected for embeddings workflows but should be applied only to data the user is allowed to store and send to Hugging Face.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install hugging-face-api
  3. After installation, invoke the skill by name or use /hugging-face-api
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of the Hugging Face Skill. - Discover and inspect open-source models and datasets on the Hugging Face Hub for free. - Confirm model support, run chat completions, and generate embedding vectors via inference router (billed operations). - Provides step-by-step workflows for safe, cost-controlled, and reproducible agent usage. - Enforces strong security and licensing checks, including secret management and model license awareness. - Supports 7 distinct operations via modular tools for search, inspection, inference, and API access. - Includes an agent checklist, troubleshooting guides, and maintenance recommendations.
Metadata
Slug hugging-face-api
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is hugging-face-api?

Search and discover Hugging Face open-source models and datasets, then run OpenAI-compatible chat or embedding inference securely with cost control. It is an AI Agent Skill for Claude Code / OpenClaw, with 52 downloads so far.

How do I install hugging-face-api?

Run "/install hugging-face-api" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is hugging-face-api free?

Yes, hugging-face-api is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does hugging-face-api support?

hugging-face-api is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created hugging-face-api?

It is built and maintained by Simon-Pierrre Boucher (@simonpierreboucher02); the current version is v1.0.0.

💬 Comments