← Back to Skills Marketplace

hugging-face-api

Name: hugging-face-api
Author: simonpierreboucher02

by Simon-Pierrre Boucher · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Downloads

Stars

Active Installs

Versions

Install in OpenClaw

/install hugging-face-api

Description

Search and discover Hugging Face open-source models and datasets, then run OpenAI-compatible chat or embedding inference securely with cost control.

README (SKILL.md)

Hugging Face Agent Skill

A playbook for agents that use the Hugging Face MCP server. Follow these steps in order. Discover for free first; run billed inference only against confirmed-supported models.

1. Name

Hugging Face — open-source model and dataset discovery plus OpenAI-compatible inference (chat and embeddings) across inference providers, via 7 MCP tools.

2. Purpose

Use this skill to find open-source models and datasets on the Hugging Face Hub, confirm which models are runnable through the Inference router, and run chat completions and embeddings — while controlling cost, respecting licenses, and keeping the access token secret.

3. When to use Hugging Face

Use it when the task involves:

Open-source models (Llama, Qwen, Mistral, BGE, sentence-transformers, etc.).
Model or dataset discovery — search/inspect the Hub catalog.
OpenAI-compatible inference across providers — one interface, many providers.
Embeddings — vectors for semantic search, RAG, clustering.

4. When NOT to use it

If you need a specific closed/proprietary model (e.g. a vendor's flagship), call that vendor's provider directly.
If the task needs no model at all (pure local computation), skip inference.
If a cheaper or already-integrated tool already solves the task, use it.

5. Environment

Set one secret:

Variable	Required	Notes
`HF_TOKEN`	Yes	`hf_...`. Get it at https://huggingface.co/settings/tokens. Never expose it.

Optional: HF_HUB_BASE_URL, HF_ROUTER_BASE_URL, HF_TIMEOUT_MS, HF_MAX_RETRIES, LOG_LEVEL.

6. Operations (the 7 tools)

Tool	Use it to	Cost
`hf_search_models`	Search Hub models	Free
`hf_model_info`	Inspect one model (license, task)	Free
`hf_search_datasets`	Search Hub datasets	Free
`hf_list_inference_models`	List models runnable via router	Free
`hf_chat`	OpenAI-style chat completion	Billed
`hf_embeddings`	Embedding vectors	Billed
`hf_request`	Reach any other Hub/router endpoint	Depends

7. Discovery workflow (FREE)

Do this first; it costs nothing.

hf_search_models — find candidates by task/author/popularity.
hf_model_info — check pipeline_tag and cardData.license.
hf_search_datasets — find data if needed.
hf_list_inference_models — confirm the chosen model is actually runnable.

8. Inference workflow (BILLED)

Choose a model that appears in hf_list_inference_models.
For chat: call hf_chat with OpenAI-style messages and a bounded max_tokens.
For vectors: call hf_embeddings with a batch of inputs (default model sentence-transformers/all-MiniLM-L6-v2).
Report the model id and the returned usage.

9. Cost control

Hub discovery is free — use it liberally.
Inference is billed per provider — always:
- Set max_tokens on hf_chat.
- Prefer smaller models when quality allows.
- Batch embeddings (array inputs) instead of per-item calls.
- Cache embeddings and deterministic completions.

10. Error handling

Error	Reaction
`model_not_supported` (402/403)	Call `hf_list_inference_models`, pick a listed model, retry.
`401` invalid token	Stop. Fix `HF_TOKEN`. Do not retry blindly.
`402` credits	Stop. Add credits or use a cheaper/free model.
`429` rate limit	Back off (server retries); slow down, batch, cache.

11. Security

Never print, log, or echo the hf_ token. The server redacts it; do not undo that.
Use a least-privilege token (read for discovery; inference only where needed).
Use placeholders (your_hf_token) in any shared config.

12. Reproducibility / model pinning

Use exact model ids (and a revision/commit if available) so runs are repeatable.
Use the same embedding model for indexing and querying in RAG.

13. Licensing

Before downstream use, check the model card's license (hf_model_info → cardData.license).
Respect usage restrictions (commercial use, redistribution, gated access).

14. Agent checklist

Confirmed Hugging Face is the right tool (open-source / discovery / embeddings).
Discovered model via hf_search_models / hf_model_info (free).
Confirmed it is runnable via hf_list_inference_models.
Checked the license.
Set max_tokens (chat) / batched inputs (embeddings).
Did not expose the token.
Cited the exact model id and reported usage.

15. Example workflows

Find a model → run chat: hf_search_models → hf_model_info → hf_list_inference_models → hf_chat. See recipes/find-and-run-model.md.
Build embeddings for RAG: hf_embeddings (batch) → store → query. See recipes/build-embeddings.md.
Dataset lookup: hf_search_datasets → hf_request for details. See recipes/dataset-discovery.md.

16. Common mistakes

Calling hf_chat before confirming the model is supported (causes model_not_supported).
One embedding call per item instead of a batch (slow and costly).
Skipping the license check.
Exposing the token in logs or output.
Omitting max_tokens, leading to runaway generation cost.

17. Maintenance

The runnable model list changes — re-run hf_list_inference_models rather than hardcoding ids.
Re-check licenses when adopting a new model.
Rotate HF_TOKEN periodically.
Confirm endpoint/provider details against https://huggingface.co/docs when behavior changes.

Usage Guidance

Install only if you intend your agent to use Hugging Face services. Use a least-privilege HF token, expect prompts/document chunks sent to Hugging Face inference providers for chat or embeddings, avoid confidential or regulated data unless approved for that provider, and set spending limits or require confirmation before billed calls.

Capability Tags

requires-oauth-tokenrequires-sensitive-credentials

Capability Assessment

✓ Purpose & Capability

The stated purpose matches the artifact contents: use Hugging Face MCP tools to search models/datasets, confirm runnable models, and run chat or embeddings with cost and license checks.

ℹ Instruction Scope

The recipes call external Hugging Face chat and embedding tools and one recipe has a broad open-ended-use trigger, but the main skill also tells agents to use Hugging Face only for open-source/discovery/embedding tasks and to skip unnecessary inference.

✓ Install Mechanism

The package contains markdown playbooks and references only; no executable scripts, package install hooks, or automatic runtime behavior were found.

ℹ Credentials

Requiring HF_TOKEN and using billed router endpoints is proportionate to Hugging Face inference, and the skill repeatedly warns not to expose the token and to use least privilege.

ℹ Persistence & Privilege

The RAG recipe suggests storing embeddings and source metadata in a vector store, which is expected for embeddings workflows but should be applied only to data the user is allowed to store and send to Hugging Face.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install hugging-face-api
After installation, invoke the skill by name or use /hugging-face-api
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release of the Hugging Face Skill. - Discover and inspect open-source models and datasets on the Hugging Face Hub for free. - Confirm model support, run chat completions, and generate embedding vectors via inference router (billed operations). - Provides step-by-step workflows for safe, cost-controlled, and reproducible agent usage. - Enforces strong security and licensing checks, including secret management and model license awareness. - Supports 7 distinct operations via modular tools for search, inspection, inference, and API access. - Includes an agent checklist, troubleshooting guides, and maintenance recommendations.

Metadata

Slug hugging-face-api

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is hugging-face-api?

Search and discover Hugging Face open-source models and datasets, then run OpenAI-compatible chat or embedding inference securely with cost control. It is an AI Agent Skill for Claude Code / OpenClaw, with 52 downloads so far.

How do I install hugging-face-api?

Run "/install hugging-face-api" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is hugging-face-api free?

Yes, hugging-face-api is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does hugging-face-api support?

hugging-face-api is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created hugging-face-api?

It is built and maintained by Simon-Pierrre Boucher (@simonpierreboucher02); the current version is v1.0.0.

More Skills