Description

Community trust scores for AI agent payment endpoints — checks endpoint reputation before payment and queues anonymous failure reports locally (network repor...

README (SKILL.md)

fraud-filter

Name: Fraud Filter
Author: mattpolly

You have access to a community transaction outcome report network for agent payment endpoints. Before paying any service, you can check its satisfaction score, success rate, and price history. After a transaction, failures are queued automatically as anonymous reports on the local machine; they are submitted to the network only when participate_in_network is enabled and a flush is triggered.

Available Tools

check-endpoint.sh

Look up outcome report data for an endpoint URL. Use this before any agent payment to assess risk.

# Basic check
check-endpoint.sh https://api.stockdata.xyz/report/AAPL

# Check with price anomaly detection
check-endpoint.sh https://api.stockdata.xyz/report/AAPL --price 0.05

Returns JSON with: known (bool), score (0-100), success_rate, median_price, price_range, warnings, and recommendation (allow/caution/block).
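
As a rough illustration of how the returned JSON might be consumed, the sketch below summarizes a parsed result. The field names come from the list above; the sample values, the value types (for example `success_rate` as a fraction and `price_range` as a two-element array), and the `summarizeCheck` helper are assumptions, not the tool's documented schema.

```javascript
// Hypothetical helper: turn a parsed check-endpoint.sh result into a one-line summary.
// Field names follow the list above; value types are assumed for illustration.
function summarizeCheck(result) {
  if (!result.known) {
    // Unknown endpoints carry no data; the recommendation is still reported.
    return `unknown endpoint, recommendation: ${result.recommendation}`;
  }
  const warnings = (result.warnings || []).join("; ") || "none";
  return `score ${result.score}/100, recommendation: ${result.recommendation}, warnings: ${warnings}`;
}

// Made-up sample values, not real output.
const sample = {
  known: true,
  score: 82,
  success_rate: 0.97,
  median_price: 0.05,
  price_range: [0.03, 0.08],
  warnings: [],
  recommendation: "allow",
};
console.log(summarizeCheck(sample));
```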

report.sh

Queue an anonymous transaction outcome report. See post-transaction workflow below.

# Report a post-payment failure (paid but received nothing or bad data)
report.sh https://shady-data.xyz/api/v2 post_payment_failure 0.50

# Report with skill attribution
report.sh https://shady-data.xyz/api/v2 post_payment_failure 0.50 --skill stock-research

# Report a pre-payment failure (failed before payment completed)
report.sh https://broken.example.com/api pre_payment_failure 0.10

sync-trust-db.sh

Download the latest outcome report database from CDN. Normally runs nightly.

sync-trust-db.sh           # Download if older than 24h
sync-trust-db.sh --force   # Force re-download
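
The "download if older than 24h" rule can be sketched as a small predicate. This is only an illustration of the behavior described by the two commands above; the function name, the use of millisecond timestamps, and the treatment of a missing timestamp as "never synced" are assumptions.

```javascript
// Hypothetical staleness check mirroring "download if older than 24h".
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

function shouldDownload(lastSyncMs, nowMs, force = false) {
  // --force bypasses the age check; a missing timestamp is treated as never synced.
  if (force || lastSyncMs == null) return true;
  return nowMs - lastSyncMs > MAX_AGE_MS;
}
```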

status.sh

Show database status and sync information.

status.sh          # DB age, endpoint count, file size
status.sh --full   # Also show pending reports and config

dashboard.sh

Manage the local web dashboard for outcome data exploration.

dashboard.sh start    # Start dashboard (http://127.0.0.1:18921)
dashboard.sh stop     # Stop dashboard
dashboard.sh status   # Check if running
dashboard.sh url      # Print dashboard URL

Automatic Enforcement

fraud-filter runs as an OpenClaw plugin. Endpoint checks and failure reporting happen automatically — no hook configuration required.

Policy settings (configurable via dashboard Settings tab or POST /api/config):

| Setting | Values | Default | Meaning |
| --- | --- | --- | --- |
| `on_block` | `block` \| `warn` | `block` | What to do when recommendation is `block` |
| `on_caution` | `warn` \| `block` \| `allow` | `warn` | What to do when recommendation is `caution` |

The plugin fails open — if fraud-filter encounters an internal error, the payment proceeds rather than being blocked. Unknown endpoints always proceed.
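
The way the two settings and the fail-open rule combine can be sketched as follows. The defaults are the ones listed above; the `resolveAction` function and the shape of its `check` argument are hypothetical and are not taken from the plugin's source.

```javascript
// Defaults taken from the policy settings above.
const DEFAULT_POLICY = { on_block: "block", on_caution: "warn" };

// Hypothetical decision function: maps a recommendation to an enforcement action.
function resolveAction(check, policy = DEFAULT_POLICY) {
  try {
    if (!check.known) return "allow"; // unknown endpoints always proceed
    if (check.recommendation === "block") return policy.on_block;
    if (check.recommendation === "caution") return policy.on_caution;
    return "allow";
  } catch (err) {
    // Fail open: an internal error must not stop the payment.
    return "allow";
  }
}
```

Passing a malformed `check` (for example `null`) exercises the fail-open path and yields `"allow"`.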

When the plugin blocks a payment, your tool call will fail with an error message like:

fraud-filter: blocked payment to https://... (hotlisted — surge of recent failure reports)

When this happens:

Do not retry — the block is intentional
Tell the user exactly why: quote the endpoint and the reason from the error message
Ask for explicit confirmation if they want to override
If they confirm, use POST /api/config to set on_block: "warn" temporarily, execute the payment, then restore the setting
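
A minimal sketch of the override step, assuming the user has already confirmed. The JSON body shape sent to POST /api/config is an assumption (only the setting name and values appear above), the host and port are taken from the dashboard URL, and `postConfig` and `payWithOverride` are hypothetical names. The `finally` clause is there so the setting is restored even when the payment itself throws.

```javascript
// Hypothetical sender for POST /api/config. The JSON body shape is an assumption;
// the host and port come from the dashboard URL shown earlier.
async function postConfig(settings) {
  const res = await fetch("http://127.0.0.1:18921/api/config", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(settings),
  });
  if (!res.ok) throw new Error(`config update failed: HTTP ${res.status}`);
}

// Relax on_block for a single confirmed payment, then restore the previous value.
// `send` and `pay` are injected so the flow can be exercised without a live dashboard.
async function payWithOverride(pay, previous = "block", send = postConfig) {
  await send({ on_block: "warn" });
  try {
    return await pay();
  } finally {
    await send({ on_block: previous }); // runs even if the payment throws
  }
}
```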

Pre-Transaction Verification

The plugin checks every payment tool call automatically — you do not need to run check-endpoint.sh manually before each payment. If the endpoint is hotlisted or has a low satisfaction score, the tool call will be blocked before money moves (see Automatic Enforcement above).

Use check-endpoint.sh manually when:

The user asks you to assess a specific endpoint before committing
You want detailed data (score, success rate, price history) to inform a decision
You want to check for price anomalies on a high-cost call

check-endpoint.sh https://api.example.com/data             # full assessment
check-endpoint.sh https://api.example.com/data --price 2.50  # include price anomaly check

Unknown endpoints always return allow. No data is not a risk signal — the ecosystem is new. Never treat known: false as a reason to warn or block.

On price anomalies, check anomaly_type:

suspicious — price is high and endpoint has low satisfaction score; warn the user
market_outlier — price is high but endpoint is otherwise trusted; inform the user but proceed
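
The handling rules above can be written as a small mapping. The `anomaly_type` values and the `known` field come from this document; the `anomalyAction` function and its return labels are hypothetical.

```javascript
// Hypothetical mapping from the anomaly_type field to an agent action.
function anomalyAction(check) {
  if (!check.known) return "proceed"; // no data is not a risk signal
  switch (check.anomaly_type) {
    case "suspicious":
      return "warn"; // high price and low satisfaction score
    case "market_outlier":
      return "inform"; // high price, otherwise trusted: tell the user, then proceed
    default:
      return "proceed";
  }
}
```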

Post-Transaction Reporting

The plugin automatically detects failure outcomes and queues an anonymous report to data/pending-reports.jsonl on your local machine — no network call is made. Reports are only submitted to the network when you explicitly enable participate_in_network and trigger a flush. You do not need to run report.sh for failures the plugin can detect (empty, garbage, or error tool responses).

Use report.sh manually when the plugin couldn't have known it was a failure: the service returned something that looked valid at the protocol level but was actually wrong or useless from your perspective as the agent. This is a quality judgment only you can make.

Always include a --reason. Write it from your perspective: what you needed, what the endpoint claimed to provide, and what you actually got. One to three factual sentences.

When to queue manually:

Service returned HTTP 200 with plausible-looking data that turned out to be wrong, stale, or fabricated
Service returned less than you paid for with no error (partial fulfillment)

report.sh <url> post_payment_failure 0.05 --reason "Needed current AAPL price. Service returned HTTP 200 with an empty data array."
report.sh <url> pre_payment_failure 0 --reason "DNS resolution failed. Could not reach endpoint to initiate payment."

Queue locally without waiting for human confirmation
Notify the user: "I queued an anonymous outcome report for <hostname> (paid but received a poor result). It will be sent to the network only if you enable participate_in_network."
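
For an agent that invokes report.sh programmatically, the sketch below assembles the argument list and the user notification. The positional order and the `--reason` flag follow the examples above; `buildReportArgs` and `notifyText` are hypothetical helpers. Returning an array rather than a shell string avoids quoting problems in the reason text.

```javascript
// Outcome names as used in the report.sh examples above. Success is never reported.
const OUTCOMES = ["post_payment_failure", "pre_payment_failure"];

// Builds the argument list for report.sh: <url> <outcome> <amount> --reason "<text>".
function buildReportArgs(url, outcome, amount, reason) {
  if (!OUTCOMES.includes(outcome)) {
    throw new Error(`unsupported outcome: ${outcome}`);
  }
  if (!reason || !reason.trim()) {
    throw new Error("a --reason is required");
  }
  return [url, outcome, String(amount), "--reason", reason.trim()];
}

// Hypothetical one-line user notification; only the hostname is shown.
function notifyText(url, outcome) {
  const host = new URL(url).hostname;
  const what = outcome === "post_payment_failure"
    ? "paid but received a poor result"
    : "failed before payment completed";
  return `I queued an anonymous outcome report for ${host} (${what}). ` +
    "It will be sent to the network only if you enable participate_in_network.";
}
```

The resulting array can be passed to `execFile` from `node:child_process`, which takes the arguments as a list and does not go through a shell.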

Never report success. Absence of failure reports is the positive signal.

Reading the Data Directly

The outcome report database is a flat JSON file at data/trust.json. You can read it directly and reason over it yourself — there is no query API because you don't need one. Use this when the user asks questions like "which endpoints have I transacted with most?" or "show me everything flagged as caution" — just read the file and answer.
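
A question such as "show me everything flagged as caution" might be answered along these lines. The layout of data/trust.json is not documented in this listing, so the structure assumed here (an `endpoints` object keyed by identifier, each entry carrying `score` and `recommendation`) is purely illustrative; inspect the real file before relying on it.

```javascript
// ASSUMED layout (not documented above): { endpoints: { "<id>": { score, recommendation } } }.
function endpointsWithRecommendation(db, recommendation) {
  return Object.entries(db.endpoints || {})
    .filter(([, entry]) => entry.recommendation === recommendation)
    .map(([id, entry]) => ({ id, score: entry.score }))
    .sort((a, b) => a.score - b.score); // lowest score first
}
```

In practice `db` would come from `JSON.parse(fs.readFileSync("data/trust.json", "utf8"))`.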

When to Use

User asks whether an endpoint is trustworthy → check-endpoint.sh <url>
Price seems high → check-endpoint.sh <url> --price <amount> to detect anomalies
Service returned valid-looking data that was actually wrong or useless → report.sh <url> post_payment_failure <amount>, then notify the user
User asks about outcome data → status.sh for DB status, read data/trust.json directly for deeper questions, or dashboard.sh start for visual exploration
Trust data seems stale → sync-trust-db.sh to refresh

Important

Queue all failure reports locally. The plugin does this automatically for detectable failures; run report.sh for quality-judgment failures only you can identify.
Always notify the user when queuing. One line: what endpoint, what outcome, and that the report stays local until they enable participate_in_network.
Never report success. Absence of failure reports is the positive signal.
Never block on unknown endpoints. False blocks on legitimate services make this skill useless.

Usage Guidance

What to check and consider before installing or enabling this skill:

Confirm reporter and trust-db implementations: the README/TECHNICAL.md say that URLs are SHA-256 hashed and only hashed endpoint identifiers are ever sent, and that reasons/amounts are bucketed. But I could not fully verify the reporter.js / trust-db.js source in this listing; review those specific files to ensure no raw URLs, exact amounts, install IDs, or other identifying data are transmitted during submission.
Network downloads: by default the skill automatically downloads the trust DB and hourly hotlist from api.fraud-filter.com. This is expected for its function, but it means the remote host will see your IP and fetch timing. If you require air-gapped operation, set sync_hotlist: false and do not run sync-trust-db.sh.
Opt-in submission semantics: queued reports are stored locally by default, but a config change to participate_in_network: true plus an explicit flush will send pending reports. Decide whether to allow agents to change that config automatically; restrict agent permissions if you want human approval before enabling network submission.
False positives / overreach: the regex used to detect "payment" tools is broad and could match non-payment tool names containing substrings like "pay" or "wallet". Expect some false interceptions; test in a safe environment.
File protections: the code promises data files are created with mode 0600. Verify data/pending-reports.jsonl and data/config.json permissions after install to ensure pending reports remain local and readable only by the agent user.
If you are not able to audit reporter.js and trust-db.js yourself, run the skill in an isolated environment (VM/container) first and observe network traffic (which endpoints are contacted and what payloads are sent) before deploying it in production.
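
The file-protection check mentioned above could be scripted roughly as follows. The `isOwnerOnly` helper is hypothetical; it relies only on `fs.statSync` and compares the permission bits against 0600 on a POSIX system.

```javascript
const fs = require("node:fs");

// Returns true when the file's permission bits are exactly 0600 (owner read/write only).
function isOwnerOnly(path) {
  const mode = fs.statSync(path).mode & 0o777; // mask off the file-type bits
  return mode === 0o600;
}
```

It would be called once per file, for example `isOwnerOnly("data/pending-reports.jsonl")` and `isOwnerOnly("data/config.json")`.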

Capability Analysis

Type: OpenClaw Skill
Name: fraud-filter
Version: 0.4.0

The fraud-filter skill is a well-architected reputation system for AI agent payment endpoints that prioritizes privacy and user consent. Key security features include SHA-256 hashing of URLs to prevent leaking full endpoint histories, price bucketing to anonymize transaction values, and a strictly opt-in network reporting model (participate_in_network is false by default). The code is dependency-free, binds its local dashboard only to 127.0.0.1, and includes proactive security measures like proxy detection in server.js and failing open on internal errors to prevent service disruption.
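
To make the two anonymization ideas concrete, here is a minimal sketch using Node's built-in `crypto` module. It is not the skill's code: any URL normalization applied before hashing is unknown, and the bucket boundaries in `bucketPrice` are invented for illustration.

```javascript
const crypto = require("node:crypto");

// SHA-256 hex digest of the endpoint URL. Any normalization the real skill applies
// before hashing is unknown; this sketch hashes the string exactly as given.
function hashEndpoint(url) {
  return crypto.createHash("sha256").update(url, "utf8").digest("hex");
}

// HYPOTHETICAL bucket boundaries; the skill's real buckets are not listed here.
function bucketPrice(amount) {
  if (amount < 0.1) return "<0.10";
  if (amount < 1) return "0.10-0.99";
  if (amount < 10) return "1.00-9.99";
  return ">=10.00";
}
```

The point of bucketing is that a report reveals only the coarse range, so 0.47 and 0.50 become indistinguishable.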

Capability Tags

crypto, requires-wallet, can-make-purchases, requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

Name/description match required artifacts: the skill needs Node and registers pre- and post-payment hooks to check endpoints and queue reports. The declared remote endpoints (api.fraud-filter.com) and local files (data/*.json) are consistent with the stated purpose. Minor mismatch: package.json version (0.2.0) vs registry version (0.4.0) — likely benign but incongruent.

ℹ Instruction Scope

SKILL.md and the hook scripts confine behavior to payment-looking tool calls and to local queuing by default. The plugin also exposes API endpoints (localhost dashboard) and instructs agents on how to change policy (POST /api/config). However: the skill includes automatic enforcement hooks that can block/warn on payment tool calls (expected), and SKILL.md instructs agents how to temporarily override blocking via config changes. The payment-detection regex is broad and could match non-payment tools, causing unexpected interception/blocking.
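
The false-positive concern is easy to reproduce with a stand-in pattern. Both regular expressions below are hypothetical (the skill's actual regex is not shown in this listing), and the tool names are made up; the sketch only demonstrates how substring matching catches a name like `render_payload`, while matching whole name segments does not.

```javascript
// HYPOTHETICAL pattern standing in for a broad substring match.
const BROAD = /pay|wallet/i;
// A stricter illustration that only matches whole segments delimited by _ . or -.
const STRICT = /(^|[_.-])(pay|payment|wallet)([_.-]|$)/i;

function classify(toolName) {
  return { broad: BROAD.test(toolName), strict: STRICT.test(toolName) };
}

console.log(classify("render_payload")); // broad matches via "pay" in "payload"
console.log(classify("x402_pay"));
```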

✓ Install Mechanism

No external install script/URL is included (instruction-only install), and the skill requires only the Node runtime. There are no downloads from obscure hosts in the skill files themselves; external network activity is limited to fetching trust DB/hotlist from the documented api.fraud-filter.com CDN, which is coherent for the purpose.

ℹ Credentials

The skill declares no required environment variables or credentials, and its network calls target api.fraud-filter.com which matches the homepage. Default config disables outbound report submission (participate_in_network: false). Caveats: the skill does automatically fetch a hotlist and (by default) the trust DB from the remote CDN — these are download-only but still reveal your agent's IP and timing to that host. The code references environment overrides for hotlist/trust URLs (FRAUD_FILTER_HOTLIST_URL, etc.), which is reasonable but gives administrators knobs that affect remote endpoints.
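
How an environment override and the `sync_hotlist` switch might interact is sketched below. Only the names `FRAUD_FILTER_HOTLIST_URL` and `sync_hotlist` come from this listing; the precedence (environment variable over built-in default) and the `resolveHotlistUrl` function are assumptions, and the default URL is passed in as a parameter because it is not shown here.

```javascript
// Returns the URL to fetch the hotlist from, or null when syncing is disabled.
// `defaultUrl` is a parameter because the built-in default is not shown in this listing.
function resolveHotlistUrl(env, config, defaultUrl) {
  if (config.sync_hotlist === false) return null; // air-gapped: no download
  return env.FRAUD_FILTER_HOTLIST_URL || defaultUrl;
}
```

A caller would typically pass `process.env` as the first argument.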

ℹ Persistence & Privilege

The skill is not always:true and is user-invocable (normal). It registers persistent hooks (before_tool_call, tool_result_persist) as expected for a plugin that enforces and observes payments. One risk to note: because the platform allows autonomous skill invocation, an agent with intent could change plugin config (e.g., enable network submission) and flush queued reports without explicit human action unless other platform controls prevent it — SKILL.md asks the agent to get explicit confirmation, but that is guidance rather than an enforced safeguard.

Version History

v0.4.0

Security fixes: remove wildcard CORS from dashboard server; add sync_hotlist config option to disable automatic hotlist download; fix documentation contradictions between auto-report and opt-in behavior; rewrite sync-trust-db.sh to use Node built-in fetch (eliminates undeclared curl/wget dependency)

v0.3.0

plugin-based enforcement, block surfacing, pre/post-transaction reporting rewrites

Metadata

Slug fraud-filter

Version 0.4.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Fraud Filter?

Community trust scores for AI agent payment endpoints — checks endpoint reputation before payment and queues anonymous failure reports locally (network repor... It is an AI Agent Skill for Claude Code / OpenClaw, with 115 downloads so far.

How do I install Fraud Filter?

Run "/install fraud-filter" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Fraud Filter free?

Yes, Fraud Filter is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Fraud Filter support?

Fraud Filter is cross-platform and runs anywhere OpenClaw / Claude Code is available (macOS, Linux).

Who created Fraud Filter?

It is built and maintained by mattpolly (@mattpolly); the current version is v0.4.0.
