Amazon Review Workbook
Turn an Amazon product or review link into a two-phase delivery workbook.
This skill is designed to be portable: the scripts live inside the skill folder and do not depend on dashcamauto or any other local repo.
Quick Path
- If this is the first run on a machine, read references/setup.md.
- Run a quick health check:
python scripts/amazon_review_workbook.py doctor --url "<amazon-url>"
- Run factual collection:
python scripts/amazon_review_workbook.py intake --url "<amazon-url>" --output-dir "<workspace>/amazon-review-output"
- If DeepLX is configured and reachable, fill 评论中文版:
python scripts/amazon_review_workbook.py translate --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_factual.json" --output-dir "<workspace>/amazon-review-output"
- Check coverage before deciding whether keyword expansion is worth the extra requests:
python scripts/amazon_review_workbook.py coverage-check --url "<amazon-url>" --db-path "<workspace>/amazon-review-output/amazon_review_cache.sqlite3"
- Build canonical tags and a lightweight tagging payload:
python scripts/amazon_review_workbook.py taxonomy-bootstrap --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py prepare-tagging --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output" --canonical-tags-json "<workspace>/amazon-review-output/canonical_tags.json"
taxonomy-bootstrap is only for building a stable canonical vocabulary for the batch. prepare-tagging consumes the full factual or translated JSON and emits a trimmed *_tagging_input.json that contains pending rows only plus cache metadata. Do not use that trimmed file as the merge source.
- Read references/tagging-guidelines.md, let the model fill only the pending rows in a separate labels JSON, then merge the labels back into the full base JSON and build the final workbook:
python scripts/amazon_review_workbook.py merge-build --base-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --labels-json "<workspace>/amazon-review-output/amazon_<asin>_labels.json" --output-dir "<workspace>/amazon-review-output" --taxonomy-version "v1" --strict
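The Quick Path commands share the same output directory and file-name pattern, so they can be generated from three inputs. The following is a minimal sketch, not part of the skill: quick_path_commands is a hypothetical helper, and it only builds argument lists from the commands and paths shown above without running anything.

```python
SCRIPT = "scripts/amazon_review_workbook.py"


def quick_path_commands(url: str, workspace: str, asin: str) -> list[list[str]]:
    """Build argv lists for the Quick Path steps, in order.

    Hypothetical helper: the subcommands, flags, and file names are copied
    from the Quick Path above; nothing is executed here.
    """
    out_dir = f"{workspace}/amazon-review-output"
    factual = f"{out_dir}/amazon_{asin}_review_rows_factual.json"
    translated = f"{out_dir}/amazon_{asin}_review_rows_translated.json"
    labels = f"{out_dir}/amazon_{asin}_labels.json"
    cache_db = f"{out_dir}/amazon_review_cache.sqlite3"
    base = ["python", SCRIPT]
    return [
        base + ["doctor", "--url", url],
        base + ["intake", "--url", url, "--output-dir", out_dir],
        base + ["translate", "--input-json", factual, "--output-dir", out_dir],
        base + ["coverage-check", "--url", url, "--db-path", cache_db],
        base + ["taxonomy-bootstrap", "--input-json", translated,
                "--output-dir", out_dir],
        base + ["prepare-tagging", "--input-json", translated,
                "--output-dir", out_dir,
                "--canonical-tags-json", f"{out_dir}/canonical_tags.json"],
        # The labels JSON must be written by the model before this step runs.
        base + ["merge-build", "--base-json", translated,
                "--labels-json", labels, "--output-dir", out_dir,
                "--taxonomy-version", "v1", "--strict"],
    ]
```

Each list can be passed to subprocess.run one step at a time. The steps should not be run back to back without a pause, because the model has to produce the labels JSON between prepare-tagging and merge-build.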
Workflow
1. Verify prerequisites
- Confirm doctor reports a valid asin.
- Confirm chrome_debug_ready is true.
- If you plan to use translate, confirm deeplx_env_ready is true.
- If deeplx_reachable is false, do not block the workflow; let the model fill 评论中文版 during tagging.
If any of these fail, read references/setup.md before continuing.
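The prerequisite rules can be expressed as a small decision function. This is a sketch under one assumption: the fields reported by doctor (asin, chrome_debug_ready, deeplx_env_ready, deeplx_reachable) have already been parsed into a dict. How doctor actually prints them is not specified here, and check_prerequisites is a hypothetical name.

```python
def check_prerequisites(report: dict, use_translate: bool = False) -> tuple[bool, list[str]]:
    """Return (ok, notes) for a parsed doctor report.

    Assumes `report` is a dict holding the doctor fields named in the text;
    the real output format of `doctor` may differ.
    """
    ok = True
    notes: list[str] = []

    if not report.get("asin"):
        ok = False
        notes.append("doctor did not report a valid asin")

    if report.get("chrome_debug_ready") is not True:
        ok = False
        notes.append("chrome_debug_ready is not true")

    # The DeepLX environment only matters when the translate step is planned.
    if use_translate and report.get("deeplx_env_ready") is not True:
        ok = False
        notes.append("deeplx_env_ready is not true")

    # An unreachable DeepLX is a note, not a blocker.
    if report.get("deeplx_reachable") is False:
        notes.append("DeepLX unreachable: let the model fill 评论中文版 during tagging")

    return ok, notes
```

When ok is False, the notes say which check failed, and references/setup.md is the next stop.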
2. Use the smallest command that fits
- For raw review collection only: use collect
- For factual extraction plus workbook scaffolding: use intake
- For deciding whether a keyword pass is still needed: use coverage-check
- For rebuilding the tuned keyword state from historical data: use keyword-autotune
- For machine translation of 评论中文版: use translate
- For canonical tag sampling: use taxonomy-bootstrap
- For cache-aware lightweight model input: use prepare-tagging
- For writing the final labeled workbook: use merge-build
Examples:
python scripts/amazon_review_workbook.py collect --url "<amazon-url>" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py translate --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_factual.json" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py coverage-check --url "<amazon-url>" --db-path "<workspace>/amazon-review-output/amazon_review_cache.sqlite3"
python scripts/amazon_review_workbook.py keyword-autotune --output-dir "<workspace>/amazon-review-output" --db-path "<workspace>/amazon-review-output/amazon_review_cache.sqlite3"
python scripts/amazon_review_workbook.py taxonomy-bootstrap --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py prepare-tagging --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output" --canonical-tags-json "<workspace>/amazon-review-output/canonical_tags.json"
python scripts/amazon_review_workbook.py merge-build --base-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --labels-json "<workspace>/amazon-review-output/amazon_<asin>_labels.json" --output-dir "<workspace>/amazon-review-output" --taxonomy-version "v1" --strict
3. Keep the workbook stable
The factual and final workbooks always use the 14-column schema in references/output-schema.md.
Do not silently add or remove columns. If a field is unavailable from the page, leave it blank rather than inventing a value.
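The stability rule (fixed columns, blanks for unavailable fields, no silent additions) can be enforced with a small normalizer. This is a rough sketch, not the logic in scripts/review_delivery_schema.py: normalize_row is a hypothetical helper, and the column list is passed in by the caller because the actual 14 names live in references/output-schema.md.

```python
def normalize_row(row: dict, columns: list[str]) -> dict:
    """Return a row with exactly `columns`, in order.

    Missing or None values become "" (blank, never invented); keys outside
    the schema raise instead of being silently added or dropped.
    """
    unknown = [key for key in row if key not in columns]
    if unknown:
        raise ValueError(f"columns not in schema: {unknown}")
    return {col: "" if row.get(col) is None else row[col] for col in columns}
```

Raising on unknown keys is a deliberate choice in this sketch: a schema change then surfaces as an error rather than as a workbook with a different column count.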
4. Tag rows only after grounding on the factual file
The model should not invent from the product page alone. Ground semantic tagging on the factual JSON/workbook created by intake or translate.
Keep the two JSON shapes distinct:
- *_tagging_input.json from prepare-tagging is the cropped machine prompt payload for the model
- --base-json for merge-build must be the full factual/translated record set, not the cropped tagging payload
- --labels-json is the model's completed semantic output for the pending rows only
If translate prints translation_mode=model_fallback, fill 评论中文版 in the same tagging pass instead of waiting for DeepLX.
Use references/tagging-guidelines.md when filling:
- 评论概括
- 情感倾向
- 类别分类
- 标签
- 重点标记
The preferred fast path is:
- taxonomy-bootstrap to build a canonical tag vocabulary for this batch
- prepare-tagging to create a minimal pending-row payload
- model labeling only for pending rows, written into a separate labels JSON
- merge-build to update cache and export the final workbook from the full base JSON
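The relationship between the full base JSON and the pending-rows-only labels JSON can be illustrated with a toy merge. This is only a sketch of the idea, not the logic in scripts/label_workflow.py: the review_id key, the merge_labels name, and the behavior of strict here are all assumptions made for illustration.

```python
def merge_labels(base_rows: list[dict], labels: list[dict],
                 key: str = "review_id", strict: bool = True) -> list[dict]:
    """Copy label fields onto matching base rows and return new rows.

    `key` is a hypothetical row identifier. In this sketch, `strict` means
    "fail if a label points at a row missing from the base set"; the real
    --strict flag may check different things.
    """
    labels_by_id = {label[key]: label for label in labels}
    known_ids = {row[key] for row in base_rows}

    orphans = sorted(set(labels_by_id) - known_ids)
    if strict and orphans:
        raise ValueError(f"labels reference rows missing from base: {orphans}")

    merged = []
    for row in base_rows:
        new_row = dict(row)  # leave the caller's base rows untouched
        label = labels_by_id.get(row[key])
        if label is not None:
            # Only fields present in the label overwrite the base row.
            new_row.update({k: v for k, v in label.items() if k != key})
        merged.append(new_row)
    return merged
```

Because the output is built by iterating over the base rows, every row stays in the result whether or not it was pending. That is why the cropped tagging payload cannot stand in for the base: rows absent from it would vanish from the final workbook.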
Collection Defaults
- intake and collect no longer run keyword expansion implicitly in deep mode. deep now means the 18 combo pass only.
- Run coverage-check after intake to compare current rows vs Amazon's visible reviews count before deciding to spend more requests.
- Use --keywords only when you explicitly want a keyword pass.
- Use --keywords with no values to run the built-in keyword preset for the selected --keyword-profile.
- Use --keywords foo bar baz to provide an explicit keyword list.
- Default pacing now inserts a 2.5s gap between combos/keywords to reduce rate-limit risk.
- Built-in profiles:
  - generic: universal consumer-product terms
  - electronics: universal terms + common app/setup/hardware terms
  - dashcam: electronics profile + recording/night/parking/GPS/Wi-Fi/mount terms
- Default keyword reuse policy is successful: keywords that have produced results before are skipped on later runs; recent zero-result keywords are also suppressed for 72h to avoid immediate retries.
- If you really want to brute-force rerun every keyword, use --keyword-reuse-scope none.
- A tuned state file at <output-dir>/keyword_tuning_state.json is now read automatically when present, and refreshed after keyword runs so the skill gradually reorders towards higher-yield terms.
- keyword-autotune can also ingest old keyword-run JSON reports via --report-glob to seed the tuned state from historical experiments.
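The default keyword reuse policy described above can be sketched as a filter over candidate keywords. This is an illustration only: the history structure (keyword mapped to hit count and last-run time) and the keywords_to_run name are assumptions, and the real state kept in the SQLite cache or keyword_tuning_state.json may look quite different.

```python
from datetime import datetime, timedelta

ZERO_RESULT_COOLDOWN = timedelta(hours=72)  # the 72h suppression window from the text


def keywords_to_run(candidates: list[str], history: dict, now: datetime,
                    reuse_scope: str = "successful") -> list[str]:
    """Pick the keywords worth requesting on this run.

    `history` maps keyword -> {"hits": int, "last_run": datetime}; this shape
    is hypothetical. reuse_scope="none" mirrors --keyword-reuse-scope none.
    """
    if reuse_scope == "none":
        return list(candidates)  # brute-force: rerun everything

    selected = []
    for keyword in candidates:
        past = history.get(keyword)
        if past is None:
            selected.append(keyword)  # never tried before
        elif past["hits"] > 0:
            continue  # already produced results on an earlier run
        elif now - past["last_run"] < ZERO_RESULT_COOLDOWN:
            continue  # recent zero-result keyword, still cooling down
        else:
            selected.append(keyword)  # zero-result, but the cooldown has expired
    return selected
```

A caller would additionally sleep for about 2.5 seconds between the selected keywords to match the default pacing.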
Failure Boundaries
Do not claim success if any of these is true:
- The script did not reach a real review page.
- The expected XLSX/CSV for the current phase was not generated.
- Review links, review time, or helpful votes were guessed rather than extracted.
- The model tagged rows without first grounding on the factual JSON/workbook.
- The cropped *_tagging_input.json was used as --base-json for merge-build.
- The model re-labeled rows that were already cached for the same taxonomy version.
- The workflow still claims a 13-column contract after 评论用户名 was added as a real output column.
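Two of these boundaries are mechanical enough to check before claiming success. The guards below are a hedged sketch with hypothetical function names: one rejects a base path whose file name matches the *_tagging_input.json pattern, and the other checks that a header row has the 14 columns the contract requires.

```python
from pathlib import PurePath

EXPECTED_COLUMNS = 14  # the fixed workbook contract stated in this document


def assert_full_base_json(path: str) -> None:
    """Refuse the cropped tagging payload as a merge base (hypothetical guard)."""
    if PurePath(path).name.endswith("_tagging_input.json"):
        raise ValueError(f"{path} is a cropped tagging payload, not a full base JSON")


def assert_column_count(header: list[str]) -> None:
    """Fail when a workbook header does not have exactly 14 columns."""
    if len(header) != EXPECTED_COLUMNS:
        raise ValueError(f"expected {EXPECTED_COLUMNS} columns, got {len(header)}")
```

The file-name check only catches the obvious mistake. A renamed payload would pass, so it complements the workflow rules rather than replacing them.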
Resources
- references/setup.md: first-run machine setup and environment requirements
- references/output-schema.md: fixed 14-column workbook contract
- references/tagging-guidelines.md: semantic labeling rules after factual collection
- scripts/amazon_review_workbook.py: portable CLI for doctor/collect/intake/coverage-check/keyword-autotune/translate/taxonomy-bootstrap/prepare-tagging/merge-build
- scripts/review_delivery_schema.py: workbook schema, normalization, and XLSX/CSV writer
- scripts/deeplx_translate.py: optional DeepLX translation helper
- scripts/label_workflow.py: cache, heuristics, bootstrap, and merge logic for faster labeling
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install amazon-review-workbook
- After installation, invoke the skill by name or use /amazon-review-workbook
- Provide required inputs per the skill's parameter spec and get structured output
What is Amazon Review Workbook?
Collect all customer reviews from an Amazon product URL or product-reviews URL through a logged-in Chrome session on port 9222, export a 14-column factual wo... It is an AI Agent Skill for Claude Code / OpenClaw, with 110 downloads so far.
How do I install Amazon Review Workbook?
Run "/install amazon-review-workbook" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is Amazon Review Workbook free?
Yes, Amazon Review Workbook is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Amazon Review Workbook support?
Amazon Review Workbook is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Amazon Review Workbook?
It is built and maintained by aduo6668 (@aduo6668); the current version is v1.0.3.