Open Browser Use
Overview
Open Browser Use connects an MV3 Chrome extension, a local native messaging host, a CLI, SDKs, and an optional stdio MCP server so agents can automate a real Chrome profile. It is not Codex.app-specific; adapt the commands, MCP config, and SDK examples to the agent runtime you are operating in.
Core Workflow
- Check setup with open-browser-use ping or obu ping. If it fails because setup is missing, read references/installation.md.
- Pick the right Chrome profile if multiple are installed. See "Multi-profile handling" below before issuing browser commands.
- Choose a unique browser session id for the current agent task before opening or claiming tabs. Prefer the surrounding runtime's conversation/session id when available; otherwise create a short unique id such as obu-<task-slug>-<timestamp>. Reuse that same id for every Open Browser Use command in this task.
- Name the current browser task group before opening or claiming tabs. Use a short task label followed by " - OBU"; if no better task label is available, use "Task - OBU".
- Use the CLI for simple inspection or one-shot actions: info, tabs, user-tabs, history, open-tab, navigate, cdp, and call.
- Use open-browser-use run / obu run for CLI-level multi-step orchestration when a small line-oriented action plan is enough and writing SDK code would be unnecessary.
- If the surrounding agent runtime supports local MCP servers, configure obu mcp and call the exposed browser tools directly. Use the run_action_plan MCP tool for the same line-oriented orchestration from MCP. Read references/sdk-and-protocol.md.
- Use the JavaScript, Python, or Go SDK for larger multi-step workflows, event subscriptions, richer control flow, or when the surrounding agent runtime already runs code. Read references/sdk-and-protocol.md.
- Before ending browser work, release or keep session tabs with open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '<json-array>', the MCP finalize_tabs tool, or the SDK finalizeTabs / finalize_tabs / FinalizeTabs method.
- If communication fails after setup, read references/troubleshooting.md.
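The session id step above can be sketched as a small shell helper. This is only an illustration: make_session_id and the AGENT_CONVERSATION_ID variable are hypothetical names, not part of Open Browser Use; the only thing taken from this document is the obu-<task-slug>-<timestamp> shape.

```shell
# Hypothetical helper: build a task-unique session id of the form
# obu-<task-slug>-<timestamp>.
make_session_id() {
  label=$1
  # Optional second argument pins the timestamp (useful for reproducible runs).
  stamp=${2:-$(date +%Y%m%d%H%M%S)}
  # Lowercase the label, turn each run of non-alphanumerics into one hyphen,
  # then trim hyphens from both ends.
  slug=$(printf '%s' "$label" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-*//; s/-*$//')
  # Fall back to the generic slug "task" when the label is empty.
  printf 'obu-%s-%s\n' "${slug:-task}" "$stamp"
}

# Prefer the runtime's conversation id when it exists. AGENT_CONVERSATION_ID is
# an assumed variable name; substitute whatever your runtime actually exposes.
OBU_SESSION_ID=${AGENT_CONVERSATION_ID:-$(make_session_id "Docs scan")}
export OBU_SESSION_ID
```

Exporting the value once lets every later command in the task reuse the same id through "$OBU_SESSION_ID".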
Operating Rules
- Treat the browser as the user's real Chrome profile. Do not inspect cookies, passwords, session stores, or unrelated browser data.
- Ask the user before installing the extension, opening Chrome for them, enabling extension permissions, uploading local files, reading/writing clipboard data, submitting forms, purchasing, deleting, sending, or making other externally visible changes.
- Do not assume Codex.app helpers, Node REPL globals, or a bundled plugin UI are available. Use the installed open-browser-use / obu CLI or the published SDKs.
- Do not guess tab ids. List tabs first, then use ids returned by tabs, user-tabs, open-tab, or SDK calls.
- Prefer claim-tab / claimUserTab for existing user tabs. Claiming should be based on the current user-tabs result and visible evidence such as URL, title, recency, or group.
- Use --socket only when the user or runtime provides an explicit socket. Otherwise let the CLI and SDKs discover the active socket registry.
- Do not rely on the CLI fallback session obu-cli for agent tasks. Always pass a task-unique --session-id to CLI and MCP commands, or set sessionId / session_id / SessionID in SDK clients. The fallback exists for quick manual use and can reuse stale task groups across unrelated agent sessions.
- Direct CLI subcommands and open-browser-use run can share the same browser session only when they use the same explicit --session-id. Finalize that same session before ending browser work.
- Use call --method <method> --params '<json>' only when no safer convenience command or SDK wrapper exists.
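One way to make the session id rule hard to forget is a thin wrapper that refuses to run without an explicit id. The sketch below is an assumption-laden convenience, not part of the CLI: obu_task and the OBU_BIN override are hypothetical names, and the wrapper only rearranges the documented subcommand and --session-id flag.

```shell
# Hypothetical wrapper: never let a command fall back to the obu-cli session.
# OBU_BIN is an assumed override for the binary name (handy for dry runs).
obu_task() {
  if [ "$#" -lt 1 ]; then
    echo "usage: obu_task <subcommand> [args...]" >&2
    return 2
  fi
  if [ -z "${OBU_SESSION_ID:-}" ]; then
    echo "obu_task: set OBU_SESSION_ID to a task-unique id first" >&2
    return 1
  fi
  if [ "$OBU_SESSION_ID" = "obu-cli" ]; then
    echo "obu_task: obu-cli is the manual fallback session; pick a unique id" >&2
    return 1
  fi
  subcommand=$1
  shift
  # Same shape as the documented commands: <subcommand> --session-id <id> ...
  "${OBU_BIN:-open-browser-use}" "$subcommand" --session-id "$OBU_SESSION_ID" "$@"
}
```

With this in place, obu_task open-tab --url https://example.com expands to the open-tab command listed under Common CLI Actions, and setting OBU_BIN=echo prints the command instead of running it.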
Multi-profile handling
Some users run Chrome with several profiles (work, personal, side accounts). If more than one profile has the Open Browser Use extension installed, the agent must decide which profile this task should operate on rather than silently picking whatever Chrome window happens to be active.
- Before any browser command, list installed profiles:
  open-browser-use profiles --connected
  Columns: DIRECTORY (stable id like Default, Profile 1), DISPLAY NAME (what the user sees in the Chrome avatar menu), VERSION, and CONNECTED (whether that profile's host is currently reachable). JSON output is available via --json.
- If exactly one profile is installed and connected, proceed without asking. If it is installed but not connected, ask the user to open Chrome on that profile before running browser commands.
- If multiple profiles are installed and the user did not already specify which one to use, ask before the first browser command. List both directory name and display name so the user can recognize them, and include whether each profile is connected.
- If the chosen profile is not connected, ask the user to open Chrome on that profile before retrying. Do not silently fall back to a different connected profile.
- After the user has chosen, pass --profile <selector> to every CLI / MCP command for the rest of the task. The selector accepts either the directory name (Default, Profile 1) or the display name (Eva, cookiy.com), case-insensitive. Do not switch profiles mid-task.
- If --profile does not match any running host, the CLI prints which profiles are currently connected. Ask the user to open Chrome on the chosen profile, then retry; do not silently fall back to a different profile.
- For MCP, lock the profile at server start:
  [mcp_servers.open_browser_use]
  command = "obu"
  args = ["mcp", "--session-id", "obu-<task-id>", "--profile", "<selector>"]
  Do not pass profile as a per-tool-call argument; the MCP server applies the start-time selector to every call.
- Do not remember the user's profile choice across unrelated tasks. A future task may belong to a different profile; ask again rather than assuming.
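The profile rules above reduce to a small decision table, sketched here as a pure shell function. It is illustrative only: profile_next_step and its three arguments are hypothetical, the caller is assumed to have derived them from the profiles listing and the conversation, and the zero-profile branch is an assumption that treats the case as missing setup.

```shell
# Hypothetical decision helper mirroring the multi-profile rules.
# Arguments (all supplied by the caller; none come from the CLI itself):
#   $1  number of profiles with the extension installed
#   $2  "yes" if the user already named a profile, otherwise "no"
#   $3  "yes" if the target profile is connected, otherwise "no"
profile_next_step() {
  installed=$1
  user_chose=$2
  connected=$3
  if [ "$installed" -eq 0 ]; then
    # Assumption: no installed profile means setup is missing.
    echo "read-installation-guide"
  elif [ "$installed" -gt 1 ] && [ "$user_chose" != "yes" ]; then
    # Several candidates and no explicit choice: ask before the first command.
    echo "ask-which-profile"
  elif [ "$connected" != "yes" ]; then
    # Never fall back to a different connected profile.
    echo "ask-user-to-open-chrome"
  else
    echo "proceed"
  fi
}
```

For example, profile_next_step 3 no yes yields ask-which-profile, because a connected profile is not enough when the user has not chosen among several.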
Common CLI Actions
export OBU_SESSION_ID="obu-docs-scan-$(date +%Y%m%d%H%M%S)"
open-browser-use ping --session-id "$OBU_SESSION_ID"
open-browser-use info --session-id "$OBU_SESSION_ID"
open-browser-use name-session --session-id "$OBU_SESSION_ID" --name "Task - OBU"
open-browser-use tabs --session-id "$OBU_SESSION_ID"
open-browser-use user-tabs --session-id "$OBU_SESSION_ID"
open-browser-use history --session-id "$OBU_SESSION_ID" --query "example" --limit 20
open-browser-use open-tab --session-id "$OBU_SESSION_ID" --url https://example.com
open-browser-use navigate --session-id "$OBU_SESSION_ID" --tab-id <tab-id> --url https://example.com
open-browser-use cdp --session-id "$OBU_SESSION_ID" --tab-id <tab-id> --method Runtime.evaluate --params '{"expression":"document.title"}'
open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '[]'
For CLI-level orchestration without writing SDK code, use a line-oriented action plan:
open-browser-use run --session-id "$OBU_SESSION_ID" -c '
name-session "Docs scan - OBU"
open-tab https://docs.browser-use.com
wait-load domcontentloaded
page-info
finalize-tabs []
'
Each action line shares one session/turn. open-tab and claim-tab set the
default tab for later tab-scoped actions such as wait-load, page-info,
navigate, cdp, move-mouse, and wait-file-chooser.
Use obu as the short alias when available.
MCP Usage
For runtimes that can launch local MCP servers over stdio, use:
[mcp_servers.open_browser_use]
command = "obu"
args = ["mcp", "--session-id", "obu-<task-or-conversation-id>"]
Use a fresh --session-id value per agent task or conversation. If the runtime
has a stable conversation/session id, derive the MCP --session-id from it.
The MCP server exposes tools including user_tabs, open_tab, claim_tab,
navigate, wait_load, page_info, cdp, history, run_action_plan,
finalize_tabs, and unrestricted call.
Use run_action_plan when the runtime wants to execute the same compact action
plan format available through open-browser-use run without shelling out for
each individual browser operation.
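If the runtime's configuration is generated per task, the server entry can be rendered from the session id instead of edited by hand. The sketch below assumes a runtime that reads the TOML table shown above; render_mcp_config is a hypothetical helper, and it assumes neither value contains a double quote.

```shell
# Hypothetical generator for the stdio MCP server entry.
#   $1  session id (required)
#   $2  profile selector (optional; adds --profile when present)
render_mcp_config() {
  session_id=$1
  profile=${2:-}
  # Build the args array as a comma-separated list of quoted strings.
  args="\"mcp\", \"--session-id\", \"$session_id\""
  if [ -n "$profile" ]; then
    args="$args, \"--profile\", \"$profile\""
  fi
  printf '[mcp_servers.open_browser_use]\ncommand = "obu"\nargs = [%s]\n' "$args"
}

# Example: one fresh entry per conversation.
render_mcp_config "obu-conv-42"
```

Where the rendered text has to be written depends on the agent runtime, so the helper only prints it.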
Tab Lifecycle
- Session tabs are tabs Open Browser Use has created or claimed for the current agent workflow.
- Use one unique session id per agent task or conversation. Do not share the fallback obu-cli session across unrelated tasks.
- Task session groups should be named from the task, using the pattern <short task> - OBU. Use Task - OBU as the fallback name.
- Keep no tabs by default: open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '[]'.
- Keep a tab only when the user needs that live page after the turn. Omit research, source, search, intermediate, duplicate, blank, error, and login/navigation tabs after extracting what you need.
- Keep a tab with status: "deliverable" when the tab itself is the user-facing output or requested open page, such as a created or edited document, dashboard, checkout/cart, submitted form result, or a page the user explicitly asked to inspect directly.
- Keep a tab with status: "handoff" only when the task is still in progress and the user or a later turn should continue from the current task group, such as a page waiting for user input, login, approval, payment, CAPTCHA, or an unfinished workflow.
- Handoff tabs stay in the task session group. Deliverable tabs move to the shared ✅ Open Browser Use tab group.
- Run finalization as the last Open Browser Use browser action for the turn. Do not call Open Browser Use browser tools after finalizing; if more browser work is needed, do it first and finalize once with the final tab disposition.
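For scripted runs that keep no tabs, the "finalize last" rule can be enforced with an EXIT trap. This is a minimal sketch rather than a prescribed pattern: with_finalize, finalize_keep_none, and the OBU_BIN override are hypothetical names, and it covers only the default --keep '[]' disposition. A turn that keeps a deliverable or handoff tab needs a non-empty keep array, whose entry format is not shown in this document.

```shell
# Hypothetical guard: run a browser task, then always finalize with the default
# "keep nothing" disposition, even when the task fails.
# OBU_BIN is an assumed override for the binary name (handy for dry runs).
finalize_keep_none() {
  "${OBU_BIN:-open-browser-use}" finalize-tabs --session-id "$OBU_SESSION_ID" --keep '[]'
}

with_finalize() {
  (
    # The EXIT trap fires when this subshell ends, so finalization is the last
    # Open Browser Use action of the run.
    trap finalize_keep_none EXIT
    "$@"
  )
}
```

The subshell keeps the trap local to one task, and the wrapped command's exit status is still what with_finalize returns.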
File Choosers, Downloads, And Clipboard
- File uploads use the intercepted file chooser flow: start waiting, trigger the chooser in the page, then set absolute local paths with set-file-chooser-files or the SDK equivalent.
- Downloads can be observed with SDK notification handlers or Browser Use methods such as waitForDownload and downloadPath.
- Clipboard helpers operate through the current controlled tab and should be treated as sensitive user actions.
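Because the file chooser step expects absolute local paths, a quick pre-check can catch mistakes before the chooser is triggered. The helper below is a hypothetical sketch that assumes POSIX-style paths; it does not call Open Browser Use, and it does not replace asking the user before uploading local files.

```shell
# Hypothetical pre-check: succeed only if every argument is an absolute
# POSIX path that points at an existing regular file.
all_absolute_files() {
  for path in "$@"; do
    case $path in
      /*)
        if [ ! -f "$path" ]; then
          echo "not a file: $path" >&2
          return 1
        fi
        ;;
      *)
        echo "not absolute: $path" >&2
        return 1
        ;;
    esac
  done
}
```

A caller might run all_absolute_files "$HOME/report.pdf" and continue with the upload flow only when it succeeds.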
References
- references/installation.md: one-time CLI and browser extension setup, including cases where user cooperation is required.
- references/sdk-and-protocol.md: JavaScript, Python, Go, socket, and JSON-RPC usage details.
- references/troubleshooting.md: connection failures, stale sockets, extension/native host checks, and permission issues.
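- Make sure OpenClaw is installed (locally or via Docker).
- Enter the install command in the chat box: /install open-browser-use
- Once installation finishes, invoke the Skill by name or trigger it with /open-browser-use.
- Provide the inputs described in the Skill's parameter notes to get structured output.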
What is Open Browser Use?
Platform-neutral guidance for using Open Browser Use, the open-source Chrome automation stack for AI agents. Use when an agent needs to install, verify, trou... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 66 downloads to date.
How do I install Open Browser Use?
Run /install open-browser-use in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.
Is Open Browser Use free?
Yes. Open Browser Use is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Open Browser Use support?
Open Browser Use is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Open Browser Use?
It is developed and maintained by Leo (@ifuryst). The current version is v0.1.0.