Browser Automation with browser-use CLI
The browser-use command provides fast, persistent browser automation. It maintains browser sessions across commands, enabling complex multi-step workflows.
Installation
bash
# Run without installing (recommended for one-off use)
uvx "browser-use[cli]" open https://example.com

# Or install permanently
uv pip install "browser-use[cli]"

# Install browser dependencies (Chromium)
browser-use install
Quick Start
bash
browser-use open https://example.com  # Navigate to URL
browser-use state                     # Get page elements with indices
browser-use click 5                   # Click element by index
browser-use type "Hello World"        # Type text
browser-use screenshot                # Take screenshot
browser-use close                     # Close browser
Core Workflow
- Navigate: browser-use open <url> opens the URL (starts the browser if needed)
- Inspect: browser-use state returns clickable elements with indices
- Interact: use indices from state to interact (browser-use click 5, browser-use input 3 "text")
- Verify: browser-use state or browser-use screenshot to confirm actions
- Repeat: the browser stays open between commands
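The loop above can be sketched as a small Python wrapper around the CLI. This is a minimal illustration, not part of browser-use: the helper names `cli_runner` and `click_and_verify` are hypothetical, and the sketch assumes `browser-use` is on your PATH. The runner is injectable so the sequence can be checked without a browser.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def cli_runner(argv: Sequence[str]) -> str:
    """Run one command and return its stdout (raises CalledProcessError on non-zero exit)."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout


def click_and_verify(url: str, index: int, run: Runner = cli_runner) -> str:
    """Navigate, inspect, interact, then verify; returns the final state output."""
    run(["browser-use", "open", url])          # Navigate
    run(["browser-use", "state"])              # Inspect: read indices before acting
    run(["browser-use", "click", str(index)])  # Interact
    return run(["browser-use", "state"])       # Verify
```

In real use the index should come from the output of the inspect step rather than being hard-coded, since indices can change between pages.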
Browser Modes
bash
browser-use --browser chromium open <url>           # Default: headless Chromium
browser-use --browser chromium --headed open <url>  # Visible Chromium window
browser-use --browser real open <url>               # User's Chrome with login sessions
browser-use --browser remote open <url>             # Cloud browser (requires API key)
- chromium: Fast, isolated, headless by default
- real: Uses your Chrome with cookies, extensions, logged-in sessions
- remote: Cloud-hosted browser with proxy support (requires BROWSER_USE_API_KEY)
Commands
Navigation
bash
browser-use open <url>    # Navigate to URL
browser-use back          # Go back in history
browser-use scroll down   # Scroll down
browser-use scroll up     # Scroll up
Page State
bash
browser-use state                       # Get URL, title, and clickable elements
browser-use screenshot                  # Take screenshot (outputs base64)
browser-use screenshot path.png         # Save screenshot to file
browser-use screenshot --full path.png  # Full page screenshot
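When no path is given, the screenshot is printed as base64. A minimal sketch of decoding that text in Python follows, assuming the output is a single base64 string encoding a PNG (the image format is an assumption, and `save_base64_image` and `looks_like_png` are hypothetical helper names):

```python
import base64
from pathlib import Path

# Every PNG file starts with this 8-byte signature.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def save_base64_image(b64_text: str, path: str) -> int:
    """Decode base64 text (whitespace and newlines ignored) and write it to path.

    Returns the number of bytes written.
    """
    data = base64.b64decode("".join(b64_text.split()))
    Path(path).write_bytes(data)
    return len(data)


def looks_like_png(data: bytes) -> bool:
    """Cheap sanity check that decoded bytes begin with the PNG signature."""
    return data.startswith(PNG_MAGIC)
```

If you only need a file on disk, passing a path to browser-use screenshot is simpler; decoding is mainly useful when the image is consumed in memory.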
Interactions (use indices from browser-use state)
bash
browser-use click <index>            # Click element
browser-use type "text"              # Type text into focused element
browser-use input <index> "text"     # Click element, then type text
browser-use keys "Enter"             # Send keyboard keys
browser-use keys "Control+a"         # Send key combination
browser-use select <index> "option"  # Select dropdown option
Tab Management
bash
browser-use switch <tab>     # Switch to tab by index
browser-use close-tab        # Close current tab
browser-use close-tab <tab>  # Close specific tab
JavaScript & Data
bash
browser-use eval "document.title"          # Execute JavaScript, return result
browser-use extract "all product prices"   # Extract data using LLM (requires API key)
Python Execution (Persistent Session)
bash
browser-use python "x = 42"                # Set variable
browser-use python "print(x)"              # Access variable (outputs: 42)
browser-use python "print(browser.url)"    # Access browser object
browser-use python --vars                  # Show defined variables
browser-use python --reset                 # Clear Python namespace
browser-use python --file script.py        # Execute Python file
The Python session maintains state across commands. The browser object provides:
- browser.url: Current page URL
- browser.title: Page title
- browser.goto(url): Navigate
- browser.click(index): Click element
- browser.type(text): Type text
- browser.screenshot(path): Take screenshot
- browser.scroll(): Scroll page
- browser.html: Get page HTML
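As an illustration of how these members combine, here is a sketch of a script that could be saved as script.py and run with browser-use python --file script.py. The function name `scroll_and_capture` is hypothetical, and the 'down' argument to browser.scroll mirrors the usage in the examples later in this document rather than a confirmed signature.

```python
def scroll_and_capture(browser, steps: int, path: str) -> dict:
    """Scroll down `steps` times, save a screenshot, and report the current page."""
    for _ in range(steps):
        browser.scroll('down')  # assumed signature, see note above
    browser.screenshot(path)
    return {"url": browser.url, "title": browser.title, "screenshot": path}


# Inside a browser-use Python session the `browser` object is already defined;
# the guard keeps this file importable outside of one.
if "browser" in globals():
    print(scroll_and_capture(browser, 5, "page.png"))
```

Taking the browser as a parameter keeps the helper easy to exercise with a stand-in object before pointing it at a live session.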
Agent Tasks (Requires API Key)
bash
browser-use run "Fill the contact form with test data"  # Run AI agent
browser-use run "Extract all product prices" --max-steps 50
Agent tasks use an LLM to complete multi-step browser tasks autonomously. They require either BROWSER_USE_API_KEY or a configured LLM API key (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
Session Management
bash
browser-use sessions     # List active sessions
browser-use close        # Close current session
browser-use close --all  # Close all sessions
Server Control
bash
browser-use server status  # Check if server is running
browser-use server stop    # Stop server
browser-use server logs    # View server logs
Setup
bash
browser-use install  # Install Chromium and system dependencies
Global Options
| Option | Description |
|---|---|
| --session NAME | Use named session (default: "default") |
| --browser MODE | Browser mode: chromium, real, remote |
| --headed | Show browser window (chromium mode) |
| --profile NAME | Chrome profile (real mode only) |
| --json | Output as JSON |
| --api-key KEY | Override API key |
Session behavior: All commands without --session use the same "default" session. The browser stays open and is reused across commands. Use --session NAME to run multiple browsers in parallel.
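When driving the CLI from a script, it can help to build the global options in one place. The sketch below is a hypothetical helper (`GlobalOptions` is not part of browser-use). It places global options before the subcommand, as the examples in this document do, and omits --api-key so that keys do not end up in process listings.

```python
from dataclasses import dataclass
from typing import List, Optional

BROWSER_MODES = ("chromium", "real", "remote")


@dataclass
class GlobalOptions:
    session: Optional[str] = None   # --session NAME
    browser: Optional[str] = None   # --browser MODE
    headed: bool = False            # --headed (chromium mode)
    profile: Optional[str] = None   # --profile NAME (real mode only)
    json_output: bool = False       # --json

    def argv(self, *command: str) -> List[str]:
        """Return the full argument list: global options first, then the subcommand."""
        if self.browser is not None and self.browser not in BROWSER_MODES:
            raise ValueError(f"unknown browser mode: {self.browser!r}")
        if self.profile and self.browser != "real":
            raise ValueError("--profile applies to real mode only")
        args = ["browser-use"]
        if self.session:
            args += ["--session", self.session]
        if self.browser:
            args += ["--browser", self.browser]
        if self.headed:
            args.append("--headed")
        if self.profile:
            args += ["--profile", self.profile]
        if self.json_output:
            args.append("--json")
        return args + list(command)
```

For example, `GlobalOptions(session="work").argv("state")` yields the arguments for browser-use --session work state, ready to pass to `subprocess.run`.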
API Key Configuration
Some features (run, extract, --browser remote) require an API key. The CLI checks these locations in order:
1. --api-key command line flag
2. BROWSER_USE_API_KEY environment variable
3. ~/.config/browser-use/config.json file
To configure permanently:
bash
mkdir -p ~/.config/browser-use
echo '{"api_key": "your-key-here"}' > ~/.config/browser-use/config.json
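The lookup order described above can be mirrored in a few lines of Python, which is handy for checking which key a script would pick up. This is a sketch of the documented precedence, not the CLI's actual implementation; `resolve_api_key` is a hypothetical name, and treating an unreadable or malformed config file as "no key" is an assumption.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CONFIG = Path.home() / ".config" / "browser-use" / "config.json"


def resolve_api_key(
    cli_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    config_path: Path = DEFAULT_CONFIG,
) -> Optional[str]:
    """Return the first key found: flag value, then environment, then config file."""
    if cli_key:
        return cli_key
    env = os.environ if env is None else env
    if env.get("BROWSER_USE_API_KEY"):
        return env["BROWSER_USE_API_KEY"]
    try:
        config = json.loads(Path(config_path).read_text())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: treated as "no key" (assumption).
        return None
    if isinstance(config, dict):
        return config.get("api_key") or None
    return None
```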
Examples
Form Submission
bash
browser-use open https://example.com/contact
browser-use state
# Shows: [0] input "Name", [1] input "Email", [2] textarea "Message", [3] button "Submit"
browser-use input 0 "John Doe"
browser-use input 1 "john@example.com"
browser-use input 2 "Hello, this is a test message."
browser-use click 3
browser-use state  # Verify success
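Hard-coded indices break as soon as the page layout changes, so a script may prefer to look elements up by label. The sketch below parses text shaped like the "Shows:" comment above. That shape is taken from the comment only, and the real output of browser-use state may differ, so treat the pattern as an assumption. `index_by_label` is a hypothetical helper.

```python
import re
from typing import Dict

# Matches entries such as: [2] textarea "Message"
ELEMENT = re.compile(r'\[(\d+)\]\s+(\w+)\s+"([^"]*)"')


def index_by_label(state_text: str) -> Dict[str, int]:
    """Map each quoted label to its element index (a later duplicate label wins)."""
    return {label: int(idx) for idx, _tag, label in ELEMENT.findall(state_text)}
```

With a mapping like this, a script can issue browser-use input for the "Email" index without assuming it is always 1.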
Multi-Session Workflows
bash
browser-use --session work open https://work.example.com
browser-use --session personal open https://personal.example.com
browser-use --session work state      # Check work session
browser-use --session personal state  # Check personal session
browser-use close --all               # Close both sessions
Data Extraction with Python
bash
browser-use open https://example.com/products
browser-use python "
products = []  # Populate with your own extraction logic; nothing is appended here
for i in range(20):
    browser.scroll('down')
browser.screenshot('products.png')
"
browser-use python "print(f'Captured {len(products)} products')"
Using Real Browser (Logged-In Sessions)
bash
browser-use --browser real open https://gmail.com
# Uses your actual Chrome with existing login sessions
browser-use state  # Already logged in!
Tips
- Always run browser-use state first to see available elements and their indices
- Use --headed for debugging to see what the browser is doing
- Sessions persist: the browser stays open between commands
- Use --json for parsing output programmatically
- Python variables persist across browser-use python commands within a session
- Real browser mode preserves your login sessions and extensions
- CLI aliases: bu, browser, and browseruse all work identically to browser-use
Troubleshooting
Browser won't start?
bash
browser-use install              # Install/reinstall Chromium
browser-use server stop          # Stop any stuck server
browser-use --headed open <url>  # Try with visible window
Element not found?
bash
browser-use state        # Check current elements
browser-use scroll down  # Element might be below fold
browser-use state        # Check again
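The check, scroll, check-again routine can be automated with a bounded loop. The sketch below is a hypothetical helper that takes two callables, so it makes no assumption about how state is fetched. In practice `get_state` might wrap browser-use state and `scroll_down` might wrap browser-use scroll down via `subprocess.run`.

```python
from typing import Callable, Optional


def find_with_scroll(
    get_state: Callable[[], str],
    scroll_down: Callable[[], None],
    label: str,
    max_scrolls: int = 5,
) -> Optional[str]:
    """Return the first state text containing `label`, scrolling between checks.

    Gives up and returns None after `max_scrolls` scrolls.
    """
    for attempt in range(max_scrolls + 1):
        state = get_state()
        if label in state:
            return state
        if attempt < max_scrolls:
            scroll_down()
    return None
```

The scroll limit keeps the loop from running forever on pages that load content endlessly.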
Session issues?
bash
browser-use sessions     # Check active sessions
browser-use close --all  # Clean slate
browser-use open <url>   # Fresh start
Cleanup
Always close the browser when done. Run this after completing browser automation:
bash
browser-use close