Ideal for Automation Agents requiring persistent browser sessions and multi-step workflow capabilities. Intelligent multi-agent system for personalized coding education. Features 7 AI agents, adaptive learning, secure code execution, gamification, and social learning. Built with FastAPI, React, PostgreSQL, and Docker. 356 tests, 90%+ coverage. Spec-driven development with Kiro CLI.

How do I install browser-use?

Run the command: npx killer-skills add khushparag/agentic_learning_coach. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for browser-use?

Key use cases include: Automating multi-step browser workflows, Debugging browser-based applications with persistent sessions, Generating browser automation scripts for Chromium.

Which IDEs are compatible with browser-use?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for browser-use?

Requires installation of browser dependencies (Chromium). Dependent on uvx and pip for installation and execution.

Browser Automation with browser-use CLI

Name: browser-use
Availability: InStock
Author: khushparag

The browser-use command provides fast, persistent browser automation. It maintains browser sessions across commands, enabling complex multi-step workflows.

Installation

bash
1# Run without installing (recommended for one-off use)
2uvx browser-use[cli] open https://example.com
3
4# Or install permanently
5uv pip install browser-use[cli]
6
7# Install browser dependencies (Chromium)
8browser-use install

Quick Start

bash
1browser-use open https://example.com           # Navigate to URL
2browser-use state                              # Get page elements with indices
3browser-use click 5                            # Click element by index
4browser-use type "Hello World"                 # Type text
5browser-use screenshot                         # Take screenshot
6browser-use close                              # Close browser

Core Workflow

Navigate: browser-use open <url> - Opens URL (starts browser if needed)
Inspect: browser-use state - Returns clickable elements with indices
Interact: Use indices from state to interact (browser-use click 5, browser-use input 3 "text")
Verify: browser-use state or browser-use screenshot to confirm actions
Repeat: Browser stays open between commands

Browser Modes

bash
1browser-use --browser chromium open <url>      # Default: headless Chromium
2browser-use --browser chromium --headed open <url>  # Visible Chromium window
3browser-use --browser real open <url>          # User's Chrome with login sessions
4browser-use --browser remote open <url>        # Cloud browser (requires API key)

chromium: Fast, isolated, headless by default
real: Uses your Chrome with cookies, extensions, logged-in sessions
remote: Cloud-hosted browser with proxy support (requires BROWSER_USE_API_KEY)

Commands

bash
1browser-use open <url>                    # Navigate to URL
2browser-use back                          # Go back in history
3browser-use scroll down                   # Scroll down
4browser-use scroll up                     # Scroll up

Page State

bash
1browser-use state                         # Get URL, title, and clickable elements
2browser-use screenshot                    # Take screenshot (outputs base64)
3browser-use screenshot path.png           # Save screenshot to file
4browser-use screenshot --full path.png    # Full page screenshot

Interactions (use indices from `browser-use state`)

bash
1browser-use click <index>                 # Click element
2browser-use type "text"                   # Type text into focused element
3browser-use input <index> "text"          # Click element, then type text
4browser-use keys "Enter"                  # Send keyboard keys
5browser-use keys "Control+a"              # Send key combination
6browser-use select <index> "option"       # Select dropdown option

Tab Management

bash
1browser-use switch <tab>                  # Switch to tab by index
2browser-use close-tab                     # Close current tab
3browser-use close-tab <tab>               # Close specific tab

JavaScript & Data

bash
1browser-use eval "document.title"         # Execute JavaScript, return result
2browser-use extract "all product prices"  # Extract data using LLM (requires API key)

Python Execution (Persistent Session)

bash
1browser-use python "x = 42"               # Set variable
2browser-use python "print(x)"             # Access variable (outputs: 42)
3browser-use python "print(browser.url)"   # Access browser object
4browser-use python --vars                 # Show defined variables
5browser-use python --reset                # Clear Python namespace
6browser-use python --file script.py       # Execute Python file

The Python session maintains state across commands. The browser object provides:

browser.url - Current page URL
browser.title - Page title
browser.goto(url) - Navigate
browser.click(index) - Click element
browser.type(text) - Type text
browser.screenshot(path) - Take screenshot
browser.scroll() - Scroll page
browser.html - Get page HTML

Agent Tasks (Requires API Key)

bash
1browser-use run "Fill the contact form with test data"    # Run AI agent
2browser-use run "Extract all product prices" --max-steps 50

Agent tasks use an LLM to autonomously complete complex browser tasks. Requires BROWSER_USE_API_KEY or configured LLM API key (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc).

Session Management

bash
1browser-use sessions                      # List active sessions
2browser-use close                         # Close current session
3browser-use close --all                   # Close all sessions

Server Control

bash
1browser-use server status                 # Check if server is running
2browser-use server stop                   # Stop server
3browser-use server logs                   # View server logs

Setup

bash
1browser-use install                       # Install Chromium and system dependencies

Global Options

Option	Description
`--session NAME`	Use named session (default: "default")
`--browser MODE`	Browser mode: chromium, real, remote
`--headed`	Show browser window (chromium mode)
`--profile NAME`	Chrome profile (real mode only)
`--json`	Output as JSON
`--api-key KEY`	Override API key

Session behavior: All commands without --session use the same "default" session. The browser stays open and is reused across commands. Use --session NAME to run multiple browsers in parallel.

API Key Configuration

Some features (run, extract, --browser remote) require an API key. The CLI checks these locations in order:

--api-key command line flag
BROWSER_USE_API_KEY environment variable
~/.config/browser-use/config.json file

To configure permanently:

bash
1mkdir -p ~/.config/browser-use
2echo '{"api_key": "your-key-here"}' > ~/.config/browser-use/config.json

Examples

Form Submission

bash
1browser-use open https://example.com/contact
2browser-use state
3# Shows: [0] input "Name", [1] input "Email", [2] textarea "Message", [3] button "Submit"
4browser-use input 0 "John Doe"
5browser-use input 1 "john@example.com"
6browser-use input 2 "Hello, this is a test message."
7browser-use click 3
8browser-use state  # Verify success

Multi-Session Workflows

bash
1browser-use --session work open https://work.example.com
2browser-use --session personal open https://personal.example.com
3browser-use --session work state    # Check work session
4browser-use --session personal state  # Check personal session
5browser-use close --all             # Close both sessions

Data Extraction with Python

bash
1browser-use open https://example.com/products
2browser-use python "
3products = []
4for i in range(20):
5    browser.scroll('down')
6browser.screenshot('products.png')
7"
8browser-use python "print(f'Captured {len(products)} products')"

Using Real Browser (Logged-In Sessions)

bash
1browser-use --browser real open https://gmail.com
2# Uses your actual Chrome with existing login sessions
3browser-use state  # Already logged in!

Tips

Always run browser-use state first to see available elements and their indices
Use --headed for debugging to see what the browser is doing
Sessions persist - the browser stays open between commands
Use --json for parsing output programmatically
Python variables persist across browser-use python commands within a session
Real browser mode preserves your login sessions and extensions
CLI aliases: bu, browser, and browseruse all work identically to browser-use

Troubleshooting

Browser won't start?

bash
1browser-use install                   # Install/reinstall Chromium
2browser-use server stop               # Stop any stuck server
3browser-use --headed open <url>       # Try with visible window

Element not found?

bash
1browser-use state                     # Check current elements
2browser-use scroll down               # Element might be below fold
3browser-use state                     # Check again

Session issues?

bash
1browser-use sessions                  # Check active sessions
2browser-use close --all               # Clean slate
3browser-use open <url>                # Fresh start

Cleanup

Always close the browser when done. Run this after completing browser automation:

bash
1browser-use close

browser-use — community browser-use, agentic_learning_coach, community, ide skills, Claude Code, Cursor, Windsurf

About this Skill

Agent Capability Analysis

Ideal Agent Persona

Core Value

↓ Capabilities Granted for browser-use

! Prerequisites & Limits

Browser Sandbox Environment

⚡️ Ready to unleash?

browser-use