browsing — community browsing, superpowers-chrome, community, ide skills, Claude Code, Cursor, Windsurf

Verified
v1.0.0
GitHub

About this Skill

Ideal for AI Agents like Claude Code and AutoGPT requiring seamless Chrome browser control via DevTools Protocol. Claude Code plugin for direct Chrome browser control via DevTools Protocol - zero dependencies

obra obra
[0]
[0]
Updated: 3/5/2026

Agent Capability Analysis

The browsing skill by obra is an open-source community AI agent skill for Claude Code and other IDE workflows, helping agents execute tasks with better context, repeatability, and domain-specific guidance.

Ideal Agent Persona

Ideal for AI Agents like Claude Code and AutoGPT requiring seamless Chrome browser control via DevTools Protocol.

Core Value

Empowers agents to control Chrome directly, enabling features like authenticated session management and multi-tab handling through the DevTools Protocol, all without dependencies and with a unified interface.

Capabilities Granted for browsing

Automating authenticated sessions
Managing multiple tabs in a running browser instance
Debugging web applications with direct browser control

! Prerequisites & Limits

  • Requires Chrome browser installation
  • Dependent on DevTools Protocol compatibility
  • Not suitable for generating screenshots or PDFs
Labs Demo

Browser Sandbox Environment

⚡️ Ready to unleash?

Experience this Agent in a zero-setup browser environment powered by WebContainers. No installation required.

Boot Container Sandbox

browsing

Install browsing, an AI agent skill for AI agent workflows and automation. Works with Claude Code, Cursor, and Windsurf with one-command setup.

SKILL.md
Readonly

Browsing with Chrome Direct

Overview

Control Chrome via DevTools Protocol using the use_browser MCP tool. Single unified interface with auto-starting Chrome.

Announce: "I'm using the browsing skill to control Chrome."

When to Use

Use this when:

  • Controlling authenticated sessions
  • Managing multiple tabs in running browser
  • Playwright MCP unavailable or excessive

Use Playwright MCP when:

  • Need fresh browser instances
  • Generating screenshots/PDFs
  • Prefer higher-level abstractions

Auto-Capture

Every DOM action (navigate, click, type, select, eval, keyboard_press) automatically saves:

  • {prefix}.png — viewport screenshot
  • {prefix}.md — page content as structured markdown
  • {prefix}.html — full rendered DOM
  • {prefix}-console.txt — browser console messages

Files are saved to the session directory with sequential prefixes (001-navigate, 002-click, etc.). You must check these before using extract or screenshot actions.

The use_browser Tool

Single MCP tool with action-based interface. Chrome auto-starts on first use.

Parameters:

  • action (required): Operation to perform
  • tab_index (optional): Tab to operate on (default: 0)
  • selector (optional): CSS selector for element operations
  • payload (optional): Action-specific data
  • timeout (optional): Timeout in ms for await operations (default: 5000)

Actions Reference

  • navigate: Navigate to URL

    • payload: URL string
    • Example: {action: "navigate", payload: "https://example.com"}
  • await_element: Wait for element to appear

    • selector: CSS selector
    • timeout: Max wait time in ms
    • Example: {action: "await_element", selector: ".loaded", timeout: 10000}
  • await_text: Wait for text to appear

    • payload: Text to wait for
    • Example: {action: "await_text", payload: "Welcome"}

Interaction

  • click: Click element

    • selector: CSS selector
    • Example: {action: "click", selector: "button.submit"}
  • type: Type text into input (append \n to submit)

    • selector: CSS selector
    • payload: Text to type
    • Example: {action: "type", selector: "#email", payload: "user@example.com\n"}
  • select: Select dropdown option

    • selector: CSS selector
    • payload: Option value(s)
    • Example: {action: "select", selector: "select[name=state]", payload: "CA"}

Extraction

  • extract: Get page content

    • payload: Format ('markdown'|'text'|'html')
    • selector: Optional - limit to element
    • Example: {action: "extract", payload: "markdown"}
    • Example: {action: "extract", payload: "text", selector: "h1"}
  • attr: Get element attribute

    • selector: CSS selector
    • payload: Attribute name
    • Example: {action: "attr", selector: "a.download", payload: "href"}
  • eval: Execute JavaScript

    • payload: JavaScript code
    • Example: {action: "eval", payload: "document.title"}

Export

  • screenshot: Capture screenshot of a specific element
    • payload: Filename
    • selector: Optional - screenshot specific element
    • Viewport screenshots are auto-captured after every DOM action. Use this only when you need a specific element.
    • Example: {action: "screenshot", payload: "/tmp/chart.png", selector: ".chart"}

Tab Management

  • list_tabs: List all open tabs

    • Example: {action: "list_tabs"}
  • new_tab: Create new tab

    • Example: {action: "new_tab"}
  • close_tab: Close tab

    • tab_index: Tab to close
    • Example: {action: "close_tab", tab_index: 2}

Browser Mode Control

  • show_browser: Make browser window visible (headed mode)

    • Example: {action: "show_browser"}
    • ⚠️ WARNING: Restarts Chrome, reloads pages via GET, loses POST state
  • hide_browser: Switch to headless mode (invisible browser)

    • Example: {action: "hide_browser"}
    • ⚠️ WARNING: Restarts Chrome, reloads pages via GET, loses POST state
  • browser_mode: Check current browser mode, port, and profile

    • Example: {action: "browser_mode"}
    • Returns: {"headless": true|false, "mode": "headless"|"headed", "running": true|false, "port": 9222, "profile": "name", "profileDir": "/path"}

Profile Management

  • set_profile: Change Chrome profile (must kill Chrome first)

    • Example: {action: "set_profile", "payload": "browser-user"}
    • ⚠️ WARNING: Chrome must be stopped first
  • get_profile: Get current profile name and directory

    • Example: {action: "get_profile"}
    • Returns: {"profile": "name", "profileDir": "/path"}

Default behavior: Chrome starts in headless mode with "superpowers-chrome" profile on a dynamically allocated port (range 9222-12111). Override with CHROME_WS_PORT env var or MCP --port=N flag.

Critical caveats when toggling modes:

  1. Chrome must restart - Cannot switch headless/headed mode on running Chrome
  2. Pages reload via GET - All open tabs are reopened with GET requests
  3. POST state is lost - Form submissions, POST results, and POST-based navigation will be lost
  4. Session state is lost - Any client-side state (JavaScript variables, etc.) is cleared
  5. Cookies/auth may persist - Uses same user data directory, so logged-in sessions may survive

When to use headed mode:

  • Debugging visual rendering issues
  • Demonstrating browser behavior to user
  • Testing features that only work with visible browser
  • Debugging issues that don't reproduce in headless mode

When to stay in headless mode (default):

  • All other cases - faster, cleaner, less intrusive
  • Screenshots work perfectly in headless mode
  • Most automation works identically in both modes

Profile management: Profiles store persistent browser data (cookies, localStorage, extensions, auth sessions).

Profile locations:

  • macOS: ~/Library/Caches/superpowers/browser-profiles/{name}/
  • Linux: ~/.cache/superpowers/browser-profiles/{name}/
  • Windows: %LOCALAPPDATA%/superpowers/browser-profiles/{name}/

When to use separate profiles:

  • Default profile ("superpowers-chrome"): General automation, shared sessions
  • Agent-specific profiles: Isolate different agents' browser state
    • Example: browser-user agent uses "browser-user" profile
  • Task-specific profiles: Testing with different user contexts
    • Example: "test-logged-in" vs "test-logged-out"

Profile data persists across:

  • Chrome restarts
  • Mode toggles (headless ↔ headed)
  • System reboots (data is in cache directory)

To use a different profile:

  1. Kill Chrome if running: await chromeLib.killChrome()
  2. Set profile: {action: "set_profile", "payload": "my-profile"}
  3. Start Chrome: Next navigate/action will use new profile

Quick Start Pattern

Navigate and extract:
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "h1"}
{action: "extract", payload: "text", selector: "h1"}

Common Patterns

Fill and Submit Form

{action: "navigate", payload: "https://example.com/login"}
{action: "await_element", selector: "input[name=email]"}
{action: "type", selector: "input[name=email]", payload: "user@example.com"}
{action: "type", selector: "input[name=password]", payload: "pass123\n"}
{action: "await_text", payload: "Welcome"}

The \n at the end of the password submits the form.

Multi-Tab Workflow

{action: "list_tabs"}
{action: "click", tab_index: 2, selector: "a.email"}
{action: "await_element", tab_index: 2, selector: ".content"}
{action: "extract", tab_index: 2, payload: "text", selector: ".amount"}

Dynamic Content

{action: "navigate", payload: "https://example.com"}
{action: "type", selector: "input[name=q]", payload: "query"}
{action: "click", selector: "button.search"}
{action: "await_element", selector: ".results"}
{action: "extract", payload: "text", selector: ".result-title"}
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "a.download"}
{action: "attr", selector: "a.download", payload: "href"}

Execute JavaScript

{action: "eval", payload: "document.querySelectorAll('a').length"}
{action: "eval", payload: "Array.from(document.querySelectorAll('a')).map(a => a.href)"}

Resize Viewport (Responsive Testing)

Use eval to resize the browser window for testing responsive layouts:

{action: "eval", payload: "window.resizeTo(375, 812); 'Resized to mobile'"}
{action: "eval", payload: "window.resizeTo(768, 1024); 'Resized to tablet'"}
{action: "eval", payload: "window.resizeTo(1920, 1080); 'Resized to desktop'"}

Note: This resizes the window, not device emulation. It won't change:

  • Device pixel ratio (retina displays)
  • Touch events
  • User-Agent string

For most responsive testing, window resize is sufficient.

Clear Cookies

Use eval to clear cookies accessible to JavaScript:

{action: "eval", payload: "document.cookie.split(';').forEach(c => { document.cookie = c.trim().split('=')[0] + '=;expires=Thu, 01 Jan 1970 00:00:00 GMT;path=/'; }); 'Cookies cleared'"}

Note: This clears cookies accessible to JavaScript. It won't clear:

  • httpOnly cookies (server-side only)
  • Cookies from other domains

For most logout/reset scenarios, this is sufficient.

Scroll Page

{action: "eval", payload: "window.scrollTo(0, document.body.scrollHeight); 'Scrolled to bottom'"}
{action: "eval", payload: "window.scrollTo(0, 0); 'Scrolled to top'"}
{action: "eval", payload: "document.querySelector('.target').scrollIntoView(); 'Scrolled to element'"}

Tips

Always wait before interaction: Don't click or fill immediately after navigate - pages need time to load.

// BAD - might fail if page slow
{action: "navigate", payload: "https://example.com"}
{action: "click", selector: "button"}  // May fail!

// GOOD - wait first
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "button"}
{action: "click", selector: "button"}

Use specific selectors: Avoid generic selectors that match multiple elements.

// BAD - matches first button
{action: "click", selector: "button"}

// GOOD - specific
{action: "click", selector: "button[type=submit]"}
{action: "click", selector: "#login-button"}

Submit forms with \n: Append newline to text to submit forms automatically.

{action: "type", selector: "#search", payload: "query\n"}

Check content first: Extract page content to verify selectors before building workflow.

{action: "extract", payload: "html"}

Troubleshooting

Element not found:

  • Use await_element before interaction
  • Verify selector with extract action using 'html' format

Timeout errors:

  • Increase timeout: {timeout: 30000} for slow pages
  • Wait for specific element instead of text

Tab index out of range:

  • Use list_tabs to get current indices
  • Tab indices change when tabs close

eval returns [object Object]:

  • Use JSON.stringify() for complex objects: {action: "eval", payload: "JSON.stringify({name: 'test'})"}
  • For async functions: {action: "eval", payload: "JSON.stringify(await yourAsyncFunction())"}

Test Automation (Advanced)

<details> <summary>Click to expand test automation guidance</summary>

When building test automation, you have two approaches:

Approach 1: use_browser MCP (Simple Tests)

Best for: Single-step tests, direct Claude control during conversation

json
1{"action": "navigate", "payload": "https://app.com"} 2{"action": "click", "selector": "#test-button"} 3{"action": "eval", "payload": "JSON.stringify({passed: document.querySelector('.success') !== null})"}

Approach 2: chrome-ws CLI (Complex Tests)

Best for: Multi-step test suites, standalone automation scripts

Key insight: chrome-ws is the reference implementation showing proper Chrome DevTools Protocol usage. When use_browser doesn't work as expected, examine how chrome-ws handles the same operation.

bash
1# Example: Automated form testing 2./chrome-ws navigate 0 "https://app.com/form" 3./chrome-ws fill 0 "#email" "test@example.com" 4./chrome-ws click 0 "button[type=submit]" 5./chrome-ws wait-text 0 "Success"

When use_browser Fails

  1. Check chrome-ws source code - It shows the correct CDP pattern
  2. Use chrome-ws to verify - Test the same operation via CLI
  3. Adapt the pattern - Apply the working CDP approach to use_browser

Common Test Automation Patterns

  • Form validation: Fill forms, check error states
  • UI state testing: Click elements, verify DOM changes
  • Performance testing: Measure load times, capture metrics
  • Screenshot comparison: Capture before/after states
</details>

Advanced Usage

For command-line usage outside Claude Code, see COMMANDLINE-USAGE.md.

For detailed examples, see EXAMPLES.md.

Protocol Reference

Full CDP documentation: https://chromedevtools.github.io/devtools-protocol/

FAQ & Installation Steps

These questions and steps mirror the structured data on this page for better search understanding.

? Frequently Asked Questions

What is browsing?

Ideal for AI Agents like Claude Code and AutoGPT requiring seamless Chrome browser control via DevTools Protocol. Claude Code plugin for direct Chrome browser control via DevTools Protocol - zero dependencies

How do I install browsing?

Run the command: npx killer-skills add obra/superpowers-chrome. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for browsing?

Key use cases include: Automating authenticated sessions, Managing multiple tabs in a running browser instance, Debugging web applications with direct browser control.

Which IDEs are compatible with browsing?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for browsing?

Requires Chrome browser installation. Dependent on DevTools Protocol compatibility. Not suitable for generating screenshots or PDFs.

How To Install

  1. 1. Open your terminal

    Open the terminal or command line in your project directory.

  2. 2. Run the install command

    Run: npx killer-skills add obra/superpowers-chrome. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. 3. Start using the skill

    The skill is now active. Your AI agent can use browsing immediately in the current project.

Related Skills

Looking for an alternative to browsing or another community skill for your workflow? Explore these related open-source skills.

View All

widget-generator

Logo of f
f

f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.

149.6k
0
AI

flags

Logo of vercel
vercel

flags is a Next.js feature management skill that enables developers to efficiently add or modify framework feature flags, streamlining React application development.

138.4k
0
Browser

zustand

Logo of lobehub
lobehub

The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking agent harness to the next level — enabling multi-agent collaboration, effortless agent team design, and introducing agents as the unit of work interaction.

72.8k
0
AI

data-fetching

Logo of lobehub
lobehub

The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking agent harness to the next level — enabling multi-agent collaboration, effortless agent team design, and introducing agents as the unit of work interaction.

72.8k
0
AI