What is agent-device?

agent-device is a command-line interface (CLI) tool for controlling iOS and Android devices. It is designed for AI agents that need device control and debugging capabilities for mobile automation and testing.

How do I install agent-device?

Run the command: npx killer-skills add callstackincubator/agent-device. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for agent-device?

Key use cases include automating mobile device interactions for reproducible testing, debugging crashes and other issues on iOS and Android devices, and running structured exploratory QA bug hunts with reporting.

Which IDEs are compatible with agent-device?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for agent-device?

agent-device requires device setup and configuration, works only with iOS and Android devices, and expects you to select a mode first: normal interaction flow, debug/crash flow, or replay maintenance flow.

Mobile Automation with agent-device

Name: agent-device
Author: callstackincubator

For exploration, use snapshot refs. For deterministic replay, use selectors. For structured exploratory QA bug hunts and reporting, use ../dogfood/SKILL.md.

Start Here (Read This First)

Use this skill as a router, not a full manual.

Pick one mode:
- Normal interaction flow
- Debug/crash flow
- Replay maintenance flow
Run one canonical flow below.
Open references only if blocked.

Decision Map

No target context yet: devices -> pick target -> open.
Normal UI task: open -> snapshot -i -> press/fill -> diff snapshot -i -> close
Debug/crash: open <app> -> logs clear --restart -> reproduce -> network dump -> logs path -> targeted grep
Replay drift: replay -u <path> -> verify updated selectors
Remote multi-tenant run: allocate lease -> point client at remote daemon base URL -> run commands with tenant isolation flags -> heartbeat/release lease
Device-scope isolation run: set iOS simulator set / Android allowlist -> run selectors within scope only
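
The map above is effectively a lookup from situation to command sequence. A minimal sketch of that lookup as a shell function follows. The function name flow_steps and the mode keywords are hypothetical labels for this example; the step strings are copied from the map and are printed, not executed.

```bash
# Hypothetical helper: print the canonical step sequence for a mode.
# Mode keywords are illustrative; the steps are quoted from the Decision Map.
flow_steps() {
  case "$1" in
    target) echo 'devices -> pick target -> open' ;;
    normal) echo 'open -> snapshot -i -> press/fill -> diff snapshot -i -> close' ;;
    debug)  echo 'open <app> -> logs clear --restart -> reproduce -> network dump -> logs path -> targeted grep' ;;
    replay) echo 'replay -u <path> -> verify updated selectors' ;;
    *)      echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

For example, flow_steps debug prints the debug/crash sequence, and an unrecognized mode returns a non-zero status.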

Canonical Flows

1) Normal Interaction Flow

bash
agent-device open Settings --platform ios
agent-device snapshot -i
agent-device press @e3
agent-device diff snapshot -i
agent-device fill @e5 "test"
agent-device close
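
One way to keep the open/close pairing intact when a step fails is a small wrapper. This is a sketch, not part of agent-device: with_session is a hypothetical helper name, and AGENT_DEVICE_BIN is a hypothetical override variable used here only so the wrapper can be dry-run with another command.

```bash
# Sketch: open a session, run the caller's command, and always close afterwards.
# AGENT_DEVICE_BIN is a hypothetical override for dry runs, not an agent-device setting.
with_session() {
  local bin="${AGENT_DEVICE_BIN:-agent-device}"
  local app="$1" platform="$2" status=0
  shift 2
  "$bin" open "$app" --platform "$platform" || return 1
  "$@" || status=$?      # run the caller's steps and remember their exit code
  "$bin" close           # close even when the steps failed
  return "$status"
}
```

Usage might look like with_session Settings ios agent-device snapshot -i. The wrapper returns the exit code of the wrapped command, so a failing step is still visible to the caller.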

2) Debug/Crash Flow

bash
agent-device open MyApp --platform ios
agent-device logs clear --restart
# Reproduce the issue in the app, then collect evidence:
agent-device network dump 25
agent-device logs path

Logging is off by default. Enable only for debugging windows. logs clear --restart requires an active app session (open <app> first).
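
The final "targeted grep" step of this flow could be scripted roughly as follows. This sketch assumes that agent-device logs path prints the session log file path on stdout, which the text above does not state explicitly. grep_session_log is a hypothetical helper name and AGENT_DEVICE_BIN a hypothetical override variable.

```bash
# Sketch of the "targeted grep" step.
# Assumption: `agent-device logs path` prints the session log file path on stdout.
# AGENT_DEVICE_BIN is a hypothetical override used for testing, not an agent-device setting.
grep_session_log() {
  local bin="${AGENT_DEVICE_BIN:-agent-device}"
  local pattern="$1" log_file
  log_file="$("$bin" logs path)" || return 1
  [ -f "$log_file" ] || { echo "log file not found: $log_file" >&2; return 1; }
  # -n adds line numbers; tail keeps only the most recent 50 matches.
  grep -n -E -- "$pattern" "$log_file" | tail -n 50
}
```

A call such as grep_session_log 'FATAL|ERROR' keeps the output small enough to read, instead of dumping the whole log.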

3) Replay Maintenance Flow

bash
agent-device replay -u ./session.ad

4) Remote Tenant Lease Flow (HTTP JSON-RPC)

bash
# Client points directly at the remote daemon HTTP base URL.
export AGENT_DEVICE_DAEMON_BASE_URL=http://mac-host.example:4310
# Placeholders are quoted so the shell does not parse < and > as redirections.
export AGENT_DEVICE_DAEMON_AUTH_TOKEN="<token>"

# Allocate lease
curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
  -H "content-type: application/json" \
  -H "Authorization: Bearer ${AGENT_DEVICE_DAEMON_AUTH_TOKEN}" \
  -d '{"jsonrpc":"2.0","id":"alloc-1","method":"agent_device.lease.allocate","params":{"runId":"run-123","tenantId":"acme","ttlMs":60000}}'

# Use lease in tenant-isolated command execution
agent-device \
  --tenant acme \
  --session-isolation tenant \
  --run-id run-123 \
  --lease-id "<lease-id>" \
  session list --json

# Heartbeat and release
curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
  -H "content-type: application/json" \
  -H "Authorization: Bearer ${AGENT_DEVICE_DAEMON_AUTH_TOKEN}" \
  -d '{"jsonrpc":"2.0","id":"hb-1","method":"agent_device.lease.heartbeat","params":{"leaseId":"<lease-id>","ttlMs":60000}}'
curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
  -H "content-type: application/json" \
  -H "Authorization: Bearer ${AGENT_DEVICE_DAEMON_AUTH_TOKEN}" \
  -d '{"jsonrpc":"2.0","id":"rel-1","method":"agent_device.lease.release","params":{"leaseId":"<lease-id>"}}'

Notes:

AGENT_DEVICE_DAEMON_BASE_URL makes the CLI skip local daemon discovery/startup and call the remote HTTP daemon directly.
AGENT_DEVICE_DAEMON_AUTH_TOKEN is sent in both the JSON-RPC request token and HTTP auth headers.
In remote daemon mode, --debug does not tail a local daemon.log; inspect logs on the remote host instead.
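
The three lease calls in this flow share one JSON-RPC 2.0 envelope and differ only in id, method, and params. A sketch that factors this out is shown here. rpc_body and rpc_call are hypothetical helper names, and the params argument is assumed to be valid JSON supplied by the caller.

```bash
# Sketch: build the JSON-RPC 2.0 envelope once and reuse it for each lease call.
# rpc_body and rpc_call are hypothetical helpers; params must already be valid JSON.
rpc_body() {
  local id="$1" method="$2" params="$3"
  printf '{"jsonrpc":"2.0","id":"%s","method":"%s","params":%s}' "$id" "$method" "$params"
}

# POST the envelope to the daemon using the same headers as the flow above.
rpc_call() {
  curl -sS "${AGENT_DEVICE_DAEMON_BASE_URL}/rpc" \
    -H "content-type: application/json" \
    -H "Authorization: Bearer ${AGENT_DEVICE_DAEMON_AUTH_TOKEN}" \
    -d "$(rpc_body "$@")"
}
```

For example, rpc_call hb-1 agent_device.lease.heartbeat '{"leaseId":"<lease-id>","ttlMs":60000}' sends the heartbeat request. The response shape is not documented here, so this sketch does not parse it.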

Command Skeleton (Minimal)

Session and navigation

bash
agent-device devices
agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
agent-device ensure-simulator --device "iPhone 16" --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device ensure-simulator --device "iPhone 16" --runtime com.apple.CoreSimulator.SimRuntime.iOS-18-4 --ios-simulator-device-set /tmp/tenant-a/simulators --boot
agent-device open [app|url] [url]
agent-device open [app] --relaunch
agent-device close [app]
agent-device install <app> <path-to-binary>
agent-device reinstall <app> <path-to-binary>
agent-device session list

Use boot only as a fallback when open cannot find or connect to a ready target.
For Android emulators by AVD name, use boot --platform android --device <avd-name>.
For Android emulators without a GUI, add --headless.
Use --target mobile|tv together with --platform (required) to pick phone/tablet or TV targets (AndroidTV/tvOS).
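
The "open first, boot only as fallback" rule can be sketched as a small function. This is an illustration under a loud assumption: any non-zero exit from open is treated as "no ready target", whereas the real CLI may fail for other reasons. open_with_boot_fallback is a hypothetical helper name and AGENT_DEVICE_BIN a hypothetical override variable.

```bash
# Sketch: try `open` first and use `boot` only as a fallback.
# Assumption: any non-zero exit from `open` means "no ready target";
# in practice, inspect the error before retrying.
# AGENT_DEVICE_BIN is a hypothetical override used for testing.
open_with_boot_fallback() {
  local bin="${AGENT_DEVICE_BIN:-agent-device}"
  local app="$1" platform="$2"
  if "$bin" open "$app" --platform "$platform"; then
    return 0
  fi
  "$bin" boot --platform "$platform" || return 1
  "$bin" open "$app" --platform "$platform"
}
```

When the first open succeeds, boot is never invoked.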

Isolation scoping quick reference:

--ios-simulator-device-set <path> scopes iOS simulator discovery + command execution to one simulator set.
--android-device-allowlist <serials> scopes Android discovery/selection to comma/space separated serials.
Scope is applied before selectors (--device, --udid, --serial); out-of-scope selectors fail with DEVICE_NOT_FOUND.
With iOS simulator-set scope enabled, iOS physical devices are not enumerated.
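
To avoid a DEVICE_NOT_FOUND round trip, a caller could check a serial against the allowlist before passing it as a selector. The sketch below is a client-side pre-check only, not the tool's own scoping logic. serial_in_allowlist is a hypothetical helper name.

```bash
# Sketch of a client-side pre-check, not agent-device's own implementation:
# split a comma- or space-separated allowlist and test membership.
serial_in_allowlist() {
  local allowlist="$1" serial="$2" entry
  for entry in $(printf '%s' "$allowlist" | tr ',' ' '); do
    if [ "$entry" = "$serial" ]; then
      return 0
    fi
  done
  return 1
}
```

For example, serial_in_allowlist "emulator-5554,device-1234" device-1234 succeeds, while a serial outside the list returns a non-zero status.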

Simulator provisioning quick reference:

Use ensure-simulator to create or reuse a named iOS simulator inside a device set before starting a session.
--device <name> is required (e.g. "iPhone 16 Pro"). --runtime <id> pins the runtime; omit to use the newest compatible one.
--boot boots it immediately. Returns udid, device, runtime, ios_simulator_device_set, created, booted.
Idempotent: safe to call repeatedly; reuses an existing matching simulator by default.

TV quick reference:

AndroidTV: open/apps use TV launcher discovery automatically.
TV target selection works on emulators/simulators and connected physical devices (AndroidTV + AppleTV).
tvOS: runner-driven interactions and snapshots are supported (snapshot, wait, press, fill, get, scroll, back, home, app-switcher, record and related selector flows).
tvOS back/home/app-switcher map to Siri Remote actions (menu, home, double-home) in the runner.
tvOS follows iOS simulator-only command semantics for helpers like pinch, settings, and push.

Snapshot and targeting

bash
agent-device snapshot -i
agent-device diff snapshot -i
agent-device find "Sign In" click
agent-device press @e1
agent-device fill @e2 "text"
agent-device is visible 'id="anchor"'

press is the canonical tap command; click is an alias.
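
Because refs such as @e1 come from a specific snapshot, it can help to pair each press with an immediate snapshot diff. A minimal sketch follows, with press_and_diff as a hypothetical helper name and AGENT_DEVICE_BIN as a hypothetical override variable for dry runs.

```bash
# Sketch: follow every press with a snapshot diff so later steps use fresh refs.
# AGENT_DEVICE_BIN is a hypothetical override used for testing.
press_and_diff() {
  local bin="${AGENT_DEVICE_BIN:-agent-device}"
  local ref="$1"
  "$bin" press "$ref" || return 1
  "$bin" diff snapshot -i
}
```

Calling press_and_diff @e1 runs the two commands from the block above in order and stops if the press fails.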

Utilities

bash
agent-device appstate
agent-device clipboard read
agent-device clipboard write "token"
agent-device keyboard status
agent-device keyboard dismiss
agent-device perf --json
agent-device network dump [limit] [summary|headers|body|all]
agent-device push <bundle|package> <payload.json|inline-json>
agent-device trigger-app-event screenshot_taken '{"source":"qa"}'
agent-device get text @e1
agent-device screenshot out.png
agent-device settings permission grant notifications
agent-device settings permission reset camera
agent-device trace start
agent-device trace stop ./trace.log

Batch (when sequence is already known)

bash
agent-device batch --steps-file /tmp/batch-steps.json --json

Performance Check

Use agent-device perf --json (or metrics --json) after open.
For detailed metric semantics, caveats, and interpretation guidance, see references/perf-metrics.md.

Guardrails (High Value Only)

Re-snapshot after UI mutations (navigation/modal/list changes).
Prefer snapshot -i; scope/depth only when needed.
Use refs for discovery, selectors for replay/assertions.
find "<query>" click --json returns { ref, locator, query, x, y } — all derived from the matched snapshot node. Do not rely on these fields from raw press/click responses for observability; use find instead.
Use fill for clear-then-type semantics; use type for focused append typing.
Use install for in-place app upgrades (keep app data when platform permits), and reinstall for deterministic fresh-state runs.
App binary format support for install/reinstall: Android .apk/.aab, iOS .app/.ipa.
Android .aab requires bundletool in PATH, or AGENT_DEVICE_BUNDLETOOL_JAR=<path-to-bundletool-all.jar> with java in PATH.
Android .aab optional: set AGENT_DEVICE_ANDROID_BUNDLETOOL_MODE=<mode> to control bundletool build-apks --mode (default: universal).
iOS .ipa: extract/install from Payload/*.app; when multiple app bundles are present, <app> is used as a bundle id/name hint.
iOS appstate is session-scoped; Android appstate is live foreground state. iOS responses include device_udid and ios_simulator_device_set for isolation verification.
iOS open responses include device_udid and ios_simulator_device_set to confirm which simulator handled the session.
Clipboard helpers: clipboard read / clipboard write <text> are supported on Android and iOS simulators; iOS physical devices are not supported yet.
Android keyboard helpers: keyboard status|get|dismiss report keyboard visibility/type and dismiss via keyevent when visible.
network dump is best-effort and parses HTTP(s) entries from the session app log file.
Biometric settings: iOS simulator supports settings faceid|touchid <match|nonmatch|enroll|unenroll>; Android supports settings fingerprint <match|nonmatch> where runtime tooling is available.
For AndroidTV/tvOS selection, always pair --target with --platform (ios, android, or apple alias); target-only selection is invalid.
push simulates notification delivery:
- iOS simulator uses APNs-style payload JSON.
- Android uses broadcast action + typed extras (string/boolean/number).
trigger-app-event requires app-defined deep-link hooks and URL template configuration (AGENT_DEVICE_APP_EVENT_URL_TEMPLATE or platform-specific variants).
trigger-app-event requires an active session or explicit selectors (--platform, --device, --udid, --serial); on iOS physical devices, custom-scheme triggers require active app context.
Canonical trigger behavior and caveats are documented in website/docs/docs/commands.md under App event triggers.
Permission settings are app-scoped and require an active session app: settings permission <grant|deny|reset> <camera|microphone|photos|contacts|notifications> [full|limited]
iOS simulator permission alerts: use alert wait then alert accept/dismiss — accept/dismiss retry internally for up to 2 s so you do not need manual sleeps. See references/permissions.md.
full|limited mode applies only to iOS photos; other targets reject mode.
On Android, non-ASCII fill/type may require an ADB keyboard IME on some system images; only install IME APKs from trusted sources and verify checksum/signature.
If using --save-script, prefer explicit path syntax (--save-script=flow.ad or ./flow.ad).
For tenant-isolated remote runs, always pass --tenant, --session-isolation tenant, --run-id, and --lease-id together.
Use short lease TTLs and heartbeat only while work is active; release leases immediately after run completion/failure.
Env equivalents for scoped runs: AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET (compat IOS_SIMULATOR_DEVICE_SET) and AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST (compat ANDROID_DEVICE_ALLOWLIST).
For explicit remote client mode, prefer AGENT_DEVICE_DAEMON_BASE_URL / --daemon-base-url instead of relying on local daemon metadata or loopback-only ports.
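
The rule that the tenant flags always travel together can be enforced with a thin wrapper. This is a sketch: tenant_run is a hypothetical helper, and TENANT_ID, RUN_ID, LEASE_ID, and AGENT_DEVICE_BIN are hypothetical variable names chosen for this example, not settings that agent-device reads.

```bash
# Sketch: refuse to run unless the tenant, run, and lease values are all present,
# then pass the four isolation flags together.
# TENANT_ID, RUN_ID, LEASE_ID and AGENT_DEVICE_BIN are hypothetical names for this example.
tenant_run() {
  local bin="${AGENT_DEVICE_BIN:-agent-device}"
  if [ -z "${TENANT_ID:-}" ] || [ -z "${RUN_ID:-}" ] || [ -z "${LEASE_ID:-}" ]; then
    echo "tenant_run: TENANT_ID, RUN_ID and LEASE_ID must all be set" >&2
    return 1
  fi
  "$bin" --tenant "$TENANT_ID" --session-isolation tenant \
    --run-id "$RUN_ID" --lease-id "$LEASE_ID" "$@"
}
```

With the variables set, tenant_run session list --json expands to the same command shown in the remote tenant lease flow.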

Security and Trust Notes

Prefer a preinstalled agent-device binary over on-demand package execution.
If install is required, pin an exact version (for example: npx --yes agent-device@<exact-version> --help).
Signing/provisioning environment variables are optional, sensitive, and only for iOS physical-device setup.
Logs/artifacts are written under ~/.agent-device; replay scripts write to explicit paths you provide.
For remote daemon mode, prefer AGENT_DEVICE_DAEMON_SERVER_MODE=http|dual on the host plus client-side AGENT_DEVICE_DAEMON_BASE_URL, with AGENT_DEVICE_HTTP_AUTH_HOOK and tenant-scoped lease admission where needed.
Keep logging off unless debugging and use least-privilege/isolated environments for autonomous runs.
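
The version-pinning advice can be checked mechanically before any on-demand execution. The sketch below accepts only a fully specified semver-style string and rejects tags and ranges. is_exact_version is a hypothetical helper name, and the pattern is a simplification of full semver.

```bash
# Sketch: accept only a fully pinned semver-style version (e.g. 1.2.3 or 1.2.3-beta.1),
# rejecting tags like "latest" and ranges like "^1.2.3". The helper name is hypothetical.
is_exact_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.-]+)?$'
}
```

A guarded invocation might then read: is_exact_version "$VERSION" && npx --yes "agent-device@$VERSION" --help, where VERSION is a variable you set yourself.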

Common Mistakes

Mixing debug flow into normal runs (keep logs off unless debugging).
Continuing to use stale refs after screen transitions.
Using URL opens with Android --activity (unsupported combination).
Treating boot as default first step instead of fallback.
