benchmark — performance benchmarking, regression detection, and Core Web Vitals measurement for Claude Code (everything-claude-code, official)

Verified · v1.0.0 · GitHub

About this Skill

Benchmark is a performance measurement tool that helps developers detect regressions, measure performance baselines, and compare stack alternatives for optimal code performance. It is ideal for Performance Optimization Agents that need advanced benchmarking and regression-detection capabilities.

Features

Measure page performance using Core Web Vitals
Benchmark API endpoints for latency and response size
Track build performance metrics, including cold build time and hot reload time
Compare performance before and after code changes
Store baselines in .ecc/benchmarks/ for team reference

Core Topics

affaan-m · 116.8k · 15188 · Updated: 3/30/2026

Agent Capability Analysis

The benchmark skill by affaan-m is an official open-source AI agent skill for Claude Code and other IDE workflows, helping agents execute tasks with better context, repeatability, and domain-specific guidance. It is optimized for performance benchmarking and regression detection in Claude Code.

Ideal Agent Persona

Perfect for Performance Optimization Agents needing advanced benchmarking and regression detection capabilities.

Core Value

Empowers agents to measure performance baselines, detect regressions, and compare stack alternatives using Core Web Vitals (LCP, CLS, INP, FCP, and TTFB), API latency, and build performance metrics, with JSON output and Git-tracked baselines.

Capabilities Granted for benchmark

Automating performance benchmarking for web applications
Detecting regressions before and after pull requests
Comparing performance of different stack alternatives
Measuring API performance under load
Optimizing build performance for faster development feedback loops

Prerequisites & Limits

  • Requires browser MCP for page performance measurement
  • Needs API endpoints for API performance benchmarking
  • Limited to specific metrics, such as Core Web Vitals and API latency

Labs Demo

Browser Sandbox Environment

Experience this Agent in a zero-setup browser environment powered by WebContainers. No installation required.

SKILL.md

Benchmark — Performance Baseline & Regression Detection

When to Use

  • Before and after a PR to measure performance impact
  • Setting up performance baselines for a project
  • When users report "it feels slow"
  • Before a launch — ensure you meet performance targets
  • Comparing your stack against alternatives

How It Works

Mode 1: Page Performance

Measures real browser metrics via browser MCP:

1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources

Mode 2: API Performance

Benchmarks API endpoints:

1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
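The p50/p95/p99 figures in step 2 can be derived from the 100 raw samples with the standard library; a sketch (assuming latencies are collected as a flat list of milliseconds):

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict:
    """Compute p50/p95/p99 latency from raw samples in milliseconds."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    q = statistics.quantiles(samples_ms, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}

# 100 synthetic samples: mostly fast, a few slow outliers.
samples = [20.0] * 90 + [80.0] * 8 + [300.0] * 2
print(latency_percentiles(samples))
# → {'p50': 20.0, 'p95': 80.0, 'p99': 300.0}
```

Percentiles rather than averages matter here: the two 300 ms outliers barely move the mean but show up clearly at p99.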

Mode 3: Build Performance

Measures development feedback loop:

1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
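Each of the six timings above boils down to wall-clocking a shell command. A hedged sketch (the build commands in the comments are hypothetical placeholders for whatever the project actually uses):

```python
import subprocess
import sys
import time

def time_command(cmd: list[str]) -> float:
    """Run a command to completion and return wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Hypothetical usage for the steps above, e.g.:
#   cold_build = time_command(["npm", "run", "build"])
#   type_check = time_command(["npx", "tsc", "--noEmit"])
duration = time_command([sys.executable, "-c", "pass"])
print(f"step took {duration:.2f}s")
```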

Mode 4: Before/After Comparison

Run before and after a change to measure impact:

/benchmark baseline    # saves current metrics
# ... make changes ...
/benchmark compare     # compares against baseline

Output:

| Metric | Before | After | Delta  | Verdict  |
|--------|--------|-------|--------|----------|
| LCP    | 1.2s   | 1.4s  | +200ms | ⚠ WARN   |
| Bundle | 180KB  | 175KB | -5KB   | ✓ BETTER |
| Build  | 12s    | 14s   | +2s    | ⚠ WARN   |

Output

Stores baselines in .ecc/benchmarks/ as JSON. Git-tracked so the team shares baselines.
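The save/compare cycle can be sketched as plain JSON round-tripping. A minimal illustration, assuming a flat metric dict where lower is better (the 5% regression tolerance, file layout, and function names are assumptions, not the skill's actual implementation):

```python
import json
from pathlib import Path

BASELINE_DIR = Path(".ecc/benchmarks")  # location described above

def save_baseline(name: str, metrics: dict) -> Path:
    """Write a metrics snapshot as Git-trackable JSON."""
    BASELINE_DIR.mkdir(parents=True, exist_ok=True)
    path = BASELINE_DIR / f"{name}.json"
    path.write_text(json.dumps(metrics, indent=2, sort_keys=True))
    return path

def compare(name: str, current: dict, tolerance: float = 0.05) -> dict:
    """Per-metric verdict: BETTER, WARN (regressed beyond tolerance), or OK."""
    baseline = json.loads((BASELINE_DIR / f"{name}.json").read_text())
    verdicts = {}
    for key, before in baseline.items():
        after = current.get(key, before)
        if after < before:
            verdicts[key] = "BETTER"
        elif after > before * (1 + tolerance):
            verdicts[key] = "WARN"
        else:
            verdicts[key] = "OK"
    return verdicts

save_baseline("web", {"lcp_ms": 1200, "bundle_kb": 180})
print(compare("web", {"lcp_ms": 1400, "bundle_kb": 175}))
# → {'bundle_kb': 'BETTER', 'lcp_ms': 'WARN'}
```

Because the JSON is Git-tracked, a teammate checking out the branch inherits the same baseline and gets identical verdicts.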

Integration

  • CI: run /benchmark compare on every PR
  • Pair with /canary-watch for post-deploy monitoring
  • Pair with /browser-qa for full pre-ship checklist

FAQ & Installation Steps


Frequently Asked Questions

What is benchmark?

Benchmark is a performance measurement tool that helps developers detect regressions, measure performance baselines, and compare stack alternatives for optimal code performance. It is ideal for Performance Optimization Agents that need advanced benchmarking and regression-detection capabilities.

How do I install benchmark?

Run the command: npx killer-skills add affaan-m/everything-claude-code/benchmark. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for benchmark?

Key use cases include: Automating performance benchmarking for web applications, Detecting regressions before and after pull requests, Comparing performance of different stack alternatives, Measuring API performance under load, Optimizing build performance for faster development feedback loops.

Which IDEs are compatible with benchmark?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for benchmark?

Requires browser MCP for page performance measurement. Needs API endpoints for API performance benchmarking. Limited to specific metrics, such as Core Web Vitals and API latency.

How To Install

  1. Open your terminal

     Open the terminal or command line in your project directory.

  2. Run the install command

     Run: npx killer-skills add affaan-m/everything-claude-code/benchmark. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

     The skill is now active. Your AI agent can use benchmark immediately in the current project.

Related Skills

Looking for an alternative to benchmark or another official skill for your workflow? Explore these related open-source skills.

  • flags (facebook) — Use when you need to check feature flag states, compare channels, or debug why a feature behaves differently across release channels.
  • extract-errors (facebook) — Use when adding new error messages to React, or seeing unknown error code warnings.
  • fix (facebook) — Use when you have lint errors, formatting issues, or before committing code to ensure it passes CI.
  • flow (facebook) — Use when you need to run Flow type checking, or when seeing Flow type errors in React code.