benchmark — performance benchmarking, regression detection, and Core Web Vitals measurement for Claude Code (everything-claude-code, official)

Verified · v1.0.0 · GitHub

About this Skill

Benchmark is a performance measurement tool that helps developers detect regressions, measure performance baselines, and compare stack alternatives for optimal code performance. It is ideal for Performance Optimization Agents that need advanced benchmarking and regression-detection capabilities.

Features

Measure page performance using Core Web Vitals
Benchmark API endpoints for latency and response size
Track build performance metrics, including cold build time and hot reload time
Compare performance before and after code changes
Store baselines in .ecc/benchmarks/ for team reference

Core Topics

affaan-m · 116.8k · 15188 · Updated: 3/30/2026

Agent Capability Analysis

The benchmark skill by affaan-m is an official open-source AI agent skill for Claude Code and other IDE workflows, helping agents execute tasks with better context, repeatability, and domain-specific guidance. It is optimized for performance benchmarking and regression detection in Claude Code.

Ideal Agent Persona

Perfect for Performance Optimization Agents needing advanced benchmarking and regression detection capabilities.

Core Value

Empowers agents to measure performance baselines, detect regressions, and compare stack alternatives using Core Web Vitals (LCP, CLS, INP, FCP, and TTFB), API latency, and build performance metrics, with JSON output and Git-tracked baselines.

Capabilities Granted for benchmark

Automating performance benchmarking for web applications
Detecting regressions before and after pull requests
Comparing performance of different stack alternatives
Measuring API performance under load
Optimizing build performance for faster development feedback loops

Prerequisites & Limits

  • Requires browser MCP for page performance measurement
  • Needs API endpoints for API performance benchmarking
  • Limited to specific metrics, such as Core Web Vitals and API latency

Labs Demo

Browser Sandbox Environment

Experience this Agent in a zero-setup browser environment powered by WebContainers. No installation required.

SKILL.md

Benchmark — Performance Baseline & Regression Detection

When to Use

  • Before and after a PR to measure performance impact
  • Setting up performance baselines for a project
  • When users report "it feels slow"
  • Before a launch — ensure you meet performance targets
  • Comparing your stack against alternatives

How It Works

Mode 1: Page Performance

Measures real browser metrics via browser MCP:

1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources

Mode 2: API Performance

Benchmarks API endpoints:

1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
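The p50/p95/p99 figures in step 2 can be derived from the 100 raw samples with the standard library; a sketch (assuming latencies are collected as a flat list of milliseconds):

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict:
    """Compute p50/p95/p99 latency from raw samples in milliseconds."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    q = statistics.quantiles(samples_ms, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}

# 100 synthetic samples: mostly fast, a few slow outliers.
samples = [20.0] * 90 + [80.0] * 8 + [300.0] * 2
print(latency_percentiles(samples))
# → {'p50': 20.0, 'p95': 80.0, 'p99': 300.0}
```

Percentiles rather than averages matter here: the two 300 ms outliers barely move the mean but show up clearly at p99.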

Mode 3: Build Performance

Measures development feedback loop:

1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
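Each of the six timings above boils down to wall-clocking a shell command. A hedged sketch (the build commands in the comments are hypothetical placeholders for whatever the project actually uses):

```python
import subprocess
import sys
import time

def time_command(cmd: list[str]) -> float:
    """Run a command to completion and return wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

# Hypothetical usage for the steps above, e.g.:
#   cold_build = time_command(["npm", "run", "build"])
#   type_check = time_command(["npx", "tsc", "--noEmit"])
duration = time_command([sys.executable, "-c", "pass"])
print(f"step took {duration:.2f}s")
```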

Mode 4: Before/After Comparison

Run before and after a change to measure impact:

/benchmark baseline    # saves current metrics
# ... make changes ...
/benchmark compare     # compares against baseline

Output:

| Metric | Before | After | Delta  | Verdict  |
|--------|--------|-------|--------|----------|
| LCP    | 1.2s   | 1.4s  | +200ms | ⚠ WARN   |
| Bundle | 180KB  | 175KB | -5KB   | ✓ BETTER |
| Build  | 12s    | 14s   | +2s    | ⚠ WARN   |

Output

Stores baselines in .ecc/benchmarks/ as JSON. Git-tracked so the team shares baselines.
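The save/compare cycle can be sketched as plain JSON round-tripping. A minimal illustration, assuming a flat metric dict where lower is better (the 5% regression tolerance, file layout, and function names are assumptions, not the skill's actual implementation):

```python
import json
from pathlib import Path

BASELINE_DIR = Path(".ecc/benchmarks")  # location described above

def save_baseline(name: str, metrics: dict) -> Path:
    """Write a metrics snapshot as Git-trackable JSON."""
    BASELINE_DIR.mkdir(parents=True, exist_ok=True)
    path = BASELINE_DIR / f"{name}.json"
    path.write_text(json.dumps(metrics, indent=2, sort_keys=True))
    return path

def compare(name: str, current: dict, tolerance: float = 0.05) -> dict:
    """Per-metric verdict: BETTER, WARN (regressed beyond tolerance), or OK."""
    baseline = json.loads((BASELINE_DIR / f"{name}.json").read_text())
    verdicts = {}
    for key, before in baseline.items():
        after = current.get(key, before)
        if after < before:
            verdicts[key] = "BETTER"
        elif after > before * (1 + tolerance):
            verdicts[key] = "WARN"
        else:
            verdicts[key] = "OK"
    return verdicts

save_baseline("web", {"lcp_ms": 1200, "bundle_kb": 180})
print(compare("web", {"lcp_ms": 1400, "bundle_kb": 175}))
# → {'bundle_kb': 'BETTER', 'lcp_ms': 'WARN'}
```

Because the JSON is Git-tracked, a teammate checking out the branch inherits the same baseline and gets identical verdicts.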

Integration

  • CI: run /benchmark compare on every PR
  • Pair with /canary-watch for post-deploy monitoring
  • Pair with /browser-qa for full pre-ship checklist

FAQ & Installation Steps


Frequently Asked Questions

What is benchmark?

Benchmark is a performance measurement tool that helps developers detect regressions, measure performance baselines, and compare stack alternatives for optimal code performance. It is ideal for Performance Optimization Agents that need advanced benchmarking and regression-detection capabilities.

How do I install benchmark?

Run the command: npx killer-skills add affaan-m/everything-claude-code/benchmark. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for benchmark?

Key use cases include: Automating performance benchmarking for web applications, Detecting regressions before and after pull requests, Comparing performance of different stack alternatives, Measuring API performance under load, Optimizing build performance for faster development feedback loops.

Which IDEs are compatible with benchmark?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for benchmark?

Requires browser MCP for page performance measurement. Needs API endpoints for API performance benchmarking. Limited to specific metrics, such as Core Web Vitals and API latency.

How To Install

  1. Open your terminal

     Open the terminal or command line in your project directory.

  2. Run the install command

     Run: npx killer-skills add affaan-m/everything-claude-code/benchmark. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

     The skill is now active. Your AI agent can use benchmark immediately in the current project.

Related Skills

Looking for an alternative to benchmark or another official skill for your workflow? Explore these related open-source skills.

  • flags (facebook) — Use when you need to check feature flag states, compare channels, or debug why a feature behaves differently across release channels.
  • extract-errors (facebook) — Use when adding new error messages to React, or seeing unknown error code warnings.
  • fix (facebook) — Use when you have lint errors, formatting issues, or before committing code to ensure it passes CI.
  • flow (facebook) — Use when you need to run Flow type checking, or when seeing Flow type errors in React code.