e2e
The platform for LLM evaluations and AI agent testing
Browse and install thousands of AI Agent skills in the Killer-Skills directory. Supports Claude Code, Windsurf, Cursor, and more.
Self-hosted observability for AI coding agents. Clone. Configure. See.
Claude Code Session Dashboard — local observability for ~/.claude sessions
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Traceway: observability for LLMs
Observability is a skill that enables real-time monitoring of PAI multi-agent activity, providing insights into agent performance and behavior through WebSocket streaming.
Observability-guidelines is a set of principles and guidelines for ensuring comprehensive visibility into distributed systems and microservices, promoting modular design and test-driven development.
Claude Code plugin for mobile app observability: crash reporting, performance monitoring, and instrumentation for iOS, Android, and React Native
A Model Context Protocol (MCP) server for Langfuse, enabling AI agents to query Langfuse trace data for enhanced debugging and observability
Reviews and authors Cloudflare Workers code against production best practices. Load when writing new Workers, reviewing Worker code, configuring wrangler.jsonc, or checking for common Workers anti-patterns (streaming, floating promises, global state, secrets, bindings, observability). Biases towards retrieval from Cloudflare docs over pre-trained knowledge.
AgentStack is a production-grade multi-agent framework built on Mastra, delivering 50+ enterprise tools, 25+ specialized agents, and A2A/MCP orchestration for scalable AI systems. Focuses on financial intelligence, RAG pipelines, observability, and secure governance. ACP Openclaw, Gemini CLI, Opencode
FastAPI Cursor Starter Kit: production backend template for Cursor/AI teams. API-only, layered (endpoints→services→repos), Celery+RabbitMQ RPC, PostgreSQL+Alembic. Built-in observability and quality (metrics, idempotency, Black, mypy, pre-commit). Cursor rules keep AI-generated code aligned so you extend without fighting the structure.