
Browse and install thousands of AI Agent skills in the Killer-Skills directory. Supports Claude Code, Windsurf, Cursor, and more.

22 available skills

typescript-sdk

comet-ml

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

17.8k
0
AI

hugging-face-evaluation

[ Official ]
huggingface

Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom model evaluations with vLLM/lighteval. Works with the model-index metadata format.

8.2k
0
AI

agent-evaluation

oimiragieo

agent-evaluation is an LLM-as-judge evaluation framework that assesses the quality of AI-generated content using a weighted composite score and a structured verdict with evidence citations.

14
0
Developer

evaluation

mshraditya

Evaluation is the process of assessing agent systems, which requires approaches that account for dynamic decision-making and non-deterministic behavior.

0
0
Developer

e2e

langwatch

The platform for LLM evaluations and AI agent testing

2.8k
0
AI

deep-research

[ Featured ]
affaan-m

deep-research is a skill that uses the firecrawl and exa MCPs to synthesize findings from multiple sources, delivering comprehensive reports with source attribution.

105.8k
0
Developer

debug-stuck-eval

METR

Running UK AISI's Inspect in the Cloud

20
0
AI

huggingface-community-evals

[ Official ]
huggingface

huggingface-community-evals is a skill for running local evaluations of Hugging Face models using inspect-ai and lighteval.

10.0k
0
AI

api-rules

skysheng7

api-rules is a Python-based skill for evaluating different large language models (LLMs) via the OpenAI API and supporting libraries.

0
0
Developer

context-loader

miyataSUPER

Multi-LLM comparison and evaluation framework for coaching scenarios

0
0
Developer

prd-phase

wiggitywhitney

Research and evaluation framework for AI-powered telemetry instrumentation agents

0
0
Developer

sibyl-supervisor

Sibyl-Research-Team

sibyl-supervisor is a fully autonomous AI research system with self-evolution capabilities, designed to automate AI coding tasks on Claude Code.

150
0
Developer