
web-scraping

v1.0.0

About this Skill

Web scraper CLI and MCP server built for human and coding agents. Ideal for Data Crawler Agents requiring efficient web page scraping and markdown file generation.

AstraBert
Updated: 3/5/2026

Quality Score

34 (Excellent, top 5%), based on code quality and docs.
Installation

Universal install (auto-detects Cursor, Windsurf, and VS Code):

```bash
npx killer-skills add AstraBert/scpr
```

Agent Capability Analysis

The web-scraping MCP Server by AstraBert is an open-source community integration for Claude and other AI agents, enabling seamless task automation and capability expansion.

Ideal Agent Persona

Ideal for Data Crawler Agents requiring efficient web page scraping and markdown file generation.

Core Value

Empowers agents to scrape web pages using the `scpr` command line interface, handling recursive and parallel scraping with options like `--recursive` and `--max`, while saving outputs as markdown files.

Capabilities Granted for web-scraping MCP Server

  • Scraping single web pages for data extraction
  • Recursively scraping linked pages within a domain for comprehensive data collection
  • Speeding up data collection with parallel scraping

Prerequisites & Limits

  • Requires CLI access
  • Limited to scraping pages with allowed domains
  • Dependent on network connectivity for web page access
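
A minimal preflight sketch can verify these prerequisites before an agent starts a job. This is illustrative shell, not part of scpr; it assumes a POSIX environment with curl available:

```bash
# Illustrative preflight, not an scpr feature: confirm CLI access and network connectivity.
command -v scpr >/dev/null 2>&1 || { echo "scpr CLI not found on PATH" >&2; exit 1; }
curl -fsI https://example.com >/dev/null || { echo "target unreachable; check network" >&2; exit 1; }
```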
Project

  • SKILL.md (1.2 KB)
  • .cursorrules (1.2 KB)
  • package.json (240 B)

SKILL.md

When asked to scrape a web page, use the scpr command line interface.

Basic usage (scrape a single page):

```bash
scpr --url https://example.com --output ./scraped
```

This will scrape the page and save it as a markdown file in the ./scraped folder.

Recursive scraping

To scrape a page and all linked pages within the same domain:

```bash
scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 3
```

Parallel scraping

Speed up recursive scraping with multiple threads:

```bash
scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 2 --parallel 5
```

Additional options

  • --log - Set logging level (info, debug, warn, error)
  • --max - Maximum depth of pages to follow (default: 1)
  • --parallel - Number of concurrent threads (default: 1)
  • --allowed - Allowed domains for recursive scraping (can be specified multiple times)
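
For instance, these flags compose into a single deeper crawl with verbose logging. The sketch below is built only from the options listed above; docs.example.com stands in for a hypothetical second permitted domain, and exact log output depends on the scpr version:

```bash
# Sketch combining the documented flags; repeat --allowed once per permitted domain.
scpr --url https://example.com --output ./scraped \
  --recursive --allowed example.com --allowed docs.example.com \
  --max 3 --parallel 4 --log debug
```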

For more details, run:

```bash
scpr --help
```

Once you are done scraping, scan the output folder to find the content the user asked for. Here is an example flow:

```bash
scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 2
cd ./scraped
grep -r "pattern of interest"
```
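
When a plain grep is not enough, ordinary tools work directly on the generated markdown. The following is a sketch using standard utilities (GNU grep's --include and -C flags), not anything scpr-specific:

```bash
# Standard utilities, not scpr features: list scraped files, then search with context.
find ./scraped -name '*.md' -print
grep -rn -C 2 --include='*.md' "pattern of interest" ./scraped
```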

Related Skills

Looking for an alternative to web-scraping, or building a community AI agent? Explore these related open-source MCP Servers:

  • widget-generator, by f (Design)
  • flags, by vercel (Browser): a Next.js feature management skill that enables developers to efficiently add or modify framework feature flags, streamlining React application development.
  • zustand, by lobehub (Communication)
  • data-fetching, by lobehub (Communication)