Web Scraper

from shob (460 stars)

Extract and crawl website content at scale — supports direct HTTP, browser rendering, and batch extraction for structured JSON output.

triggers: scrape, crawl, extract content, monitor website, extract tables, batch url processing

GitHub SKILL.md

What it does

This skill provides general-purpose web scraping and crawling. It fetches pages over HTTP, falls back to a headless browser for client-rendered sites, and returns structured JSON containing the extracted text, tables, and lists. Use it to extract article bodies or product data, or to monitor site changes programmatically. It also supports batch URL processing and handles pagination for multi-page datasets.
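The HTTP-first, browser-fallback flow described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the names `StructureParser`, `extract`, `scrape`, and `min_chars`, the output keys, and the "page looks empty" heuristic are all assumptions, and the two fetchers are passed in as plain callables so the sketch runs without a network or a browser.

```python
from html.parser import HTMLParser


class StructureParser(HTMLParser):
    """Collects visible text, table rows, and list items from an HTML string."""

    def __init__(self):
        super().__init__()
        self.text, self.tables, self.lists = [], [], []
        self._cell = None  # buffer for the current <td>/<th>, else None
        self._item = None  # buffer for the current <li>, else None
        self._skip = 0     # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th"):
            self._cell = []
        elif tag in ("ul", "ol"):
            self.lists.append([])
        elif tag == "li":
            self._item = []

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag in ("td", "th") and self._cell is not None:
            if self.tables and self.tables[-1]:
                self.tables[-1][-1].append(" ".join(self._cell))
            self._cell = None
        elif tag == "li" and self._item is not None:
            if self.lists:
                self.lists[-1].append(" ".join(self._item))
            self._item = None

    def handle_data(self, data):
        chunk = data.strip()
        if not chunk or self._skip:
            return
        if self._cell is not None:
            self._cell.append(chunk)
        elif self._item is not None:
            self._item.append(chunk)
        else:
            self.text.append(chunk)


def extract(html):
    """Turn raw HTML into a JSON-serializable dict of text, tables, and lists."""
    parser = StructureParser()
    parser.feed(html)
    parser.close()
    return {"text": " ".join(parser.text), "tables": parser.tables, "lists": parser.lists}


def scrape(url, http_fetch, browser_fetch, min_chars=40):
    """Try plain HTTP first; re-fetch with the browser if almost nothing was extracted."""
    result, mode = extract(http_fetch(url)), "http"
    looks_empty = (
        len(result["text"]) < min_chars and not result["tables"] and not result["lists"]
    )
    if looks_empty:  # assumed heuristic for a client-rendered page shell
        result, mode = extract(browser_fetch(url)), "browser"
    return {"url": url, "mode": mode, **result}
```

In practice `http_fetch` might wrap `urllib.request.urlopen` and `browser_fetch` a headless-browser driver; both are left to the caller here. The returned dict can be passed straight to `json.dumps`.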

When to use it

Use this skill when you need dependable content extraction from static or JavaScript-heavy sites, for example to collect product information, scrape tables, or build datasets from public web pages. Prefer it for research, data mining, monitoring site updates, and cases where you want well-structured JSON output. Respect site terms of service and rate limits.
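One way to honor a site's crawling rules before extraction is to consult its robots.txt. The sketch below uses Python's standard `urllib.robotparser`; the helper names `build_policy` and `allowed_urls`, the `example-scraper` user agent, and the one-second default delay are illustrative assumptions, and nothing in the listing confirms that the skill performs this check itself.

```python
from urllib.robotparser import RobotFileParser


def build_policy(robots_txt, user_agent="example-scraper", default_delay=1.0):
    """Parse robots.txt text and return (parser, seconds to wait between requests)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # None when no Crawl-delay applies
    return parser, float(delay) if delay is not None else default_delay


def allowed_urls(urls, parser, user_agent="example-scraper"):
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]
```

A caller would fetch `/robots.txt` once per host, build the policy, filter the URL list, and then sleep for the returned delay between requests. Terms of service still need to be read by a person; robots.txt covers only crawler access rules.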

What's included

Scripts: none bundled with SKILL.md (examples present in the body).
References: none attached.
Instructions: clear task JSON format examples, extraction modes (auto, curl-only, browser-only), and best-practice guidance for delays and error handling. The body describes usage patterns for single-page extraction, batch processing, and data-mining workflows.
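As a rough illustration of how the three extraction modes, request delays, and error handling could fit together in a batch run, consider the sketch below. The task JSON schema is not reproduced on this page, so the field names `urls`, `mode`, `delay_seconds`, and `retries` are hypothetical, as are the function names and result keys. Here `auto` falls back to the browser when the HTTP fetch raises an error, which is an assumption about what "auto" means.

```python
import time

# Which fetchers each mode may use, in order of preference.
MODES = {
    "auto": ("http", "browser"),
    "curl-only": ("http",),
    "browser-only": ("browser",),
}


def fetch_one(url, mode, fetchers, retries, delay, sleep):
    """Fetch a single URL, retrying with exponential backoff before giving up."""
    last_error = None
    for name in MODES[mode]:
        for attempt in range(retries + 1):
            try:
                return {"url": url, "ok": True, "via": name, "html": fetchers[name](url)}
            except Exception as exc:  # keep the batch alive; record the failure
                last_error = exc
                if attempt < retries:
                    sleep(delay * 2 ** attempt)
    return {"url": url, "ok": False, "error": str(last_error)}


def run_batch(task, http_fetch, browser_fetch, sleep=time.sleep):
    """Process every URL in a task dict, pausing between requests."""
    mode = task.get("mode", "auto")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    delay = task.get("delay_seconds", 1.0)
    retries = task.get("retries", 1)
    fetchers = {"http": http_fetch, "browser": browser_fetch}
    results = []
    for index, url in enumerate(task["urls"]):
        if index:  # no pause before the first request
            sleep(delay)
        results.append(fetch_one(url, mode, fetchers, retries, delay, sleep))
    return results
```

A failed URL produces an entry with `"ok": False` instead of aborting the batch, so one bad page does not discard the rest of the dataset. The `sleep` parameter is injectable mainly so the pacing can be tested without waiting.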

Compatible agents

Works with agent tooling that can run HTTP requests or control headless browsers (examples: Claude Code-style agents, Cursor or Copilot integrations, or custom Node/Python runtimes).

Not yet audited

This skill has not been reviewed by our automated audit pipeline yet.

Information

Repository: shob
Stars: 460

Related Skills

AWP (Agent Work Protocol)

Tooling and scripts for onboarding, staking, allocation, and managing agents on the AWP network (Base/Ethereum/Arbitrum/BSC). Includes safe, opt-in daemon and r

Markdrop

Convert PDFs to structured Markdown or interactive HTML and generate AI-powered descriptions for images and tables using multiple LLM providers.

Cost Tracker

Monitor agent session costs, set budget alerts, and get actionable token-spend optimizations to keep multi-session workflows within budget.

ClawPod / Massive Unblocker

Bypass anti-bot restrictions and fetch rendered HTML or structured search results via Massive's Unblocker API (handles CAPTCHAs, JS rendering, geo-restrictions)

Canary — Post-Deploy Visual Monitor

Run a short post-deploy monitor that captures screenshots, checks console errors, and compares performance against baselines to detect regressions and page fail

Monitoring Stack Deployer

Deploy and configure production-ready monitoring stacks (Prometheus, Grafana, Datadog) with collectors, dashboards, and alerting rules for Kubernetes, Docker, o

Azure External Attack Surface Management

Provides expert guidance for Azure External Attack Surface Management (EASM): quotas, configuration, integrations, and exporting inventory to analytics platform

Cross-Project Analytics

Query local, privacy-safe cross-project analytics to report on agent, skill, hook, and team performance; replay sessions and estimate token costs.
