PDF Skill

Name: PDF Skill
Rating: 67 (1 review)
Author: sciclaw

Trust Score 67/100

Handles common PDF tasks: extract text and tables, merge/split files, rotate/watermark pages, and run OCR to make scanned documents searchable.

Triggers: extract text from pdf, ocr this scanned document, merge or split pdfs, rotate pages, watermark pdf, extract tables from pdf

What it does

The PDF Skill provides pragmatic, scriptable operations for working with PDF artifacts in reproducible workflows. It helps agents extract text and tables, merge or split documents, rotate or watermark pages, and perform OCR on scanned pages. The skill emphasises preservation of page order and metadata, validation of output page counts, and recording provenance for transformations.
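The page-order preservation and page-count validation described above could look roughly like the following. This is a minimal sketch, assuming pypdf is installed; the helper names `check_page_count` and `merge_pdfs` are hypothetical and do not come from the skill, and copying the first input's metadata is only one possible reading of "preserving metadata".

```python
def check_page_count(expected: int, actual: int) -> None:
    """Raise if a transformation changed the number of pages unexpectedly."""
    if expected != actual:
        raise ValueError(f"page count mismatch: expected {expected}, got {actual}")


def merge_pdfs(inputs: list[str], output: str) -> int:
    """Merge inputs in the given order and return the validated page count.

    Hypothetical helper, not part of the skill; assumes pypdf is installed.
    """
    from pypdf import PdfReader, PdfWriter  # imported lazily so the check above works without pypdf

    writer = PdfWriter()
    expected = 0
    first_metadata = None
    for path in inputs:
        reader = PdfReader(path)
        # Assumption: keep the metadata of the first input that has any.
        if first_metadata is None and reader.metadata:
            first_metadata = dict(reader.metadata)
        for page in reader.pages:  # pages are appended in their original order
            writer.add_page(page)
        expected += len(reader.pages)

    if first_metadata:
        writer.add_metadata(first_metadata)
    with open(output, "wb") as fh:
        writer.write(fh)

    # Re-open the result and confirm no pages were lost or duplicated.
    check_page_count(expected, len(PdfReader(output).pages))
    return expected
```

Calling `merge_pdfs(["a.pdf", "b.pdf"], "merged.pdf")` (file names are placeholders) would either return the combined page count or raise `ValueError` if the output does not match the sum of the inputs.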

When to use it

Use this skill whenever a user asks to read, transform, or extract data from PDF files. Typical examples are extracting tables for data analysis, converting scanned reports to searchable text, splitting a multi-article PDF into separate files, and applying consistent watermarks to generated documents. It is intended for automated pipelines where reproducibility and validation matter.

What's included

Scripts: none bundled in this SKILL.md (examples reference pypdf usage)
References: the skill notes reference common Python libraries and rules for reproducibility
Instructions: working rules include preserving metadata, validating page counts, and recording command-level provenance. Quick code snippets show how to inspect page counts with pypdf.
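The listing does not say what form the recorded provenance takes. One way to capture command-level provenance, sketched here with the standard library only, is to log the command together with content hashes of its inputs and outputs. The function names and the JSON Lines log format are assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def provenance_record(command: list[str], inputs: list[str], outputs: list[str]) -> dict:
    """Build a JSON-serialisable record of one transformation step."""
    return {
        "command": command,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "inputs": {p: sha256_of(p) for p in inputs},
        "outputs": {p: sha256_of(p) for p in outputs},
    }


def append_provenance(record: dict, log_path: str) -> None:
    """Append the record as one line of JSON (hypothetical log format)."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Hashing inputs and outputs lets a later run confirm that the same command on the same bytes produced the same result, which is the property a reproducible pipeline usually needs.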

Compatible agents

The skill is nominally language- and tool-agnostic, but its examples target agents with Python runtime support (Copilot/Codex, Claude Code, and other automation agents that can run Python snippets). It is well suited to CLI-capable agents integrated into reproducible research pipelines.

Audit Summary

The PDF skill provides minimal instructions for common PDF tasks like text extraction, merging, splitting, and OCR. It contains only a brief SKILL.md with a short pypdf code snippet and no bundled scripts. The skill is essentially a thin wrapper around pypdf with no executable automation, making it more of a reference card than a functional skill.

Watch Out

No scripts included — relies entirely on agent improvisation with pypdf
No error handling or dependency management guidance
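Since the skill gives no dependency guidance, an agent may want to check for required packages before starting any work. The following is a minimal sketch using only the standard library; `missing_dependencies` and `require` are hypothetical names, and the list of modules to check is up to the caller.

```python
import importlib.util


def missing_dependencies(modules: list[str]) -> list[str]:
    """Return the top-level module names that cannot be imported."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def require(modules: list[str]) -> None:
    """Fail early with a clear message instead of a mid-run ImportError."""
    missing = missing_dependencies(modules)
    if missing:
        raise RuntimeError("missing Python modules: " + ", ".join(missing))
```

For example, `require(["pypdf"])` at the top of a script would stop before any file is touched if pypdf is not available. Note that the import name of a package can differ from the name used to install it.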

Notes

Very thin skill: mostly a trigger-phrase list and a single pypdf code snippet. It is attributed to an official Anthropic source, but the content is minimal. It would benefit from actual scripts for common operations and clearer output contracts.

Information

Repository: sciclaw
Stars: 9

Trust Score

Overall: 67
Security: 98
Code Quality: 38
Architecture: 35
Usefulness: 55
