PDF Processing (OpenAkita)

Rating: 85 (1 review)
Author: openakita

Trust Score 85/100

Process PDFs: extract text and tables, merge/split/rotate pages, OCR scanned PDFs, manipulate metadata, protect or watermark files, and generate new PDFs progra

Triggers: pdf, OCR, extract tables, merge PDFs, split PDF, rotate pages


What it does

This skill equips an agent to handle common PDF tasks using Python libraries and command-line tools: extract text and tables (pdfplumber, pypdf), merge and split PDFs, rotate pages, add watermarks, perform OCR on scanned documents (pytesseract + pdf2image), extract images, fill forms, and create PDFs (reportlab). It also documents command-line utilities (pdftotext, qpdf, pdftk) and provides runnable code snippets.
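As an illustration of the merge and text-extraction tasks listed above, here is a minimal sketch using pypdf. It is not taken from the skill's own scripts; the function names `merge_pdfs` and `extract_text` are hypothetical, and pypdf is imported lazily because it is not installed by default.

```python
def merge_pdfs(input_paths, output_path):
    """Concatenate the pages of several PDFs into one file; return the page count."""
    # Lazy import: pypdf is a third-party package and may be missing.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    page_count = 0
    for path in input_paths:
        reader = PdfReader(str(path))
        for page in reader.pages:
            writer.add_page(page)
            page_count += 1
    with open(output_path, "wb") as handle:
        writer.write(handle)
    return page_count


def extract_text(pdf_path):
    """Return the text of every page, joined by newlines."""
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    # Scanned pages usually yield no text; those need the OCR route instead.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

A call such as `merge_pdfs(["a.pdf", "b.pdf"], "merged.pdf")` (file names are placeholders) would write the combined document and return the number of pages copied.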

When to use it

Use when the user mentions a .pdf file or asks to read, extract, transform, or produce PDFs—examples include converting scanned documents to searchable text, extracting tabular data into spreadsheets, merging reports, adding watermarks or passwords, and programmatically generating reports.
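For the "tabular data into spreadsheets" case, one possible approach is to pull tables with pdfplumber and write each one to CSV. This is a sketch under assumptions, not the skill's own code: the helper names `extract_tables` and `write_table_csv` are hypothetical, and pdfplumber must be installed separately.

```python
import csv


def extract_tables(pdf_path):
    """Collect every table pdfplumber detects, page by page."""
    import pdfplumber  # lazy import: optional third-party dependency

    tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # Each table is a list of rows; each row is a list of cell strings or None.
            tables.extend(page.extract_tables())
    return tables


def write_table_csv(table, csv_path):
    """Write one table to CSV, mapping empty cells (None) to ''. Returns the row count."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for row in table:
            writer.writerow(["" if cell is None else cell for cell in row])
    return len(table)
```

Keeping the CSV writer separate from the extraction step means the output side can be checked without a PDF or pdfplumber at hand.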

What's included

Scripts: examples and utility snippets for Python (pypdf, pdfplumber, reportlab) and CLI tools (pdftotext, qpdf, pdftk)
References: FORMS.md, REFERENCE.md and other repository docs for advanced usage
Instructions: step-by-step snippets for merging, splitting, extracting text/tables, OCR workflow, creating watermarks, and password-protecting PDFs.

Compatible agents

Compatible with agents that can run Python or command-line tools and have access to the filesystem for reading/writing PDF files.

Audit Summary

Comprehensive PDF processing skill covering text extraction, merging, splitting, OCR, form filling, and watermarking. Nine scripts are provided. Only the bounding-box tests ran cleanly: 6 scripts failed on missing Python dependencies (pypdf, pdfplumber, pdf2image, Pillow) and 2 needed CLI arguments. SKILL.md is well-structured, with clear instructions and progressive disclosure to REFERENCE.md and FORMS.md.

Watch Out

Requires pypdf, pdfplumber, pdf2image, Pillow — not installed by default
fill_fillable_fields.py monkey-patches pypdf internals, which may break when the pypdf version changes
Most scripts require specific CLI args and exit with usage help if omitted
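The "exit with usage help" behavior noted above is what Python's `argparse` gives by default when required positional arguments are missing. The sketch below shows that pattern in general; the script name `rotate_example.py` and its arguments are hypothetical and do not describe any of the skill's actual scripts.

```python
import argparse


def build_parser():
    """Build a parser with two required positionals and one optional flag."""
    parser = argparse.ArgumentParser(
        prog="rotate_example.py",  # hypothetical script name
        description="Rotate every page of a PDF.",
    )
    parser.add_argument("input_pdf", help="path of the PDF to read")
    parser.add_argument("output_pdf", help="path of the PDF to write")
    parser.add_argument("--degrees", type=int, choices=[90, 180, 270], default=90)
    return parser


if __name__ == "__main__":
    # With no arguments, argparse prints a usage message and exits with status 2.
    args = build_parser().parse_args()
    print(args.input_pdf, args.output_pdf, args.degrees)
```

Running a script of this shape with `--help` prints the expected arguments, which is a quick way to find out what a given script needs before invoking it.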

Missing Dependencies

pypdf, pdf2image, pdfplumber, Pillow
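Before running the scripts, it may help to check which of these packages are importable. The following is a small standard-library sketch; the `REQUIRED` mapping and the function name `missing_dependencies` are illustrative assumptions. Note that Pillow is installed under the name `Pillow` but imported as `PIL`.

```python
import importlib.util

# Install name -> import name (Pillow is imported as "PIL").
REQUIRED = {
    "pypdf": "pypdf",
    "pdfplumber": "pdfplumber",
    "pdf2image": "pdf2image",
    "Pillow": "PIL",
}


def missing_dependencies(required=REQUIRED):
    """Return the install names of packages whose import name cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks the Python packages. OCR with pdf2image and pytesseract also depends on system binaries, which this sketch does not detect.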

Notes

No security concerns. The monkey-patching in fill_fillable_fields.py is a workaround for a pypdf issue — functional but fragile. Skill is well-documented with practical examples. The proprietary license is worth noting.

Information

Repository: openakita
Stars: 1,766

Trust Score

Overall: 85
Security: 95
Code Quality: 73
Architecture: 78
Usefulness: 82


