Paper Claim Audit

Trust Score 91/100

Zero-context verification that every numeric claim in a paper matches the raw result files; detects improper rounding, cherry-picking, config mismatches, and scope overclaims

Triggers: paper-claim-audit, verify numbers, check paper claims, 审查论文数据 (audit paper data), verify numbers against raw results

What it does

A zero-context auditor that compares quantitative claims in LaTeX paper sources against raw experiment result files to verify numbers, percentages, and comparative statements. It enforces strict rules (no executor summaries, standard rounding only) and produces machine-readable audit artifacts (JSON + Markdown) with detailed per-claim status.
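
The "standard rounding only" rule can be illustrated with a minimal sketch. It is not taken from the skill itself: the function names are hypothetical, the regex is deliberately naive, and "standard rounding" is assumed here to mean round-half-up to the precision printed in the paper.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Naive, hypothetical pattern: plain decimal numbers such as 91.3 or 88.75.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def extract_numbers(tex_line: str) -> list[str]:
    """Return the numeric literals in one line of LaTeX source, kept as strings."""
    return NUMBER_RE.findall(tex_line)


def matches_by_standard_rounding(claimed: str, raw: str) -> bool:
    """Check that `raw`, rounded to the precision of `claimed`, equals `claimed`.

    Assumption: "standard rounding" means round-half-up. Values are passed as
    strings so that Decimal sees exactly the digits written in the files.
    """
    claimed_dec = Decimal(claimed)
    # Exponent of the claimed value, e.g. -1 for "91.3" and 0 for "92".
    exponent = claimed_dec.as_tuple().exponent
    quantum = Decimal(1).scaleb(exponent)
    rounded = Decimal(raw).quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded == claimed_dec
```

For example, a paper claim of 91.3 is consistent with a raw value of 91.26 but not with 91.24, since the latter rounds to 91.2.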

When to use it

Run before submission or after automated paper-generation steps when you need a forensic check that the paper's reported numbers are faithful to raw outputs. Useful as a pre-flight for reproducibility, CI checks on paper pipelines, or editorial review.

What's included

Scripts: none bundled in this skill (has_scripts=false)
References: none packaged (has_references=false)
Instructions: stepwise audit protocol, output schema for PAPER_CLAIM_AUDIT.json, and clear rules for allowed inputs (only .tex + raw result files; explicit exclusion of summaries). Also documents reviewer model and trace-saving conventions.
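
The allowed-inputs rule (only .tex plus raw result files, with summaries excluded) could be enforced with a filter along the following lines. This is a sketch under stated assumptions: the listing does not say which file types count as raw results or how summaries are named, so the suffix set, the filename markers, and the function names are all hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: file types treated as raw results (not specified by the skill).
RAW_RESULT_SUFFIXES = {".json", ".jsonl", ".csv", ".log"}
# Assumption: filename fragments that mark a file as a summary.
SUMMARY_MARKERS = ("summary", "report", "readme")


def is_allowed_input(path: str) -> bool:
    """Return True for .tex sources and raw result files, False otherwise."""
    p = PurePosixPath(path)
    name = p.name.lower()
    # Summaries are rejected first, even if their extension looks like raw data.
    if any(marker in name for marker in SUMMARY_MARKERS):
        return False
    suffix = p.suffix.lower()
    return suffix == ".tex" or suffix in RAW_RESULT_SUFFIXES


def select_audit_inputs(paths: list[str]) -> list[str]:
    """Keep only the files the auditor may read, in sorted order."""
    return sorted(p for p in paths if is_allowed_input(p))
```

Rejecting summaries before checking the extension means a file such as notes/summary.json is dropped even though .json would otherwise qualify.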

Compatible agents

Best fit for Claude Code / Codex-style agents and any agent orchestration that can spawn a fresh reasoning thread and read files. The skill describes using strong reasoning models (e.g., GPT-5.x-class) as a separate reviewer thread.

Audit Summary

Paper Claim Audit verifies numeric claims in academic papers against raw result files. It is a SKILL.md-only skill with no executable scripts. The documentation is detailed and well structured, covering an audit protocol with specific failure modes, a cross-model reviewer pattern, and a JSON schema output contract. It targets a niche but real need for ML researchers preparing paper submissions. No security concerns were found: allowed-tools is broad, but appropriate for a file-auditing task. The source path stored in the DB pointed to the wrong directory (other/ instead of analysis/), and the raw content fetch returned 404, suggesting the skill was moved or reorganized.

Watch Out

Source path in DB (skills/other/) returns 404; actual path is skills/analysis/
Niche audience — primarily ML researchers doing paper submission prep
Depends on spawn_agent/cross-model reviewer pattern that may not be available in all agent environments

Notes

The SKILL.md body was null in the fetch output, and the GitHub raw URL returned 404 at the stored path. The content was located via the GitHub API at skills/analysis/ instead of skills/other/. The skill is well crafted, with a detailed audit protocol, a zero-context reviewer pattern, and a proper verdict decision table. It uses allowed-tools: Bash(*), which is broad but appropriate for the task scope.

Information

Repository: claude-skill-registry
Stars: 466

Trust Score

Overall: 91
Security: 95
Code Quality: 85
Architecture: 88
Usefulness: 45
