Skill Creator

Trust Score 84/100

Create, improve, and evaluate Agent Skills with a guided workflow: capture intent, draft SKILL.md, run evals and benchmarks, and optimize the triggering description.

Triggers: create skill, write SKILL.md, run evals, benchmark skill, optimize description, skill creator


What it does

A complete authoring and evaluation workflow for Agent Skills. Guides an author through interviewing the user, drafting SKILL.md content, creating test cases, running with-skill and baseline evaluations, grading results, and producing a reviewer report and benchmark. Also includes tools for iterating on descriptions to improve triggering accuracy.

When to use it

Use this skill when you need to create a new Agent Skill from a user conversation, improve an existing SKILL.md, run repeatable evals and benchmarks, or optimize a skill's frontmatter and triggers for better activation. Useful when you want structured test cases, reproducible grading, and an HTML reviewer for human feedback.

What's included

Scripts: yes (eval/viewer and packaging scripts present)
References: yes (grading, analyzer, comparator schemas and conventions)
Instructions: step-by-step guidance for capturing intent, writing SKILL.md, spawning eval runs (with-skill and baseline), drafting assertions, grading, aggregating benchmarks, and launching a static reviewer. Also covers description optimization and packaging.
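
As an illustration of the with-skill versus baseline comparison described above, here is a minimal sketch of how graded runs might be aggregated into a benchmark delta. The GradedRun record, its field names, and the sample numbers are all hypothetical; the skill's real grading schema lives in its references and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class GradedRun:
    """One graded eval run. Hypothetical shape, not the skill's real schema."""
    case_id: str
    variant: str  # "with_skill" or "baseline"
    passed: int   # number of assertions that passed
    total: int    # number of assertions checked


def pass_rate(runs: list[GradedRun], variant: str) -> float:
    """Fraction of assertions passed across all runs of one variant."""
    selected = [r for r in runs if r.variant == variant]
    total = sum(r.total for r in selected)
    return sum(r.passed for r in selected) / total if total else 0.0


# Made-up sample data, purely to show the arithmetic.
runs = [
    GradedRun("case-1", "with_skill", 4, 4),
    GradedRun("case-1", "baseline", 2, 4),
    GradedRun("case-2", "with_skill", 3, 4),
    GradedRun("case-2", "baseline", 3, 4),
]

with_skill = pass_rate(runs, "with_skill")  # 7 of 8 assertions
baseline = pass_rate(runs, "baseline")      # 5 of 8 assertions
delta = with_skill - baseline

print(f"with skill: {with_skill:.3f}, baseline: {baseline:.3f}, delta: {delta:+.3f}")
```

Pooling assertions across cases weights every assertion equally; averaging per-case rates instead would weight every test case equally. Which one the skill's own aggregator uses is not stated in this listing.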

Compatible agents

Best suited for agents supporting subagents and script execution (Claude Code, CLI-capable agents, environments that can run Python).

Audit Summary

Comprehensive skill-creator workflow for building, evaluating, and iterating on agent skills. Includes eval framework with A/B testing, description optimization loop, and HTML report generation. Scripts are well-structured but most fail outside the repo context due to module path assumptions (from scripts.X import Y) and missing anthropic dependency. Only utils.py runs standalone cleanly.

Watch Out

Scripts use 'from scripts.X' imports — must run from repo root
Requires anthropic Python SDK for description optimization
Requires claude CLI for run_eval.py trigger testing
YAML frontmatter parser is simple — may break on complex multiline descriptions
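
To make the frontmatter caveat concrete, the sketch below shows how a simple line-based parser can lose a multiline description. The function parse_frontmatter_naive and both sample documents are hypothetical stand-ins written for this illustration; they are not the skill's actual parser or frontmatter.

```python
def parse_frontmatter_naive(text: str) -> dict[str, str]:
    """Hypothetical line-based parser: split each line on its first colon."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter ends the frontmatter
        if ":" not in line:
            continue  # continuation lines without a colon are silently dropped
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


# Illustrative inputs only.
SINGLE_LINE = """---
name: skill-creator
description: Create and evaluate Agent Skills.
---
"""

FOLDED = """---
name: skill-creator
description: >
  Create and evaluate Agent Skills.
  Use when drafting SKILL.md.
---
"""

single = parse_frontmatter_naive(SINGLE_LINE)
folded = parse_frontmatter_naive(FOLDED)

print(single["description"])        # the full one-line description
print(repr(folded["description"]))  # only the YAML folding marker survives
```

With the folded form, this parser keeps just the ">" indicator and discards the indented text. If the real parser behaves similarly, keeping the description on a single line would avoid the problem.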

Missing Dependencies

anthropic
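
Before running the scripts, a quick preflight check along these lines may help confirm that the prerequisites named above are present. This is a sketch, not part of the skill: it assumes the SDK is importable as anthropic and that the CLI executable is named claude, and it only checks presence, not versions.

```python
import importlib.util
import shutil


def check_prerequisites() -> dict[str, bool]:
    """Report whether each assumed prerequisite can be found on this machine."""
    return {
        # find_spec returns None when a top-level package is not installed.
        "anthropic (Python package)": importlib.util.find_spec("anthropic") is not None,
        # shutil.which returns None when no matching executable is on PATH.
        "claude (CLI on PATH)": shutil.which("claude") is not None,
    }


results = check_prerequisites()
for name, found in results.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```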

Notes

Well-designed skill with clear progressive disclosure. SKILL.md is thorough with good instructions for both creating and improving skills. The eval/benchmark infrastructure is sophisticated. Main issue is scripts not designed to run independently — they assume repo-root execution context.
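
The repo-root assumption comes down to how Python resolves package imports: a statement of the form from scripts.X import Y only works when the directory that contains the scripts package is on sys.path. The sketch below demonstrates this with a throwaway package named scripts_demo; that package, its helpers module, and import_from_repo_root are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_repo_root(repo_root: Path, module_name: str):
    """Put repo_root on sys.path, then import a dotted module name from it."""
    root = str(repo_root.resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    importlib.invalidate_caches()  # make sure freshly written files are seen
    return importlib.import_module(module_name)


with tempfile.TemporaryDirectory() as tmp:
    # Build a stand-in repo layout: <tmp>/scripts_demo/helpers.py
    package_dir = Path(tmp) / "scripts_demo"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "helpers.py").write_text(
        "def greet():\n    return 'imported via repo root'\n"
    )

    helpers = import_from_repo_root(Path(tmp), "scripts_demo.helpers")
    message = helpers.greet()

print(message)
```

Running a script from the repo root, or invoking it as a module with python -m from that directory, has the same effect because Python then places the working directory on sys.path.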

Information

Repository: claude-superskills
Stars: 23

Trust Score

Overall: 84
Security: 93
Code Quality: 72
Architecture: 78
Usefulness: 72

