This skill provides a structured, interactive workflow for creating experiment.yaml configuration files for the AEC Bench tool. It guides the user through selecting tasks, choosing AI agents and models, and defining execution settings so that the resulting benchmark run is valid.
Use this skill when a user wants to start a new benchmark experiment, modify an existing configuration, or preview a planned trial with a dry run.
aec-bench.toml), task selection via datasets or disk scanning, agent configuration (including model selection from a compatibility matrix), and execution settings.
Designed for agents capable of executing shell commands and reading/writing YAML files, such as Claude Code or similar autonomous IDE agents.
This skill has not been reviewed by our automated audit pipeline yet.