
from arize-skills
Create, run, and analyze Arize experiments for evaluating and comparing model performance using the ax CLI.
Provides step-by-step CLI guidance for creating, exporting, running, and comparing Arize experiments. It explains the experiment, run, and dataset concepts, how to export datasets and collect runs, and how to run statistical comparisons and export the results for further analysis. It includes concrete ax CLI workflows and command examples for common tasks such as exporting runs, creating experiments, and piping outputs into analysis tools.
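As a rough illustration of the "compare runs" step, the sketch below computes a paired score difference between two sets of exported runs. It is not taken from the skill itself. The record shape (one JSON object per line with `example_id` and `score` fields) is a hypothetical assumption, not the ax CLI's documented export format, and the bootstrap interval is just one possible choice of statistical comparison.

```python
import json
import random
from statistics import mean


def load_scores(jsonl_text):
    """Map example_id -> score from JSON Lines text.

    ASSUMPTION: each record has "example_id" and "score" keys. This is a
    hypothetical shape, not the actual ax CLI export schema.
    """
    scores = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        scores[record["example_id"]] = float(record["score"])
    return scores


def paired_comparison(baseline, candidate, n_resamples=2000, seed=0):
    """Mean paired difference (candidate - baseline) with a 95% bootstrap interval."""
    # Only examples scored in both experiments can be paired.
    shared = sorted(set(baseline) & set(candidate))
    if not shared:
        raise ValueError("no shared example ids to compare")
    diffs = [candidate[key] - baseline[key] for key in shared]

    # Resample the paired differences with replacement and collect the means.
    rng = random.Random(seed)
    resampled = sorted(
        mean(rng.choices(diffs, k=len(diffs))) for _ in range(n_resamples)
    )
    lower = resampled[int(0.025 * n_resamples)]
    upper = resampled[int(0.975 * n_resamples) - 1]
    return {"n": len(shared), "mean_diff": mean(diffs), "ci95": (lower, upper)}


if __name__ == "__main__":
    # Tiny inline stand-ins for two exported experiments (made-up values).
    baseline_text = """
    {"example_id": "a", "score": 0.8}
    {"example_id": "b", "score": 0.5}
    {"example_id": "c", "score": 0.7}
    """
    candidate_text = """
    {"example_id": "a", "score": 0.9}
    {"example_id": "b", "score": 0.7}
    {"example_id": "c", "score": 0.7}
    """
    result = paired_comparison(load_scores(baseline_text), load_scores(candidate_text))
    print(result)
```

Pairing on a shared example id keeps the comparison restricted to inputs that both experiments actually scored. With real exports, the field names would need to be adjusted to match whatever the CLI emits.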
Use this skill when you need to evaluate model performance, run A/B model comparisons, export experiment runs for analysis, or automate experiment creation from dataset exports. Trigger when the user asks about creating experiments, exporting runs, comparing models, benchmarking, or measuring accuracy.
Best suited for agents with shell/CLI capabilities (e.g., Claude Code, Codex, Copilot/CLI-enabled agents) that have access to the ax CLI and networked model provider SDKs.
This skill has not been reviewed by our automated audit pipeline yet.