Rhesis enables agents to design, generate, and execute test suites against AI endpoints. It covers discovery (probing an endpoint's domain and behavior), structured plan creation (behaviors, test sets, metrics), test generation and execution, and result analysis — all via the Rhesis MCP server tools.
Use this skill when you need to validate or stress-test an AI model or chatbot: exploring capabilities, building reproducible test sets, running automated evaluations, or comparing test runs to detect regressions. It is appropriate for engineers and QA teams automating LLM evaluation workflows.
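As a rough illustration of what comparing two test runs to detect regressions can mean, the sketch below flags tests that passed in a baseline run but fail in a later one. It deliberately does not call the Rhesis MCP server: `RunResult`, `find_regressions`, and the sample test IDs are hypothetical names invented for this example, not part of Rhesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Hypothetical record of one test's outcome in a single run."""
    test_id: str
    passed: bool


def find_regressions(baseline, candidate):
    """Return sorted IDs of tests that passed in `baseline` but fail in `candidate`."""
    baseline_passed = {r.test_id for r in baseline if r.passed}
    candidate_failed = {r.test_id for r in candidate if not r.passed}
    return sorted(baseline_passed & candidate_failed)


# Made-up sample data standing in for two exported test runs.
baseline_run = [
    RunResult("refusal-01", True),
    RunResult("tone-02", True),
    RunResult("facts-03", False),
]
candidate_run = [
    RunResult("refusal-01", True),
    RunResult("tone-02", False),
    RunResult("facts-03", False),
]

regressions = find_regressions(baseline_run, candidate_run)
print(regressions)  # tests that went from pass to fail
```

A test that failed in both runs (here `facts-03`) is not reported, since it is an existing failure rather than a regression.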
The references/ directory provides strategies and analysis guidance to shape test generation and interpretation.
It works best with agents that can interact with MCP servers and handle async tasks (Claude Code, agents using MCP tooling, or other agent runtimes that can call platform tools).
This skill has not been reviewed by our automated audit pipeline yet.