SWARM Safety Skill
SWARM is a research and simulation framework for studying emergent risks in multi-agent AI systems. It focuses on soft (probabilistic) labels rather than binary judgments, and provides agents, scenarios, metrics, and governance mechanisms to explore behaviors such as collusion and deception and to test policy interventions.
What it helps with
- Running local simulations of heterogeneous agent populations (honest, opportunistic, deceptive, adversarial, LLM-driven)
- Measuring soft metrics: toxicity rate, quality gap, conditional loss, incoherence
- Experimenting with governance levers: transaction taxes, reputation decay, circuit breakers, audits, staking, collusion detection
- Exporting results for analysis (JSON/CSV) and running reproducible scenarios
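The soft metrics above can be illustrated with a small sketch. These helper functions are hypothetical examples of what soft-label metrics compute, not SWARM's actual API: each label is a probability in [0, 1] rather than a binary verdict.

```python
# Hypothetical illustration of soft-label metrics; not SWARM's real API.

def toxicity_rate(scores):
    """Mean of per-interaction toxicity probabilities (soft labels in [0, 1])."""
    return sum(scores) / len(scores)

def quality_gap(honest_scores, adversarial_scores):
    """Difference in mean quality between honest and adversarial agent groups."""
    return (sum(honest_scores) / len(honest_scores)
            - sum(adversarial_scores) / len(adversarial_scores))

# Soft labels instead of binary judgments: each value is a probability.
tox = toxicity_rate([0.1, 0.2, 0.05, 0.9])
gap = quality_gap([0.8, 0.9, 0.85], [0.4, 0.5, 0.45])
```

Because the labels stay probabilistic, a rare-but-severe event (the 0.9 above) raises the rate smoothly instead of flipping a binary flag.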
Example usage
- Install via pip (swarm-safety) or run from source
- Use provided CLI to list and run scenarios, or the Python API to programmatically configure orchestration and agents
- Start a local API for experiment management (binds to localhost by default)
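A reproducible scenario can be expressed as a plain configuration file. The schema below is purely illustrative (the keys and values are assumptions, not SWARM's actual format); it shows the idea of pinning a seed and an agent mix so a run can be exported and replayed.

```python
# Hypothetical scenario definition exported as JSON for reproducibility.
# The schema (keys, agent types, governance levers) is illustrative only.
import json

scenario = {
    "name": "collusion-baseline",
    "seed": 42,                                   # fixed seed for reproducible runs
    "agents": {"honest": 8, "opportunistic": 2, "deceptive": 1},
    "governance": {"transaction_tax": 0.02, "audits": True},
}

with open("scenario.json", "w") as f:
    json.dump(scenario, f, indent=2)
```

Keeping the scenario in JSON also matches the JSON/CSV export path mentioned above, so configs and results can live side by side.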
Security notes
- The API binds to 127.0.0.1 by default; do not expose it to untrusted networks without adding authentication and hardening the database and surrounding infrastructure. Do not include real API keys or PII in scenarios.
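The loopback-only default can be sketched with the standard library; this is illustrative, not SWARM's actual server code. Binding to 127.0.0.1 (rather than 0.0.0.0) makes the server unreachable from other machines.

```python
# Illustrative only: binding a local experiment API to the loopback interface.
from http.server import HTTPServer, BaseHTTPRequestHandler

class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

# Loopback only, not 0.0.0.0; port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), Health)
# server.serve_forever()  # start explicitly when needed
```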
Not yet audited
This skill has not yet been reviewed by our automated audit pipeline.