
from ISC-Bench775
Benchmark and pipeline for evaluating model safety (Internal Safety Collapse) using the TVD framework; builds prompts, runs models, extracts harmful content, an
ISC-Bench is a reproducible pipeline for assessing Internal Safety Collapse (ISC) in large language models. It provides scripts to build benchmark prompts, run them against target models via OpenRouter, extract harmful content from model outputs, and score results on a 1–5 harmfulness scale. The repo supports multiple benchmarks (JailbreakBench, HarmBench, AdvBench, StrongREJECT) and agent modes for automated evaluation.
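As an illustration of the final scoring step, the sketch below aggregates 1–5 harmfulness scores per model. It is a minimal example and not the repository's own code: the record layout (`model` and `score` keys), the function name `summarize_scores`, and the default threshold of 4 are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_scores(records, threshold=4):
    """Group 1-5 harmfulness scores by model.

    `records` is a hypothetical list of dicts with "model" and "score" keys;
    the real pipeline's output format may differ.
    """
    by_model = defaultdict(list)
    for rec in records:
        score = int(rec["score"])
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range 1-5: {score}")
        by_model[rec["model"]].append(score)

    # For each model: sample count, mean score, and the share of
    # responses scored at or above the (assumed) threshold.
    return {
        model: {
            "n": len(scores),
            "mean": mean(scores),
            "rate_at_or_above": sum(s >= threshold for s in scores) / len(scores),
        }
        for model, scores in by_model.items()
    }
```

Reporting both the mean and a thresholded rate keeps a few high-scoring outputs from being hidden by an otherwise low average.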
Use ISC-Bench when you need to quantitatively evaluate model safety, compare models across standard benchmarks, measure jailbreak/vulnerability rates, or run controlled experiments (zero-shot, few-shot, ICL). Appropriate for safety researchers, red-teamers, and engineering teams validating model behaviour.
experiment/ folders).
uv run usage, Docker agent mode, and troubleshooting tips.
Works with CLI-capable agents and evaluation tooling that can run Python/Docker workflows; integrates with OpenRouter-accessible models and typical LLM runners.
ISC-Bench is a safety benchmarking skill for evaluating LLM vulnerability to Internal Safety Collapse. The Python script is well-structured, with concurrent execution and progress reporting, but it failed to run because the 'openai' dependency was missing. The shell script requires a template directory argument. The SKILL.md body was empty when fetched, which limited the assessment.
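A failure like the missing 'openai' dependency can be caught before any run starts. The sketch below is one possible preflight check using only the standard library; the helper name `missing_modules` is hypothetical and not part of ISC-Bench, and it is meant for top-level module names only.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # 'openai' is the dependency reported missing in the review above.
    missing = missing_modules(["openai"])
    if missing:
        raise SystemExit(f"Missing dependencies: {', '.join(missing)}")
```

Checking with `find_spec` avoids importing the package just to test for its presence.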
openai
Legitimate AI safety research tool. No hardcoded credentials, no destructive commands, no exfiltration. Shell script has minor style issues with inline Python but is functional. Niche audience: AI safety researchers and red-teamers.