Improving instruction hierarchy in frontier LLMs
OpenAI introduces IH-Challenge — a training approach that teaches models to correctly prioritise instructions from trusted sources over injected or untrusted ones. Improves resistance to prompt injection and makes models more steerable by operators. Research with direct implications for building safer agentic systems.

