Unrolling the Codex agent loop
A technical breakdown of the Codex agent loop, detailing how the Codex CLI manages models and prompts via the Responses API. Essential reading for those building agentic CLI tools.
The best of the AI and MCP ecosystem, curated daily.
Sources

The MCP Core Maintainer team announces changes for 2026: Inna Harper and Basil Hosmer are stepping back, with new maintainers joining to keep the project moving. Also covers what's ahead for MCP governance as the protocol moves under the Agentic AI Foundation. Good context for anyone tracking MCP's organisational trajectory.
OpenAI shares its architecture for scaling PostgreSQL to handle millions of queries per second. The system combines replicas, aggressive caching, and workload isolation to maintain performance for 800 million users.

OpenAI explains how to build a systematic eval harness for agent skills — turning skills into testable, scoreable artefacts you can improve over time. The post covers designing eval cases, running them via the Codex evals framework, and iterating on skill quality based on results. A must-read for anyone shipping agent skills to production.

Anthropic shares what they learned from three iterations of a performance engineering take-home assessment that Claude kept solving correctly — forcing them to redesign it to remain useful for hiring. The post is a practical guide to crafting technical evals that test genuine human reasoning rather than tasks AI can trivially handle. Useful for any team still using take-home coding challenges.
Hugging Face marks one year since DeepSeek-R1 shocked the AI world by matching frontier performance at a fraction of the cost, fundamentally changing assumptions about what open-source models can achieve. The post reflects on how that moment catalysed a wave of open-weight model development and reshaped the competitive landscape between open and closed AI.
OpenAI launches ChatGPT Go, providing worldwide access to the GPT-5.2 Instant model. It features higher usage limits and expanded memory to make advanced AI more accessible and affordable globally.
Hugging Face launched Open Responses — an open-source implementation of the Responses API spec, letting developers use the same agent-friendly API pattern against any open-weight model instead of being locked to OpenAI's hosted version. Compatible with OpenAI's SDK, making migration straightforward. Important infrastructure for the open-source agent ecosystem.

Skyscanner integrated Codex CLI with JetBrains IDEs via MCP, enabling in-IDE AI assistance for debugging, test generation, and development workflows without switching to a separate tool. The post details their MCP server setup, the JetBrains plugin integration, and productivity gains observed across the engineering team. A solid real-world MCP adoption case study.

A practical breakdown of how to design, run, and interpret evaluations specifically for agentic systems — covering trajectory evals, outcome evals, LLM-as-judge approaches, and the unique challenges agents pose vs. static model benchmarks. Anthropic draws on their experience evaluating Claude Code and research agents. One of the most useful practical guides on agent evals published to date.
OpenAI introduces enterprise-grade AI for healthcare, focusing on HIPAA compliance and reducing administrative burden in clinical workflows.
A showcase of a voice-first AI companion leveraging GPT-5.1. It focuses on low-latency responses and memory-driven personalities for natural, real-time conversations.
OpenAI launches ChatGPT Health, a secure experience that integrates personal health data and apps. The feature emphasizes physician-informed design and strict privacy protections.

OpenAI's year-end retrospective for developers covering the biggest model, API, and platform milestones of 2025 — from the Responses API launch to Codex, GPT-5.4, Realtime API, and the shift toward production-grade agentic systems. A useful reference for understanding where the platform stands heading into 2026.

ServiceNow AI presents AprielGuard, a new safety guardrail designed to improve adversarial robustness in LLM systems. The framework focuses on preventing malicious prompts and ensuring safe model outputs in production environments.

OpenAI released new audio model snapshots and expanded Custom Voices access to production voice apps. The updates improve speech quality and latency in the Realtime API and make it easier to deploy branded voices in customer-facing products. Key update for teams building voice agents or voice-first interfaces.
OpenAI uses automated red teaming and reinforcement learning to proactively identify and patch prompt injection vulnerabilities in ChatGPT Atlas. This is critical for the stability and security of AI agents that interact with browser environments.

The MCP Transport Working Group outlines plans to evolve beyond Streamable HTTP for enterprise-scale remote deployments — addressing the stateful connection bottleneck that makes load balancing and managed services difficult. Proposes a direction that preserves backward compatibility while enabling stateless, horizontally-scalable MCP server deployments. Critical reading for teams deploying MCP at scale.
OpenAI presents a framework for monitoring internal chain-of-thought reasoning in AI models. The research indicates that observing internal reasoning is more effective for scalable control than monitoring final outputs alone.
OpenAI released a system card addendum for GPT-5.2-Codex, detailing the technical specifications and safety evaluations for the coding-specialized model variant.