AI News — March 26, 2026

The latest from the AI and MCP ecosystem, curated daily.

LangChain shares their internal methodology for building evals for Deep Agents — how they source data, design metrics, and run targeted experiments over time to make agents more accurate and reliable. The key principle: the best evals directly measure a specific agent behaviour you care about, not proxies. Practical guidance backed by LangChain's own production experience.

Today's stories:

How We Build Evals for Deep Agents (LangChain Blog) — LangChain shares their internal methodology for building evals for Deep Agents — how they source data, design metrics, and run targeted experiments over time to make agents more accurate and reliable. The key principle: the best evals directly measure a specific agent behaviour you care about, not proxies. Practical guidance backed by LangChain's own production experience.
How Middleware Lets You Customize Your Agent Harness (LangChain Blog) — LangChain introduces Agent Middleware — a pattern for customising agent harnesses by injecting logic between the LLM and its environment without modifying core harness code. Covers the design pattern, use cases (logging, auth, rate limiting, retry logic), and how it enables teams to build application-specific harnesses on top of LangChain's Deep Agents framework.
Improving Composer Through Real-Time RL (Cursor Blog) — Cursor applies online reinforcement learning to Composer — serving new model checkpoints to production and using real user interactions as reward signals to ship improved checkpoints multiple times per day. A practical account of how they close the loop between production usage and model training at high cadence. Fascinating engineering for anyone building RL-trained coding models.
Build real-time conversational agents with Gemini 3.1 Flash Live (Google AI Blog) — Google released Gemini 3.1 Flash Live — a low-latency model variant optimised for real-time conversational agent use cases, with sub-500ms response times for voice and interactive chat. Covers the architecture, streaming API, and how to build agents that feel instant rather than waiting on generation. Key for anyone building voice or real-time agent interfaces.

LangChain Blog, Mar 26, 2026

How We Build Evals for Deep Agents

agents, evals, langchain, developer-tools

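To make the principle from this story concrete (an eval should measure one specific agent behaviour directly rather than a proxy), here is a minimal sketch. It is not LangChain's method or API: the `Step` trace format, the tool names `read_file` and `edit_file`, and the chosen behaviour ("the agent reads a file before editing it") are all hypothetical, picked only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One step in a hypothetical agent trace (assumed format)."""
    kind: str  # "tool_call" or "final_answer"
    name: str  # tool name, or "" for the final answer


def read_before_edit(trace: List[Step]) -> bool:
    """Targeted check: no 'edit_file' call happens before a 'read_file' call."""
    has_read = False
    for step in trace:
        if step.kind != "tool_call":
            continue
        if step.name == "read_file":
            has_read = True
        elif step.name == "edit_file" and not has_read:
            return False
    return True


def pass_rate(traces: List[List[Step]]) -> float:
    """Fraction of traces that show the target behaviour (0.0 if none given)."""
    if not traces:
        return 0.0
    return sum(read_before_edit(t) for t in traces) / len(traces)


# Two hand-written example traces (illustrative data, not real agent output).
good = [
    Step("tool_call", "read_file"),
    Step("tool_call", "edit_file"),
    Step("final_answer", ""),
]
bad = [
    Step("tool_call", "edit_file"),
    Step("final_answer", ""),
]

rate = pass_rate([good, bad])
```

The check inspects the trace for the behaviour itself instead of scoring the final answer, so a regression in that one behaviour shows up even when overall task success is unchanged.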

Google AI Blog, Mar 26, 2026

Build real-time conversational agents with Gemini 3.1 Flash Live

Google launched Gemini 3.1 Flash Live and a Live API in Google AI Studio to power real-time voice and vision agents. This is significant for developers building conversational agents with low-latency audio/video streams and multi-modal state, enabling new live agent use cases like voice assistants and live vision-based workflows.

gemini, realtime, agents, voice


LangChain Blog, Mar 26, 2026

How Middleware Lets You Customize Your Agent Harness

LangChain introduces Agent Middleware — a pattern for customising agent harnesses by injecting logic between the LLM and its environment without modifying core harness code. Covers the design pattern, use cases (logging, auth, rate limiting, retry logic), and how it enables teams to build application-specific harnesses on top of LangChain's Deep Agents framework.

agents, langchain, agent-harness, developer-tools

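The middleware idea summarised above (wrapping logic around the model step without editing the core harness) can be sketched generically. This is not LangChain's Agent Middleware API: every name here (`Handler`, `fake_model_call`, `logging_middleware`, `retry_middleware`, `build_harness`) is a hypothetical stand-in, and the "model" is a plain function that echoes its prompt.

```python
from typing import Callable, Dict, List

# A handler maps a request dict to a response dict (assumed shape).
Handler = Callable[[Dict], Dict]
# A middleware takes one handler and returns a wrapped handler.
Middleware = Callable[[Handler], Handler]


def fake_model_call(request: Dict) -> Dict:
    """Stand-in for the core harness step; not a real LLM call."""
    return {"output": f"echo: {request['prompt']}"}


def logging_middleware(log: List[str]) -> Middleware:
    """Record each prompt and output without touching the core handler."""
    def wrap(next_handler: Handler) -> Handler:
        def handler(request: Dict) -> Dict:
            log.append(f"request: {request['prompt']}")
            response = next_handler(request)
            log.append(f"response: {response['output']}")
            return response
        return handler
    return wrap


def retry_middleware(max_attempts: int) -> Middleware:
    """Retry the wrapped handler when it raises RuntimeError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def wrap(next_handler: Handler) -> Handler:
        def handler(request: Dict) -> Dict:
            last_error = None
            for _ in range(max_attempts):
                try:
                    return next_handler(request)
                except RuntimeError as exc:
                    last_error = exc
            # All attempts failed: surface the last error to the caller.
            raise last_error
        return handler
    return wrap


def build_harness(core: Handler, middlewares: List[Middleware]) -> Handler:
    """Compose middlewares so the first in the list is the outermost layer."""
    handler = core
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


log: List[str] = []
harness = build_harness(
    fake_model_call,
    [logging_middleware(log), retry_middleware(3)],
)
result = harness({"prompt": "hello"})
```

Because each layer only sees a handler and returns a handler, concerns such as auth or rate limiting could be added or reordered by changing the list passed to `build_harness`, leaving `fake_model_call` untouched.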

Cursor Blog, Mar 26, 2026

Improving Composer Through Real-Time RL

Cursor applies online reinforcement learning to Composer — serving new model checkpoints to production and using real user interactions as reward signals to ship improved checkpoints multiple times per day. A practical account of how they close the loop between production usage and model training at high cadence. Fascinating engineering for anyone building RL-trained coding models.

cursor, research, reinforcement-learning, coding

