
Introducing
Cursor introduces Organizations, allowing enterprises to manage multiple teams with centralized control over budgets, security, and governance. This simplifies scaling AI-powered development across large companies.
The latest from the AI and MCP ecosystem, curated daily.
Sources

Anthropic shares strategies for using Claude to drive self-service data insights. The post highlights practical approaches to maximizing the model's analytical capabilities for data-driven decision making.

A deep dive into how Anthropic built and scaled hundreds of internal skills for Claude Code. It provides valuable lessons on agentic tooling and the architecture of specialized capabilities.
A practical implementation of the Model Context Protocol (MCP) to add tools to the Reachy Mini robot. This showcases how MCP can bridge LLMs with physical robotic hardware.

Anthropic provides a guide on effectively using Claude Cowork compared to Claude Code or standard chat. It focuses on workflow delegation and concrete steps for integrating agentic collaboration into development.
H Company introduces Holo3.1, focusing on high-performance local computer-use agents. This release emphasizes reduced latency and improved local execution for agentic UI interactions.
OpenAI reports on how Codex is evolving into a broad productivity tool for knowledge work, automating research and data analysis workflows. This signals a shift toward more integrated AI-driven professional productivity.

Claude Code now supports dynamic multi-agent harnesses, allowing it to orchestrate complex workflows on the fly. This enables more robust task execution through specialized, temporary agent structures.

JetBrains has released Mellum2, a 12B parameter Mixture-of-Experts (MoE) model. The model is designed for high-performance AI tasks, continuing JetBrains' push into specialized developer-centric AI models.

IBM Research explores the necessity of 'agent logic' for scaling AI in enterprises. The piece argues that moving beyond simple LLM prompts to structured agentic workflows is key to reliable AI adoption.
OpenAI frontier models and Codex are now generally available on AWS. This integration allows enterprises to leverage OpenAI's capabilities within their existing AWS environments and procurement workflows.

NVIDIA has released Cosmos 3, an open omni-model designed for physical AI reasoning and action. This model aims to bridge the gap between digital intelligence and physical world interaction, providing a foundation for robotics and autonomous systems.
Braintrust engineers are using Codex and GPT-5.5 to automate the path from customer requests to production code. This demonstrates an advanced agentic workflow for accelerating experimental development and deployment cycles.
A comprehensive beginner's guide to using torch.profiler for PyTorch performance analysis. Essential for developers looking to identify bottlenecks and optimize model execution speed.
Endava leverages OpenAI Codex to transition toward an agentic organization, significantly reducing requirements analysis time from weeks to hours. This case study highlights the practical impact of AI agents on accelerating enterprise software delivery.

Claude Code now supports dynamic workflows, enabling the execution of dozens to hundreds of parallel subagents for complex tasks. The feature includes built-in verification steps to check accuracy before results are delivered.
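The general pattern behind this item, fanning work out to many parallel workers and verifying each result before returning it, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Claude Code's actual API: `run_subagent`, `verify`, and `run_workflow` are hypothetical names, and the "subagent" here is a stub that only formats a string.

```python
import asyncio


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for a real subagent call (assumption, not a real API)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"done: {task}" if task.strip() else ""


def verify(result: str) -> bool:
    """Hypothetical verification step: accept only well-formed results."""
    return result.startswith("done: ")


async def run_workflow(tasks, max_parallel=8):
    """Run all tasks concurrently, capped at max_parallel, and verify each result."""
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(task):
        async with limit:
            result = await run_subagent(task)
        return task, result, verify(result)

    outcomes = await asyncio.gather(*(guarded(t) for t in tasks))
    accepted = {task: result for task, result, ok in outcomes if ok}
    rejected = [task for task, _, ok in outcomes if not ok]
    return accepted, rejected


if __name__ == "__main__":
    accepted, rejected = asyncio.run(run_workflow(["lint", "", "test"]))
    print(accepted, rejected)
```

The semaphore keeps the number of in-flight workers bounded, and only results that pass `verify` end up in the accepted set, so a failed worker is reported rather than silently merged.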

A new benchmark, ITBench-AA, reveals that even frontier models struggle with enterprise IT agentic tasks, with most scoring below 50%. This highlights a significant performance gap in deploying reliable AI agents for complex IT infrastructure management.
An exploration of building self-improving agents with Codex to automate complex tax filings. It demonstrates a practical pattern for improving accuracy and accelerating workflows in specialized agentic domains.

Anthropic details the architectural patterns used to sandbox and contain Claude's execution as agentic capabilities scale. Essential reading for developers building autonomous agents with a focus on safety and blast-radius limitation.

CodeRabbit implements a structured planning layer on top of Claude to orchestrate coding agents. This approach allows teams to review and refine a coding plan before any actual code is generated, increasing reliability in complex agentic workflows.
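A plan-before-code gate of the kind described in this item can be illustrated with a small sketch. This is not CodeRabbit's implementation: `Plan`, `review`, and `generate_code` are hypothetical names, and the "generation" step only emits placeholder comments. The point is the ordering constraint, which is that nothing is generated until a reviewed plan exists.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """A proposed list of coding steps plus an approval flag (hypothetical type)."""
    steps: list
    approved: bool = False


def review(plan: Plan, edits: dict) -> Plan:
    """Apply reviewer edits (step index -> new text) and mark the plan approved."""
    steps = [edits.get(i, step) for i, step in enumerate(plan.steps)]
    return Plan(steps=steps, approved=True)


def generate_code(plan: Plan) -> list:
    """Refuse to run on an unapproved plan; otherwise emit one stub per step."""
    if not plan.approved:
        raise PermissionError("plan must be approved before code generation")
    return [f"# TODO: {step}" for step in plan.steps]


if __name__ == "__main__":
    draft = Plan(steps=["add endpoint", "write tests"])
    reviewed = review(draft, {1: "write integration tests"})
    print(generate_code(reviewed))
```

Returning a new `Plan` from `review` rather than mutating the draft keeps the unreviewed version available for comparison.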