
Bringing the latest Gemini models to Apple developers
Google brings Gemini models to Apple developers via the Foundation Models framework and Xcode integration. This allows for more secure, cloud-hosted model calls within the Apple ecosystem.
Sources

Google releases Quantization-Aware Training (QAT) checkpoints for Gemma 4. These optimizations reduce memory overhead and significantly improve performance for on-device deployment on laptops and mobile devices.
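As a rough illustration of what quantization-aware training simulates, the sketch below applies symmetric "fake quantization" (quantize, then dequantize) to a list of weights and estimates weight-only memory at a given bit width. The function names, the per-tensor scaling scheme, and the 4-bit default are illustrative assumptions; the announcement does not describe how the Gemma 4 checkpoints were actually quantized.

```python
def fake_quantize(weights, bits=4):
    """Symmetric per-tensor quantize-then-dequantize (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for signed 4-bit values
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)                # nothing to scale
    scale = max_abs / qmax                  # one scale shared by the whole tensor
    out = []
    for w in weights:
        q = round(w / scale)                # snap to the nearest integer level
        q = max(-qmax, min(qmax, q))        # clamp to the representable range
        out.append(q * scale)               # map back to floating point
    return out


def weight_memory_gib(num_params, bits):
    """Weight-only memory estimate in GiB (ignores activations and caches)."""
    return num_params * bits / 8 / 2**30
```

During QAT the forward pass sees the rounded weights, so the model learns to tolerate the rounding error before it is deployed at low precision. For a hypothetical 12-billion-parameter model, `weight_memory_gib` gives about 22.4 GiB for the weights at 16 bits and about 5.6 GiB at 4 bits.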

Google has launched local development capabilities for Kaggle Benchmarks, allowing developers to create and test AI benchmarks more efficiently. This streamlines the evaluation process for AI models on local machines.

Google releases Gemma 4 12B, a high-performance multimodal model designed for local execution on laptops. It features a unified, encoder-free architecture to bring advanced intelligence to the edge.

Google I/O 2026 highlights new tools for building agentic applications, including updates to Google Antigravity and an enhanced Gemini API. The focus is on reducing the friction between prompt engineering and production-ready apps.

Google AI Studio introduces native Android vibe coding support and new Google Workspace integrations. These updates aim to accelerate the transition from prompt to production for AI developers.

Google introduces managed agents for the Gemini API, allowing developers to define agents as files and execute them within secure cloud sandboxes. This streamlines deployment and provides a controlled environment for agentic workflows.

Google updates the Gemini API File Search tool to support multimodal retrieval. This allows developers to build more efficient and verifiable RAG systems by indexing and searching across diverse file types.

Google introduces Multi-Token Prediction (MTP) drafters for Gemma 4, achieving up to 3x faster inference speeds. This architectural improvement significantly reduces latency for real-time AI applications.
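The blurb does not describe the mechanism, but drafters are commonly used in a draft-and-verify (speculative decoding) loop: a cheap model proposes several tokens and the full model checks them. The sketch below is a toy greedy version under that assumption. `target_next`, `draft_next`, and the integer "tokens" are hypothetical stand-ins, not Gemma 4 components.

```python
def speculative_decode(target_next, draft_next, prompt, num_tokens, k=4):
    """Greedy draft-and-verify loop.

    target_next / draft_next map a token list to the next token.
    Returns (generated tokens, number of verification rounds).
    """
    tokens = list(prompt)
    rounds = 0
    while len(tokens) - len(prompt) < num_tokens:
        # 1. The drafter cheaply proposes k tokens.
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. The target checks them; a real system does this in one batched pass.
        rounds += 1
        accepted = []
        for tok in draft:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)   # keep the target's token and stop
                break
        else:
            # Every draft token matched, so the target adds one bonus token.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[len(prompt):len(prompt) + num_tokens], rounds


# Toy models: the target counts upward modulo 10; the drafter errs after a 5.
def toy_target(ctx):
    return (ctx[-1] + 1) % 10


def toy_draft(ctx):
    return 0 if ctx[-1] == 5 else (ctx[-1] + 1) % 10


generated, rounds = speculative_decode(toy_target, toy_draft, [0], num_tokens=12)
```

Because every accepted token is one the target would have chosen itself, the output matches plain greedy decoding with the target alone; the saving comes from needing fewer verification rounds than generated tokens when the drafter is usually right.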

Google introduces event-driven webhooks for the Gemini API, allowing developers to move from inefficient polling to a push-based notification system. This significantly reduces latency and friction for managing long-running AI jobs.
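To make the push model concrete, here is a minimal sketch of the receiving side: verify that a notification is authentic, then react to a finished job instead of polling for it. The HMAC-SHA256 signing scheme, the shared secret, and the `job_id` and `state` fields are hypothetical choices for illustration, not the Gemini API's documented webhook format.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw request body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature_hex)


def handle_webhook(secret: bytes, body: bytes, signature_hex: str, on_done) -> int:
    """Return an HTTP-style status code; call on_done(job_id) for finished jobs."""
    if not verify_signature(secret, body, signature_hex):
        return 401                          # reject unsigned or tampered payloads
    event = json.loads(body)
    if event.get("state") == "done":        # hypothetical field and value
        on_done(event["job_id"])            # hypothetical field
    return 200
```

A real endpoint would sit behind an HTTP server and should also handle retries and duplicate deliveries; the point here is only that the client does work when an event arrives rather than on a timer.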

Google and Kaggle are launching a 5-day AI Agents intensive course focused on 'vibe coding'. This is a practical resource for developers looking to build and deploy agentic workflows.

Google introduces Deep Research and Deep Research Max, a new generation of autonomous research agents designed to handle complex information retrieval and synthesis.

Google AI Pro and Ultra subscribers now get increased usage limits in Google AI Studio. This update broadens access to high-tier models and experimental capabilities for developers looking to build AI-integrated applications.

Google AI Studio now supports prepay billing for the Gemini API, allowing developers more precise control over their spending and budget management.

Google Colab added Learn Mode, a set of Gemini-powered features that turn Colab into an interactive coding tutor. It gives developers step-by-step guidance, inline explanations, and targeted exercises inside notebooks, which is useful for onboarding, teaching, and iterative debugging. This reduces friction for experimenting with Gemini-powered workflows and makes Colab a stronger learning environment for developers adopting Gemini models.

Google introduced two new inference tiers for the Gemini API, Flex and Priority, to let developers trade off cost and latency at a finer granularity. Flex is lower-cost with relaxed latency SLAs for background workloads, while Priority offers lower latency for interactive use cases; both aim to reduce wasted spending and make inference choices explicit. This is useful for teams optimizing agent responsiveness and budgets across mixed workloads.
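One way a team might make that choice explicit in code is a small routing helper that decides the tier per request. The sketch below is hypothetical: the `"flex"` and `"priority"` strings are labels for this example rather than confirmed API parameter values, and the 5-second threshold is arbitrary.

```python
from typing import Optional


def pick_tier(interactive: bool,
              deadline_seconds: Optional[float] = None,
              tight_deadline_seconds: float = 5.0) -> str:
    """Choose a tier label for a request (illustrative policy only)."""
    if interactive:
        return "priority"                   # a user is waiting on the response
    if deadline_seconds is not None and deadline_seconds <= tight_deadline_seconds:
        return "priority"                   # background job with a tight deadline
    return "flex"                           # everything else can tolerate delay
```

Keeping the policy in one function means the cost and latency trade-off is reviewed in a single place instead of being scattered across call sites.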

Google released Gemma 4, its most capable open model family yet, with multimodal understanding and strong performance across reasoning, coding, and instruction-following tasks. Designed to run on-device and at edge scale, Gemma 4 closes the gap with frontier closed models while remaining fully open-weight. It is a significant update to Google's most widely used open model series.

Google released two developer tools to reduce stale code generation from agents: a Gemini API Docs MCP and complementary Agent Skills that surface up-to-date API docs at runtime. These tools help agents produce current SDK usage and avoid hallucinated or outdated snippets by integrating authoritative API references into the agent stack. For teams deploying coding agents, this reduces developer friction and increases trust in generated code.

Google released Veo 3.1 Lite, a cost-optimized video generation model available in paid preview via the Gemini API and Google AI Studio. For developers, this lowers the barrier to experimenting with programmatic video generation and integrating media workflows into apps without the full compute cost of larger models.

Google launched Gemini 3.1 Flash Live and a Live API in Google AI Studio to power real-time voice and vision agents. This is significant for developers building conversational agents with low-latency audio/video streams and multimodal state, enabling new live agent use cases like voice assistants and live vision-based workflows.