Google I/O 2026: Gemini 3.5 Flash Debuts, AI Search Era Begins and More
While there was no mention of the upcoming Googlebook, it is clear Google wants to make sure everyone, including its competition, is aware of how much it is focused on creating an effortless flow across devices when working with AI.
A $180–190 Billion Bet on Custom Silicon
Before getting to the products, the scale matters. Google has committed to spending between $180 billion and $190 billion on infrastructure this year, roughly six times its 2022 capital expenditure figure. A significant portion of that spending flows into Google's custom Tensor Processing Unit program, now entering its eighth generation exactly a decade after the original TPU debuted.
Two Chips, Two Jobs: TPU 8t and TPU 8i
For the first time, Google is splitting its TPU architecture into two purpose-built chips rather than a single unified design.
The TPU 8t is optimized for large-scale pretraining and delivers nearly three times the raw computing power of the previous generation. Google now claims it can distribute training workloads across more than one million TPUs globally, scaling beyond the limits of any single data center. The TPU 8i, meanwhile, is designed purely for inference: the high-volume, latency-sensitive workload that consumer-facing AI products generate continuously. Both chips deliver up to twice the performance per watt of prior-generation silicon.
Gemini 3.5 Flash: Speed Meets Frontier Intelligence
On the model side, Google's headline announcement is Gemini 3.5 Flash, the company's first model designed to combine frontier-level capability with low-latency execution. According to Google, it outperforms the older Gemini 3.1 Pro across virtually all internal benchmarks, with a particularly significant jump in complex coding tasks. The company says 3.5 Flash delivers four times the output token generation speed of competing frontier models.
Google CEO Sundar Pichai revealed the company is now processing more than 3.2 quadrillion tokens per month, up from 480 trillion tokens per month at I/O 2025, a roughly 6.7-fold increase.
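As a quick check on those figures, the short Python sketch below works out the year-over-year growth factor and a rough average throughput. The two token counts come from the numbers quoted above; the 30-day month and the variable and function names are assumptions made purely for illustration, not anything Google has published.

```python
# Monthly token volumes as quoted in the article (tokens per month).
TOKENS_IO_2025 = 480 * 10**12    # 480 trillion
TOKENS_IO_2026 = 3_200 * 10**12  # 3.2 quadrillion

# Assumption for illustration only: treat a month as 30 days.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def growth_factor(old: int, new: int) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


factor = growth_factor(TOKENS_IO_2025, TOKENS_IO_2026)
tokens_per_second = TOKENS_IO_2026 / SECONDS_PER_MONTH

print(f"Growth since I/O 2025: {factor:.1f}x")
print(f"Average rate: {tokens_per_second:.2e} tokens per second")
```

Under the 30-day assumption, the 2026 figure averages out to a little over a billion tokens per second. Real traffic would fluctuate well above and below that average.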
Antigravity 2.0: The Developer Control Hub
Google also updated Antigravity, its agentic development platform, to version 2.0. Antigravity features two primary views: an Editor view that functions like a familiar IDE interface with an agent sidebar, and a Manager view that serves as a control center for orchestrating multiple agents working in parallel across workspaces. Within this environment, Google claims a hyper-optimized version of Gemini 3.5 Flash can reach significantly faster generation speeds than standard frontier model deployments.
Search, YouTube, and the Web Get Permanent AI Agents
The compute horsepower is being funneled directly into Google's core properties. Google said it is placing AI agents directly inside the Search box, capable of handling tasks such as completing purchases, checking ticket availability, and managing schedules in real time. Search will also support longer, more natural language queries.
Google is also deploying persistent "information agents" that run continuously on Google Cloud virtual machines, tracking user objectives and building personalized, interactive dashboards, a significant departure from the traditional query-and-result model.
Other notable integrations announced at I/O include:
- Ask YouTube: A coming update that allows users to query complex concepts across the video platform, with the ability to automatically navigate timelines and jump to the most contextually relevant moment.
- Gemini Omni Flash: A native multimodal architecture that accepts mixed-media inputs and generates text, audio, and video output simultaneously.
- SynthID Expansion: Google's invisible digital watermarking system is expanding its partner ecosystem. OpenAI, Nvidia, ElevenLabs, and Kakao have agreed to integrate SynthID, establishing a cross-industry standard for AI-generated content verification across Chrome and Search.