Announced at Google Cloud Next on April 22, Google's eighth-generation TPUs mark the first time the company has split its flagship accelerator into two specialized chips. The TPU 8t targets model training, scaling to 9,600-chip superpods with 2 petabytes of shared HBM and 121 ExaFLOPS of compute; the TPU 8i prioritizes low-latency inference, pairing 384 MB of on-chip SRAM (3× the prior generation) with optimizations for serving agentic workloads. Google claims 80% better price-performance for the 8i and up to a 2.8× gain for the 8t versus the prior Ironwood generation. Both chips are slated for availability later in 2026 through Google's AI Hypercomputer.
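For context, the pod-level figures imply rough per-chip numbers. This is a back-of-envelope sketch, not from the announcement, and it assumes decimal units and an even split across the pod:

```python
# Per-chip estimates derived from the announced pod-level specs.
# Assumptions (not stated by Google): decimal units, uniform split.
POD_CHIPS = 9600
POD_HBM_PB = 2          # shared HBM across the superpod, in petabytes
POD_EXAFLOPS = 121      # aggregate training compute

hbm_gb_per_chip = POD_HBM_PB * 1e6 / POD_CHIPS      # PB -> GB
pflops_per_chip = POD_EXAFLOPS * 1e3 / POD_CHIPS    # ExaFLOPS -> PFLOPS

print(f"HBM per chip:  ~{hbm_gb_per_chip:.0f} GB")   # ~208 GB
print(f"Compute/chip:  ~{pflops_per_chip:.1f} PFLOPS")  # ~12.6 PFLOPS
```

That works out to roughly 208 GB of HBM and about 12.6 PFLOPS per chip, if the pod figures divide evenly.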