Google Unveils Eighth-Generation TPUs: TPU 8t for Training and TPU 8i for Inference
Google has introduced its eighth generation of custom Tensor Processing Units (TPUs), designed to address the diverging demands of AI model training and inference. The new chips, named TPU 8t and TPU 8i, represent a strategic shift toward workload-specific hardware within Google's AI Hypercomputer platform.

According to Google Cloud’s official announcement, TPU 8t is engineered for frontier-model training, capable of handling the most complex models on a single, massive pool of memory. TPU 8i, conversely, is optimized for large-scale inference and reinforcement learning, enabling AI agents to complete multi-step workflows quickly for responsive user experiences.
This specialization reflects a broader industry trend where cloud providers are tailoring AI infrastructure to distinct phases of the model lifecycle. As noted by industry analysts, the separation allows enterprises to avoid overpaying for training-grade capabilities when running inference workloads, improving cost efficiency and resource utilization.
The TPU 8t and TPU 8i chips are expected to be made generally available later in 2026 as part of Google’s integrated AI Hypercomputer architecture, which combines purpose-built hardware, software, and networking to support the full AI lifecycle from training to deployment.
Google emphasizes that these chips are not merely incremental upgrades but are co-designed at the system level to meet the specific operational demands of modern AI workloads, including mixture-of-experts models and agentic AI applications that require long context windows and complex reasoning.
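To make the mixture-of-experts workload concrete: in such models a small router selects only a few expert sub-networks per token, so compute per token stays modest while the total parameter count (and thus memory footprint) grows with the expert pool, which is part of what pushes accelerator designs toward large memory pools and fast interconnects. The following is a minimal illustrative sketch of top-2 expert routing in plain NumPy, not Google's implementation; all names and sizes are invented for illustration.

```python
import numpy as np

# Illustrative top-2 mixture-of-experts routing (toy sizes, random weights).
# Only the selected experts run per token, but ALL expert parameters must
# stay resident in memory -- the property that makes MoE memory-hungry.
rng = np.random.default_rng(0)
num_tokens, d_model, num_experts, top_k = 4, 8, 4, 2

tokens = rng.normal(size=(num_tokens, d_model))
gate_w = rng.normal(size=(d_model, num_experts))            # router weights
experts = rng.normal(size=(num_experts, d_model, d_model))  # one layer per expert

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

scores = softmax(tokens @ gate_w)              # (tokens, experts) gate probabilities
top = np.argsort(scores, axis=-1)[:, -top_k:]  # indices of the top-2 experts per token

out = np.zeros_like(tokens)
for t in range(num_tokens):
    for e in top[t]:
        # Blend each selected expert's output by its renormalized gate weight.
        w = scores[t, e] / scores[t, top[t]].sum()
        out[t] += w * np.tanh(tokens[t] @ experts[e])

print(out.shape)  # same shape as the input, but only 2 of 4 experts ran per token
```

The sketch shows why inference-oriented hardware like TPU 8i is described in terms of memory and responsiveness: serving such a model means keeping every expert's weights loaded even though each token touches only a fraction of them.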
By splitting its TPU offerings into training- and inference-optimized variants, Google aims to provide clearer performance and cost profiles for different stages of AI development, potentially simplifying fleet management and lowering total costs for model providers and enterprises alike.