Google unveils new AI chips for inference and challenges Nvidia
Google is developing a new generation of AI chips focused on inference — a direct challenge to Nvidia. Cerebras announced IPO plans months after withdrawing…
AI-processed from Bloomberg Tech; edited by Hamidun News
Google is challenging Nvidia: the company is developing a new generation of its own AI chips focused on inference, the final stage of neural network operation, when a trained model processes user requests in real time. Inference chips are fundamentally different from those needed for model training. Training a large language model takes months of computation on thousands of GPUs and is essentially a one-time capital expenditure.
Inference, by contrast, is a constant load: every time a user sends a request to Gemini, ChatGPT, or any other AI service, a chip performs inference. As AI applications reach hundreds of millions of users, inference becomes the largest expense for technology companies. By some estimates, inference costs will exceed training costs by a factor of three to four by 2027.
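The arithmetic behind that shift can be sketched in a few lines of Python. This is only an illustration: the function names and every figure below are hypothetical placeholders chosen for round numbers, not data from Google, Nvidia, or the estimates cited above.

```python
def cumulative_inference_cost(cost_per_million_requests, million_requests_per_day, days):
    """Total inference spend after `days` of serving, in dollars."""
    return cost_per_million_requests * million_requests_per_day * days


def break_even_day(training_cost, cost_per_million_requests, million_requests_per_day):
    """First whole day on which cumulative inference spend reaches the one-time training spend."""
    daily_cost = cost_per_million_requests * million_requests_per_day
    # Integer ceiling division, to avoid floating-point rounding surprises.
    return -(-training_cost // daily_cost)


# Hypothetical placeholder inputs (assumptions, not sourced figures):
TRAINING_COST = 100_000_000         # one-time training run, in dollars
COST_PER_MILLION_REQUESTS = 2_000   # dollars per million requests served
MILLION_REQUESTS_PER_DAY = 500      # i.e. 500 million requests per day

day = break_even_day(TRAINING_COST, COST_PER_MILLION_REQUESTS, MILLION_REQUESTS_PER_DAY)
yearly = cumulative_inference_cost(COST_PER_MILLION_REQUESTS, MILLION_REQUESTS_PER_DAY, 365)
print(f"Inference spend matches training spend on day {day}")
print(f"After one year, inference spend is {yearly / TRAINING_COST:.2f}x the training spend")
```

The point of the sketch is the shape of the curves rather than the specific values: training cost is a fixed amount, while inference cost grows linearly with both traffic and time, so at a large enough user base it overtakes training regardless of the exact inputs.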
Google has long been building its own silicon strategy. The company has been developing Tensor Processing Units (TPUs) since 2015, long before AI chips became front-page news. Until now, these processors have been used mainly within Google Cloud and for training Gemini models.
Now the company intends to create specialized hardware specifically for inference — with higher throughput and lower energy consumption per request. For Nvidia, whose H100 and H200 chips have become the standard for data centers worldwide, this is a direct challenge. Google is one of Nvidia's largest customers in the world, and shifting even part of the load to proprietary hardware means significant losses for the Santa Clara-based company.
In parallel, another event is brewing in the AI chip sector: Cerebras Systems announced plans to go public. The company is known for its flagship product, the Wafer-Scale Engine, which is essentially an entire silicon wafer functioning as a single processor. This architecture eliminates the latency of transferring data between separate chips and speeds up the processing of large models.
Cerebras previously filed for an IPO but withdrew the application. Its return to the IPO path is a signal to the market: despite cooling sentiment among some AI investors, alternative chip architectures are still seen as promising assets. The company positions its products as particularly effective for running open models in closed corporate environments, a rapidly growing segment driven by security and data sovereignty requirements.
The third story of the week comes from space. Blue Origin successfully launched its New Glenn rocket and landed the reusable booster, an important technical milestone toward reducing launch costs. However, the payload, an AST SpaceMobile satellite, did not reach its intended orbit.
AST SpaceMobile is building a global broadband network that connects directly to ordinary smartphones, with no special terminals required. The orbital failure immediately hit the stock price: the company's shares fell. The market was reminded once again that in space, technical success and commercial success are different things.
Together, the three stories paint a portrait of the technology economy of the mid-2020s. The AI chip race is moving beyond the Nvidia and AMD duopoly: technology giants with their own silicon and specialized startups like Cerebras are entering the battle for infrastructure. The question of who will control the physical infrastructure of the AI era remains open.