Why the brain is hundreds of millions of times more efficient than GPT-4 and where neuromorphic chips are heading
The human brain expends millions of times less energy for cognitive acts than modern LLMs, and it's not just about hardware. The key difference lies in…
AI-processed from Habr AI; edited by Hamidun News
Comparing the human brain with modern LLMs reveals an uncomfortable fact for the AI industry: even the most powerful models remain extremely power-hungry. The brain runs on roughly 20 watts, while large language models can require kilowatts during inference and megawatts during training. Measured not by marketing benchmarks but by the cost of a single thought, the difference is colossal: biology still does comparable work orders of magnitude more cheaply than silicon.
The article begins its comparison with baseline numbers. The brain is estimated to perform about 10^16 synaptic operations per second while consuming around 20 watts. Modern LLMs reach a comparable computational scale on GPUs and TPUs, but the cost of each operation is much higher.
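The baseline figures can be turned into an energy budget per operation with one division. The sketch below uses only the two estimates quoted above. The result is an average over everything the brain's 20 watts pay for, so it is a rough yardstick rather than a measurement of a single synaptic event.

```python
# Order-of-magnitude estimates quoted in the article, not measurements.
BRAIN_POWER_W = 20.0      # watts
BRAIN_OPS_PER_S = 1e16    # synaptic operations per second


def energy_per_op(power_w: float, ops_per_s: float) -> float:
    """Average energy per operation in joules: power divided by throughput."""
    return power_w / ops_per_s


brain_j_per_op = energy_per_op(BRAIN_POWER_W, BRAIN_OPS_PER_S)
print(f"{brain_j_per_op:.1e} J per synaptic operation on average")
```

Under these estimates the average comes out at about 2 × 10^-15 joules, that is, a couple of femtojoules per operation.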
In terms of energy per operation, the author cites attojoules for the brain versus picojoules and higher for digital accelerators. The author also gives a more vivid example: to answer a simple question, such as the difference between methane and ethane, the brain activates only a small fraction of its neurons and spends roughly tenths of a joule, while GPT-4 must load a massive array of parameters and perform a huge volume of matrix operations. In this framing, the gap can reach a factor of hundreds of millions.
The reason is not that engineers simply have bad hardware; it lies in the computational principles themselves. The brain is an analog system: neurons and synapses operate on continuous gradients, membrane potentials, and ionic currents. A single biological element stores state and takes part in computation at the same time.
LLMs are different: data is represented as bits, computation is separated from memory, and each matrix operation breaks down into a long chain of digital switching events. The brain's second advantage is recurrence and temporal dynamics. The same neuron is engaged multiple times in processing a signal, and time becomes part of the computation.
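The idea that one element both stores state and computes, with time as part of the computation, can be illustrated with a leaky integrate-and-fire neuron, a standard textbook simplification. This is a minimal sketch: the function name and all parameter values (`tau`, `threshold`, `dt`) are illustrative assumptions, not figures from the article and not a model of real neurons.

```python
def simulate_lif(input_current, dt=1.0, tau=10.0, threshold=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron.

    Returns the list of time-step indices at which the neuron spiked.
    """
    v = 0.0          # membrane potential: the neuron's only stored state
    spikes = []
    for t, i_in in enumerate(input_current):
        # Euler step of dv/dt = (-v + i_in) / tau: leak toward zero, integrate input.
        v += dt * (-v + i_in) / tau
        if v >= threshold:
            spikes.append(t)   # the output is an event in time, not a number
            v = v_reset        # the potential resets after a spike
    return spikes


weak = simulate_lif([0.5] * 100)    # input too weak to ever reach the threshold
strong = simulate_lif([2.0] * 30)   # stronger input produces regular spikes
print(weak, strong)
```

With the weak input the neuron stays silent, while the strong input makes it fire periodically: the information is carried by when spikes occur, and nothing is emitted when there is nothing to report.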
A Transformer, by contrast, pushes each token through a fixed set of layers and pays for this with a huge number of parallel operations. The third difference is sparsity. In the brain, only a small fraction of neurons are active simultaneously, so the system doesn't waste energy on total network activation.
In LLMs, huge arrays of weights are engaged at every step, even if the task is relatively simple. The fourth factor is local learning. The biological system changes only the specific synapses involved in a new experience, rather than running global backpropagation through a gigantic network.
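Local learning can be sketched with a simplified Hebbian rule, in which a connection is strengthened only when the neurons on both sides of it are active. The function name, the learning rate, and the tiny network below are hypothetical; this illustrates the principle and makes no claim about the actual biological mechanism.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Strengthen weights[i][j] only where post neuron i and pre neuron j are both active.

    Modifies `weights` in place and returns the number of synapses changed.
    """
    changed = 0
    for i, post_active in enumerate(post):
        if not post_active:
            continue                    # silent output neuron: its row is untouched
        for j, pre_active in enumerate(pre):
            if pre_active:
                weights[i][j] += lr     # the update uses only locally available activity
                changed += 1
    return changed


weights = [[0.0] * 4 for _ in range(3)]   # 3 post neurons, 4 pre neurons
pre = [1, 0, 0, 1]                        # sparse input activity
post = [0, 1, 0]                          # sparse output activity
n_changed = hebbian_update(weights, pre, post)
print(n_changed, "of", 3 * 4, "synapses updated")
```

Only 2 of the 12 connections change here, whereas a backpropagation step would in general compute a gradient for every weight in the network.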
The fifth is the physics of the substrate itself: ion channels and biochemical processes operate near the thermodynamic minimum, while even advanced transistors switch with much greater losses. Finally, the brain gets part of its structure for free: the visual cortex, hippocampus, cerebellum, and other specialized modules came to it as a result of evolution, while LLMs must learn the structure of the world anew through massive datasets and very expensive training. This does not mean large models have no future.
Rather, the conclusion is that the current Transformer architecture has run into the energy cost of its own convenience. The industry is already looking for workarounds: quantization to 4–8 bits, sparse Transformers, mixture of experts, liquid and spiking networks. Some of these approaches already yield a 5–10x improvement, but that is not enough to approach the biological level.
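Of the workarounds listed, quantization is the easiest to show in a few lines. The sketch below uses symmetric 8-bit quantization with a single scale per tensor, which is one common scheme among several; the function names and sample weights are illustrative assumptions, and production libraries use more elaborate variants.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]          # hypothetical weight values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, f"max error {max_error:.4f}")
```

Each weight now fits in one byte instead of four, at the price of a rounding error of at most half the scale step. Less data moved per operation is where the energy saving comes from.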
Attention is therefore increasingly shifting toward neuromorphic hardware. Such systems already exist: SpiNNaker2 is deployed as a specialized supercomputer and can even be rented via the cloud, BrainChip Akida targets edge AI, and SynSense Xylo and Innatera Pulsar are aimed at microwatt-level and sensor scenarios. However, even the best of these solutions currently lag behind the brain by roughly three orders of magnitude in energy efficiency and require a completely different software stack.
The practical outlook is also sober. In 2026–2028, neuromorphic chips are most likely to gain ground in robotics, industrial controllers, sensors, and autonomous systems, where latency and energy consumption are critical. Consumer electronics such as smartphones and watches, if they get such coprocessors at all, will do so closer to 2030 and beyond.
The main bottleneck here is not only chip manufacturing but also software: familiar tools like PyTorch and TensorFlow don't work with spiking networks without serious adaptation, and there's no universal training standard for such systems yet. The main conclusion is simple: the brain today is not just smarter at individual tasks, but radically more economical as a computing machine. Therefore, the next big leap in AI will probably come not from an even larger LLM on an even bigger GPU cluster, but from a shift in the fundamental computational paradigm.
For now, GPT-4 and its successors remain a very powerful, but energetically expensive way to obtain intelligence-like behavior.