Eltex engineer explains how to run federated learning on edge devices with less than 256 MB of memory
Eltex engineer Alexander Loshkarev published an article about federated learning on edge devices with less than 256 MB of RAM. The focus is not on FL theory…
AI-processed from Habr AI; edited by Hamidun News
Federated learning is typically discussed in the context of smartphones, cars, and large IoT networks, but in practice the main barrier is often far more mundane: the device simply doesn't have enough memory. This is what Eltex engineer Alexander Loshkarev focuses on in his article, which is based on a talk for AiConf. The topic sounds narrow, but it concerns almost any project where ML has to move from the cloud to the network edge and run on hardware with very modest resources.
We're talking about scenarios where an edge device has less than 256 MB of RAM. For server teams this looks like an almost extreme constraint, but for industrial electronics, gateways, telecom equipment, embedded systems and specialized controllers such a configuration is quite realistic. Under such conditions the task is no longer simply about taking a ready-made model and loading it into memory.
The model, its data, buffers, system processes and the update-exchange logic all have to fit in memory at once without destabilizing the device. Federated learning is interesting in this context because it allows models to be trained or fine-tuned without sending raw data to a central server. Instead, computation happens locally, and only the parameters or their changes leave the device.
This approach gives better control over privacy, reduces dependence on a permanent network connection and makes edge scenarios more viable. But there is a flip side: the local FL client itself needs memory, compute and a carefully organized pipeline. The weaker the device, the more strictly every megabyte has to be budgeted.
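To make "budgeting every megabyte" concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from Loshkarev's article: the 160 MiB reserved for the OS and other services, the 5-million-parameter model, the 20 MiB for activations and the helper names are all hypothetical figures chosen for illustration. It assumes 4-byte parameters and counts one copy each for weights and gradients, plus optional optimizer state.

```python
# Hypothetical memory budget for an FL client on a 256 MiB device.
# Every figure here is an illustrative assumption, not a measurement.
MIB = 1024 * 1024


def training_footprint_bytes(n_params, bytes_per_param=4, optimizer_slots=0):
    """Rough lower bound: weights + gradients + per-parameter optimizer state."""
    copies = 1 + 1 + optimizer_slots  # weights, gradients, optimizer slots
    return n_params * bytes_per_param * copies


def fits(total_ram_mib, reserved_mib, n_params, activations_mib, optimizer_slots=0):
    """Return (fits?, needed MiB) for a training step under the given budget."""
    available = (total_ram_mib - reserved_mib) * MIB
    needed = (
        training_footprint_bytes(n_params, optimizer_slots=optimizer_slots)
        + activations_mib * MIB
    )
    return needed <= available, needed / MIB


# Plain SGD keeps no extra state; an Adam-like optimizer keeps two slots per parameter.
sgd_ok, sgd_mib = fits(256, 160, 5_000_000, 20, optimizer_slots=0)
adam_ok, adam_mib = fits(256, 160, 5_000_000, 20, optimizer_slots=2)
print(f"SGD:  fits={sgd_ok}, needs about {sgd_mib:.1f} MiB")
print(f"Adam: fits={adam_ok}, needs about {adam_mib:.1f} MiB")
```

Under these assumed numbers, the same model fits with plain SGD but not with a two-slot optimizer. The choice of optimizer alone can decide whether local training is possible.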
Judging by the description of the first part, the article addresses the engineering side of this problem rather than abstract theory. For teams deploying ML at the edge, this is the most painful area: a model can be accurate in the lab yet useless in production if it doesn't fit in memory or degrades other services on the device. On such hardware, what matters is not only the size of the weights but also the temporary spikes in memory consumption during inference, batch preparation, update serialization and network exchange.
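One way to see such spikes is to measure the peak rather than the steady-state footprint. The sketch below uses Python's standard `tracemalloc` module. The `peak_bytes` helper, the `serialize_update` function and the 250,000-value weight array are hypothetical stand-ins, not anything described in the article. Note that `tracemalloc` only sees allocations made through Python's allocators, so a native ML runtime may use memory outside its view.

```python
import tracemalloc
from array import array


def peak_bytes(fn, *args):
    """Run fn(*args) and return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = fn(*args)
        _, peak = tracemalloc.get_traced_memory()  # (current, peak)
    finally:
        tracemalloc.stop()
    return result, peak


def serialize_update(w):
    """Hypothetical serialization step: copy the weights into a bytes object."""
    return w.tobytes()


# Stand-in for model weights: 250,000 single-precision floats,
# allocated before tracing starts so they are not counted in the peak.
weights = array("f", [0.0]) * 250_000

blob, peak = peak_bytes(serialize_update, weights)
print(f"payload: {len(blob)} bytes, peak during serialization: {peak} bytes")
```

Even this trivial serialization temporarily needs at least a second full copy of the weights, which is exactly the kind of transient cost that a static look at model size does not reveal.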
Even if a model looks compact on disk, its runtime behavior can make it impossible to run. In this sense, the framing of a device for which 1 GB sounds like a luxury accurately describes the gap between a typical ML stack and the real embedded world. Many tools and practices familiar from server development simply don't work here without adaptation.
You can't keep increasing the batch size, hold extra tensor copies or rely on ample spare system memory. Any mistake in estimating the resource profile quickly turns into restarts, hangs or the loss of the very function the model was meant to provide. Importantly, this is not just about running inference on a small device, but specifically about federated learning.
This is a more complex mode: the system has to periodically receive the global model, run training steps locally, store intermediate state and send the result back. With limited memory, literally everything has to be reconsidered: model size, data representation format, synchronization frequency, length of local training sessions, and sometimes even the client architecture itself. The announcement makes clear that the author frames the question correctly: before discussing model quality, you need to know whether the model can run on real edge hardware at all without failures and constant trade-offs in reliability.
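The round described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: it uses a toy linear model, a batch size of 1, plain SGD on squared error and an unweighted FedAvg-style average on the server. The function names `local_round` and `apply_average` are hypothetical.

```python
from array import array


def local_round(global_weights, samples, lr=0.1, epochs=1):
    """One client round: copy the global model, train locally, return the delta."""
    w = array("d", global_weights)  # the only working copy of the weights
    for _ in range(epochs):
        for x, y in samples:  # batch size 1: no batch buffer is held
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            for i, xi in enumerate(x):  # in-place SGD step on squared error
                w[i] -= lr * err * xi
    # Only the change in parameters leaves the device, never the samples.
    return array("d", (wi - gi for wi, gi in zip(w, global_weights)))


def apply_average(global_weights, deltas):
    """Server side (assumed unweighted averaging): add the mean client delta."""
    n = len(deltas)
    return array(
        "d",
        (g + sum(d[i] for d in deltas) / n for i, g in enumerate(global_weights)),
    )


# Two hypothetical clients, each with one private sample.
global_w = array("d", [0.0, 0.0])
delta_a = local_round(global_w, [([1.0, 0.0], 2.0)], lr=0.5)
delta_b = local_round(global_w, [([0.0, 1.0], 4.0)], lr=0.5)
new_global = apply_average(global_w, [delta_a, delta_b])
print(list(new_global))
```

The memory-relevant choices here are deliberate: a single working copy of the weights, in-place updates, and no accumulated batch. A real client would also have to account for the received global model, the serialized update and the network buffers.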
For the market this is an important signal. Interest in AI at the network edge is growing, but real deployments are decided by memory, energy and resilience constraints, not by polished demos. That is why articles like this are useful not only to ML engineers but also to backend, embedded and product teams: they bring the conversation back from the level of promises to the level of systems engineering.
Although the first part only frames the problem, the main conclusion is already clear: in edge ML, victory goes not to the trendiest model but to the one the device can actually sustain.