SberZdorovye: neural network non-determinism is a pipeline failure, not a model property
AI-processed from Habr AI; edited by Hamidun News
Ruslan Cherkas, an architect at SberZdorovye, argues against the popular claim that neural networks are inherently non-deterministic. His main argument: if the input data, model weights, and environment are fixed, the system must produce the same result, and any discrepancy signals a failure in the pipeline, code, or infrastructure.
Where the dispute comes from
The analysis was prompted by a typical situation from ML practice: a team tries to reproduce an experiment but gets different metrics or a different model response. Such cases are often attributed to the very nature of neural networks, especially when LLMs, GPU training, and long chains of libraries and services are involved. Cherkas disputes precisely this explanation and proposes treating the problem more rigorously, as an engineering defect rather than an inevitable feature of the technology.
According to his logic, a mathematical model cannot be "random by itself" if all its arguments are known and unchanged. For a neural network, this means a fixed input, fixed weights, and identical execution conditions. Under these conditions, the computation must produce the same output every time. If it does not, then somewhere between the data, hardware, libraries, and algorithm there is an unaccounted variable that the team simply does not control.
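The argument can be illustrated with a minimal Python sketch. The tiny two-layer network, its weights, and the helper names `forward` and `output_digest` are hypothetical and do not come from the original article. The sketch only shows that a function of fixed input and fixed weights, run repeatedly in one environment, returns bit-identical output.

```python
import hashlib
import math
import struct

# Hypothetical fixed weights for a tiny 2-input, 2-hidden, 1-output network.
W1 = [[0.5, -0.25], [0.75, 0.125]]
B1 = [0.1, -0.2]
W2 = [0.3, -0.6]
B2 = 0.05


def forward(x):
    """Pure function: the output depends only on x and the constants above."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, B1)
    ]
    return sum(w * h for w, h in zip(W2, hidden)) + B2


def output_digest(value):
    """Hash the exact bit pattern of a float so the comparison is bit-for-bit."""
    return hashlib.sha256(struct.pack("<d", value)).hexdigest()


x = [1.0, 2.0]
digests = {output_digest(forward(x)) for _ in range(1000)}
print(len(digests))  # 1: every repeated call produced the same bits
```

Comparing digests of the raw bits rather than rounded values is a deliberate choice here: a tolerance-based comparison would hide exactly the small discrepancies the article treats as defects.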
Four sources of failures
The author breaks down the explanations most often used to justify fluctuating results and reduces them to four classes of problems. His overall view is strict: non-determinism is not a useful "feature" when it arises with no change in the input conditions. This matters not only for research but also for deploying models in production, where any unexplained discrepancy quickly becomes a risk.
- Undefined input data: the data itself, the initial weights, the seed, or internal state changes between runs without being controlled.
- Hardware failures: equipment defects, differences in the order of operations, or an unstable execution environment affect the result.
- Software discrepancies: library versions, optimization settings, caching, or other environment variables differ between runs.
- Algorithmic errors: an unfixed order of computation, race conditions, and incorrect parallelization break reproducibility. Floating-point addition is not associative, so the same values combined in a different order can give a different result.
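The last class in the list above is the easiest to demonstrate. The following sketch is an illustration rather than code from the article: it shows that grouping changes the result of floating-point addition, and that fixing a canonical order (here, sorting before summing, an assumption chosen for simplicity) restores agreement. The helper name `ordered_sum` is hypothetical.

```python
import random

# Floating-point addition is not associative: grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False


def ordered_sum(xs):
    """Add values strictly left to right, in the order given."""
    total = 0.0
    for v in xs:
        total += v
    return total


rng = random.Random(0)  # fixed seed so the example itself is reproducible
values = [rng.uniform(-1e6, 1e6) for _ in range(10_000)]
shuffled = values[:]
rng.shuffle(shuffled)

# Summing the two arrangements directly may or may not agree in the last
# bits. Imposing one canonical order removes the dependence on arrangement.
canonical_a = ordered_sum(sorted(values))
canonical_b = ordered_sum(sorted(shuffled))
print(canonical_a == canonical_b)  # True
```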
Cherkas separately emphasizes that references to "external factors" like quantum effects do not absolve developers of responsibility. If a factor influences the model's output, it must either be included in the arguments or isolated. Otherwise, this is not a philosophical question about the nature of AI, but an ordinary implementation error.
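One way to read "included in the arguments" is to pass every source of randomness explicitly. The sketch below is an assumption about how that could look, not the author's code; the function names and the dropout-style mask are hypothetical.

```python
import random


def dropout_mask_hidden_state(n, p=0.5):
    """Depends on the global RNG: an input the caller cannot see or pin."""
    return [0 if random.random() < p else 1 for _ in range(n)]


def dropout_mask(n, rng, p=0.5):
    """The random source is an explicit argument, so the result is a
    function of n, p, and the state of rng only."""
    return [0 if rng.random() < p else 1 for _ in range(n)]


a = dropout_mask(8, random.Random(42))
b = dropout_mask(8, random.Random(42))
print(a == b)  # True: same arguments, same mask
```

With the first variant, any other code that touches the global generator silently changes the mask; with the second, the dependency is visible in the signature and can be fixed or logged.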
The article also sums up the author's position in one line:
"Non-determinism is an error that must and can be eliminated."
How to achieve reproducibility
The practical conclusion from the article is simple: first recognize the problem, then localize its source. If a model behaves differently across identical runs, the team should break the incident down layer by layer: verify the data, compare the weights, fix the seed, and confirm that the hardware, library versions, and the rest of the runtime environment are identical. For production systems, this is no longer a question of convenience but of confidence in results and the ability to properly investigate failures.
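The layer-by-layer check described above can be sketched as a run fingerprint that is recorded with every experiment and compared when results diverge. This is a minimal illustration under stated assumptions: the layer names, the helpers `run_fingerprint` and `differing_layers`, and the package name `libx` are hypothetical, and real data or weights would be hashed from files rather than small Python objects.

```python
import hashlib
import json
import platform
import sys


def sha256_of(obj):
    """Stable digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_fingerprint(data, weights, seed, packages):
    """Record one entry per layer that could differ between runs."""
    return {
        "data": sha256_of(data),
        "weights": sha256_of(weights),
        "seed": seed,
        "packages": sha256_of(packages),
        "python": sys.version,
        "platform": platform.platform(),
    }


def differing_layers(fp_a, fp_b):
    """Return the names of layers whose fingerprints do not match."""
    return sorted(k for k in fp_a if fp_a[k] != fp_b.get(k))


# Two runs that differ only in the version of a hypothetical package.
run_a = run_fingerprint([[1, 2], [3, 4]], [0.5, -0.25], 42, {"libx": "1.0"})
run_b = run_fingerprint([[1, 2], [3, 4]], [0.5, -0.25], 42, {"libx": "1.1"})
print(differing_layers(run_a, run_b))  # ['packages']
```

An empty list from `differing_layers` does not prove the runs are identical; it only says that none of the recorded layers explains the difference, which points the investigation at something not yet captured.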
The author also cautions against blindly trading a fixed computation order for speed. If parallelization or optimization changes the order of operations so that results start to fluctuate, such an implementation cannot be considered correct for critical scenarios. This especially applies to systems where business decisions, medical recommendations, security, or other high-stakes processes depend on the model. In these cases, a deterministic pipeline should be a separate engineering goal, not a side effect of successful tuning.
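Parallelism and a fixed order of operations are not mutually exclusive. The sketch below is one possible approach, not the author's implementation: work is split into chunks whose boundaries do not depend on the number of workers, and partial results are combined in chunk order rather than completion order. The function names and the chunk size are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_sum(xs):
    """Add values strictly left to right."""
    total = 0.0
    for v in xs:
        total += v
    return total


def deterministic_parallel_sum(xs, chunk_size=1000, workers=4):
    """Sum in parallel while keeping the reduction order fixed.

    Chunk boundaries depend only on chunk_size, and partial sums are
    combined in chunk-index order, not in completion order.
    """
    chunks = [xs[i:i + chunk_size] for i in range(0, len(xs), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, whichever worker
        # happens to finish first.
        partials = list(pool.map(sequential_sum, chunks))
    return sequential_sum(partials)


values = [((i * 37) % 1001) / 7.0 for i in range(10_000)]
results = {deterministic_parallel_sum(values, workers=w) for w in (1, 2, 8)}
print(len(results))  # 1: the worker count does not change the result
```

The result may still differ in the last bits from a plain left-to-right sum over all values, and a different `chunk_size` can change it too. In the article's terms, the chunk size is one more argument that has to be fixed and recorded.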
What this means
SberZdorovye's material is useful because it moves the conversation about "neural network magic" into the realm of ordinary engineering. The more actively companies embed models in important processes, the less acceptable it is to explain unpredictability by the abstract nature of AI. In practice, the teams that come out ahead will be those that can demonstrate reproducibility, describe their sources of randomness, and prove that the system remains manageable even in complex scenarios.