PHP and RubixML transition from arrays to GPU: how the ecosystem's approach to ML is changing

AI-processed from Habr AI; edited by Hamidun News

The PHP ecosystem is steadily abandoning the idea that serious machine learning can be built on ordinary arrays. The main shift right now is not in new algorithms but in the role of the language itself: PHP is ceasing to be a computational engine and is becoming an orchestrator that launches pipelines, links libraries, and delegates heavy mathematics to native extensions and GPUs. Against this backdrop, the pivot of RubixML and related projects such as Tensor and NumPower is telling. For matrix operations, the difference is measured not in percentages but in orders of magnitude: from tens of seconds on arrays to fractions of a second on native structures and experimental GPU backends.

Initially, PHP's path into ML looked quite logical. The first libraries, including PHP-ML and early versions of RubixML, stored matrices as arrays of arrays and performed all operations directly in the interpreter. For developers, this was convenient: nothing needed to be compiled, the code was transparent, it was easy to debug, and basic algorithms like k-NN, logistic regression, or simple classifiers could be assembled in literally an evening.
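
As a rough illustration of that style, here is a minimal sketch of interpreter-level linear algebra. It is written in Python rather than PHP purely to keep the example self-contained: nested lists stand in for PHP arrays of arrays, and the function names `dot` and `matmul_nested` are hypothetical, not taken from PHP-ML or RubixML.

```python
# Illustrative sketch only: nested lists play the role of PHP "arrays of
# arrays". The names dot and matmul_nested are hypothetical.

def dot(u, v):
    """Dot product of two equal-length vectors, one multiply-add per step."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for x, y in zip(u, v):
        total += x * y
    return total


def matmul_nested(a, b):
    """Multiply matrices stored as lists of rows using plain nested loops."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for k in range(inner):
                # Every access goes through the interpreter and a boxed value.
                total += a[i][k] * b[k][j]
            result[i][j] = total
    return result
```

Every arithmetic step in such code is executed by the interpreter, which makes it easy to read and debug. It also means the work grows with the cube of the matrix size without any help from native code.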

This approach is still useful for education and prototyping, because it lets you work through the entire path by hand, from the dot product to the forward pass of the simplest neural network. But as soon as the data grows, convenience quickly turns into a bottleneck. The cause is not badly written loops or a poor implementation in any particular library, but PHP's internal data model itself.

In PHP, numbers are stored not as compact primitives but as heavier zval structures. Arrays are, in the general case, organized as hash tables rather than dense contiguous blocks of memory. As a result, each element access costs more, the CPU cache is used less efficiently, there is no automatic SIMD vectorization, and the copy-on-write mechanism can unexpectedly bloat memory and add unnecessary allocations.

Even useful tricks such as transposing a matrix in advance or hoisting count() calls out of loops give only limited gains. They make the code somewhat less slow, but they do not turn PHP into an environment for efficient linear algebra. The next stage of evolution began when developers abandoned the very idea of storing matrices as ordinary PHP arrays.

This is how Rubix Tensor and NDArray appeared: in both, data lies in contiguous memory, closer to how it is arranged in NumPy. NDArray uses Rust for this, which improves performance while avoiding some of the typical problems of manual memory management. From the outside the API remains familiar, but internally almost everything changes: the per-element zval disappears, the hash-table model goes away, and the code reads closer to mathematical notation.
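
The contiguous layout described above can be sketched as follows. This is a Python illustration under stated assumptions, not the actual implementation of Tensor or NDArray: the class name `DenseMatrix` and its methods are hypothetical, and the standard-library `array` module stands in for a native buffer of doubles.

```python
from array import array


class DenseMatrix:
    """Hypothetical matrix kept in one contiguous row-major buffer of doubles."""

    def __init__(self, rows, cols, values=None):
        self.rows, self.cols = rows, cols
        data = [0.0] * (rows * cols) if values is None else list(values)
        if len(data) != rows * cols:
            raise ValueError("expected rows * cols values")
        # array("d") stores raw C doubles back to back, with no per-element box.
        self.data = array("d", data)

    def get(self, i, j):
        # Row-major addressing: element (i, j) lives at offset i * cols + j.
        return self.data[i * self.cols + j]

    def matmul(self, other):
        if self.cols != other.rows:
            raise ValueError("inner dimensions do not match")
        out = DenseMatrix(self.rows, other.cols)
        for i in range(self.rows):
            out_base = i * other.cols
            for k in range(self.cols):
                a_ik = self.data[i * self.cols + k]
                b_base = k * other.cols
                # Innermost loop walks both buffers sequentially (cache friendly).
                for j in range(other.cols):
                    out.data[out_base + j] += a_ik * other.data[b_base + j]
        return out
```

The sketch only demonstrates the memory layout and index arithmetic. In a native extension the same loops run in compiled code over such a buffer, which is where the real speedup comes from; the Python loops here remain interpreter-bound.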

In practical comparisons, this is visible immediately: multiplying two 500×500 matrices takes approximately 10–20 seconds on arrays, about 0.3–0.8 seconds with Tensor on the CPU, and around 0.05–0.2 seconds on the experimental GPU path through NumPower. But native structures quickly showed a new ceiling: even a well-optimized CPU remains a CPU.

For embeddings, neural networks, and large matrix operations, this is no longer enough, so the logical next step was a shift toward the GPU. Around RubixML v3 and NumPower, a new model is forming in which PHP is responsible for orchestration, Tensor and NumPower provide the computational layer, and the GPU becomes the place where heavy mathematics lives. This aligns well with how higher-level tools such as transformers-php, LLPhant, or Neuron AI work today: the PHP code describes the pipeline, model loading, inference calls, and result processing, but it does not try to compute matrices in hand-written loops when more suitable runtimes already exist for that.
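
The division of labor described above can be sketched as a pipeline that owns the control flow and hands the arithmetic to a pluggable backend. This is a minimal Python illustration, not the API of RubixML, Tensor, or NumPower: `ComputeBackend`, `InterpreterBackend`, and `Pipeline` are hypothetical names, and a native or GPU extension would take the place of the slow fallback shown here.

```python
from typing import List, Protocol, Sequence

Matrix = Sequence[Sequence[float]]


class ComputeBackend(Protocol):
    """Hypothetical interface a native or GPU extension would implement."""

    def matmul(self, a: Matrix, b: Matrix) -> Matrix: ...


class InterpreterBackend:
    """Slow fallback that does the arithmetic in the host language."""

    def matmul(self, a: Matrix, b: Matrix) -> Matrix:
        columns = list(zip(*b))  # transpose b once so rows pair with columns
        return [[sum(x * y for x, y in zip(row, col)) for col in columns]
                for row in a]


class Pipeline:
    """Orchestrator: owns the flow, never the heavy math."""

    def __init__(self, backend: ComputeBackend, weights: Matrix):
        self.backend = backend
        self.weights = weights

    def predict(self, batch: Matrix) -> List[int]:
        # Heavy step: delegated to whichever backend was injected.
        scores = self.backend.matmul(batch, self.weights)
        # Light step: result processing stays in the orchestrating language.
        return [max(range(len(row)), key=row.__getitem__) for row in scores]
```

Because the pipeline depends only on the interface, swapping the fallback for a faster backend would not change the calling code, for example `Pipeline(InterpreterBackend(), weights).predict(batch)`.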

This means that PHP's opportunity in AI is not to catch up with Python as a computational environment, but to occupy a clear niche of its own. The language remains strong wherever ML or LLM functions need to be embedded quickly into a web application, a SaaS product, an agent scenario, or a corporate pipeline. To get there, the ecosystem had to accept an unpleasant but honest conclusion: to take part in modern ML, PHP does not need to calculate everything itself.

Its value is increasingly in being the glue between models, data, infrastructure, and interface, rather than being a replacement for GPUs or low-level libraries.

