Perfect Data Sorting with LLMs: Algorithms vs. Naivety

Many people who have tried to use large language models (LLMs) to sort data, for example to select the best item from a list, have been disappointed by the results. The problem often lies not in the model itself but in the approach to sorting. I recently ran an experiment comparing five sorting methods on 164 posts from my Telegram channel, and the results were revealing.

Khamidun Zhemal

AI monitoring · Habr AI

Jan 22, 2026 · 2 min

AI-processed from Habr AI; edited by Hamidun News



The naive approach, in which the LLM is simply asked to rate each item in the list and the items are then sorted by those ratings, often turns out to be ineffective. LLMs are prone to systematic errors, are not always consistent in their evaluations, and can be influenced by the order in which items appear in the list. Simply put, LLMs are not designed for direct sorting.

One interesting alternative I explored is the TrueSkill algorithm, originally developed for player matchmaking on Xbox Live. TrueSkill estimates players' skills from the results of their matches and uses those estimates to predict the probability of winning future games. In the context of data sorting, TrueSkill can be used to compare list items against one another and build a ranking from these comparisons.
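The prediction step can be sketched in a few lines. This is a minimal illustration, assuming the standard two-player, no-draw form of TrueSkill, in which each skill is a Gaussian with mean `mu` and standard deviation `sigma`, and assuming the conventional default noise value `beta = 25/6`. The function name `win_probability` is illustrative and is not taken from the original experiment.

```python
import math

BETA = 25.0 / 6  # performance noise; conventional TrueSkill default (assumed here)


def win_probability(mu_a, sigma_a, mu_b, sigma_b, beta=BETA):
    """Probability that item A beats item B under the Gaussian skill model."""
    # The difference in performance is Gaussian with this standard deviation.
    denom = math.sqrt(2 * beta**2 + sigma_a**2 + sigma_b**2)
    # Standard normal CDF evaluated at the scaled difference of the means.
    return 0.5 * (1.0 + math.erf((mu_a - mu_b) / (denom * math.sqrt(2))))
```

Two items with identical ratings get a probability of 0.5, and the value moves toward 1 as the gap between the means grows relative to the combined uncertainty.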

TrueSkill models each item's skill as a normal distribution, with a mean and a standard deviation that expresses how uncertain the estimate is. When two items are compared, the algorithm updates both distributions based on the outcome. This process is repeated for all pairs of items in the list until the skill distributions stabilize. The resulting means are then used to rank the items.
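The update loop described above might look roughly like the following. It is a simplified sketch, not the code used in the experiment: it assumes the two-player, no-draw TrueSkill update with the conventional defaults (`mu = 25`, `sigma = 25/3`, `beta = 25/6`), omits the dynamics term and any numerical safeguards for extreme upsets, and runs a fixed number of passes instead of testing for convergence. The names `Rating`, `update`, `rank_by_pairwise` and the `prefers` callback (a stand-in for asking the LLM which of two items is better) are all hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import combinations

MU0 = 25.0        # conventional default prior mean (assumed)
SIGMA0 = MU0 / 3  # conventional default prior standard deviation (assumed)
BETA = MU0 / 6    # performance noise (assumed)


@dataclass
class Rating:
    mu: float = MU0
    sigma: float = SIGMA0


def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def _cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))


def update(winner, loser, beta=BETA):
    """Return new (winner, loser) ratings after one decisive comparison."""
    c = math.sqrt(2 * beta**2 + winner.sigma**2 + loser.sigma**2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)  # size of the mean shift; larger for surprising results
    w = v * (v + t)        # variance reduction factor, always between 0 and 1

    def shifted(r, sign):
        new_mu = r.mu + sign * (r.sigma**2 / c) * v
        new_sigma = r.sigma * math.sqrt(1 - (r.sigma**2 / c**2) * w)
        return Rating(new_mu, new_sigma)

    return shifted(winner, +1), shifted(loser, -1)


def rank_by_pairwise(items, prefers, rounds=3):
    """Rate hashable items from pairwise comparisons and sort them by mean skill.

    prefers(a, b) must return True when a is judged better than b.
    """
    ratings = {item: Rating() for item in items}
    for _ in range(rounds):
        for a, b in combinations(items, 2):
            win, lose = (a, b) if prefers(a, b) else (b, a)
            ratings[win], ratings[lose] = update(ratings[win], ratings[lose])
    ranked = sorted(items, key=lambda i: ratings[i].mu, reverse=True)
    return ranked, ratings
```

A call such as `rank_by_pairwise(posts, prefers=ask_llm)` returns the items ordered by mean skill together with their ratings, where `ask_llm` is whatever comparison function wraps the model. Each rating's `sigma` shrinks with every comparison, which is one way to see how settled an item's position has become.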

In my experiment, TrueSkill performed significantly better than the naive approaches. Its rankings correlated more strongly with the real data and were less prone to systematic errors. However, TrueSkill needs a large number of comparisons to reach good accuracy, which can be a problem for large datasets.
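To put the cost in numbers: comparing every pair once grows quadratically with the list size, and each comparison is a separate LLM call. The sketch below counts the pairs for a full pass and shows one possible way to cap the cost by sampling a fixed budget of random pairs. The source does not say how many comparisons the experiment used or how pairs were chosen, so `sample_pairs` and its `budget` parameter are purely illustrative assumptions, and the accuracy trade-off of sampling is not measured here.

```python
import math
import random
from itertools import combinations


def round_robin_comparisons(n):
    """Number of comparisons needed to compare every pair of n items once."""
    return math.comb(n, 2)


def sample_pairs(items, budget, seed=0):
    """Pick at most `budget` distinct pairs at random (hypothetical cost cap)."""
    all_pairs = list(combinations(items, 2))
    rng = random.Random(seed)
    return rng.sample(all_pairs, min(budget, len(all_pairs)))


print(round_robin_comparisons(164))  # 13366 comparisons for the 164 posts
```

For the 164 posts in the experiment, a single full pass already means 13,366 comparisons, and repeating passes multiplies that figure.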

What conclusions can be drawn from this experiment? First, do not rely on naive approaches when sorting data with LLMs. Second, alternative algorithms such as TrueSkill can significantly improve results. Third, the right algorithm depends on the specific task and the size of the list. In the future, even more efficient algorithms designed specifically for sorting with LLMs may emerge, opening new opportunities for using LLMs in tasks that require accurate ranking and selection.

