Perfect Data Sorting with LLMs: Algorithms vs. Naivety
Sorting data with LLMs often gives mediocre results. The author compared 5 methods using a Telegram channel as an example, showing that the right algorithm, not the model,
AI-processed from Habr AI; edited by Hamidun News
Many people who have tried to sort data with large language models (LLMs), for example to pick the best item from a list, have been disappointed by the results. The problem is not always the model itself; often it is the approach to sorting. I recently ran an experiment comparing five sorting methods on 164 posts from my Telegram channel, and the results were revealing.
The naive approach, in which the LLM is asked to rate each item in the list and the items are then sorted by rating, often works poorly. LLMs are prone to systematic errors, are not always consistent in their ratings, and can be influenced by the order in which items appear in the list. Simply put, LLMs are not designed for direct sorting.
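To make the naive approach concrete, here is a minimal Python sketch of it. It is an illustration under assumptions, not the author's code: `naive_rank` is a hypothetical name, and `score` stands in for whatever LLM call returns a numeric rating for a single item.

```python
from typing import Callable, List, Sequence


def naive_rank(items: Sequence[str], score: Callable[[str], float]) -> List[str]:
    """Rate every item independently, then sort by rating (highest first).

    `score` is a hypothetical stand-in for an LLM call that returns a
    numeric rating for one item.
    """
    rated = [(score(item), item) for item in items]
    # list.sort() is stable, so items with equal ratings keep their
    # original input order.
    rated.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in rated]
```

Each item is rated in isolation here, so any inconsistency in the ratings goes straight into the final order, and items that receive the same rating cannot be told apart.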
One interesting alternative I explored is TrueSkill, an algorithm originally developed for the player matchmaking system in Xbox Live. TrueSkill estimates players' skills from the results of their matches and uses those estimates to predict the probability of winning future games. For data sorting, it can be used to compare list items with each other and build a ranking from these comparisons.
TrueSkill models each item's skill as a normal distribution with a mean (the estimated skill) and a standard deviation (the uncertainty of that estimate). When two items are compared, the algorithm updates both distributions based on the result. This process is repeated for all pairs of items in the list until the skill distributions stabilize. The resulting means are then used to rank the items.
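The update described above can be sketched in plain Python. This is a simplified illustration, not the author's implementation: it covers only the two-item case with no draws and omits TrueSkill's dynamics term. The names `Rating`, `update`, `rank`, and `prefer` are hypothetical; `prefer(a, b)` stands in for an LLM call that returns `True` when it judges `a` better than `b`. The starting values (mean 25, standard deviation 25/3, beta 25/6) are the conventional TrueSkill defaults.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Hashable, List, Sequence, Tuple

MU0 = 25.0        # conventional TrueSkill prior mean
SIGMA0 = MU0 / 3  # conventional prior standard deviation
BETA = MU0 / 6    # performance noise assumed for a single comparison


@dataclass(frozen=True)
class Rating:
    mu: float = MU0
    sigma: float = SIGMA0


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal CDF; erfc stays accurate in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def update(winner: Rating, loser: Rating) -> Tuple[Rating, Rating]:
    """Two-item TrueSkill update, assuming no draws and no dynamics term."""
    c2 = 2.0 * BETA ** 2 + winner.sigma ** 2 + loser.sigma ** 2
    c = math.sqrt(c2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)  # shifts the means; larger for surprising results
    w = v * (v + t)        # shrinks the variances; always between 0 and 1

    def shift(r: Rating, sign: float) -> Rating:
        var = r.sigma ** 2
        return Rating(r.mu + sign * var / c * v,
                      math.sqrt(var * (1.0 - var / c2 * w)))

    return shift(winner, 1.0), shift(loser, -1.0)


def rank(items: Sequence[Hashable],
         prefer: Callable[[Hashable, Hashable], bool],
         rounds: int = 1) -> List[Hashable]:
    """Compare every pair `rounds` times, then sort by mean (best first).

    Items must be unique and hashable.
    """
    ratings = {item: Rating() for item in items}
    for _ in range(rounds):
        for a, b in combinations(items, 2):
            win, lose = (a, b) if prefer(a, b) else (b, a)
            ratings[win], ratings[lose] = update(ratings[win], ratings[lose])
    return sorted(items, key=lambda item: ratings[item].mu, reverse=True)
```

With two fresh ratings, a single comparison moves the winner's mean to about 29.2 and the loser's to about 20.8, and both standard deviations shrink from about 8.33 to about 7.19. Later comparisons move the means less as the uncertainty decreases, which is what lets the distributions settle.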
In my experiment, TrueSkill performed significantly better than the naive approaches. It correlated more strongly with real data and was less prone to systematic errors. However, TrueSkill needs a large number of comparisons to reach good accuracy, which can be a problem for large datasets.
What conclusions can be drawn from this experiment? First, do not rely on naive approaches when sorting data with LLMs. Second, alternative algorithms such as TrueSkill can significantly improve results. Third, the right algorithm depends on the specific task and the size of the list. In the future, even more efficient algorithms designed specifically for sorting with LLMs may emerge, opening new opportunities for using LLMs in tasks that require accurate ranking and selection.
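To give a sense of scale, here is the arithmetic for the simplest case, assuming every pair of items is compared exactly once. Whether the experiment actually compared every pair is not stated, so treat this as an upper-bound illustration rather than a description of the author's setup.

```latex
\binom{n}{2} = \frac{n(n-1)}{2},
\qquad
\binom{164}{2} = \frac{164 \cdot 163}{2} = 13\,366
```

Because the count grows quadratically, doubling the list size roughly quadruples the number of comparisons.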