Telegram Anti-Spam Bot Tab Launches With Custom Neural Network and Moderator Learning
AI-processed from Habr AI; edited by Hamidun News
Tab, a new anti-spam bot for Telegram, runs on the author's own neural network rather than a ready-made third-party model. The project has been running in chats for several months, remains free while in testing, and collects data for further retraining.
How the bot works
At the core of Tab is a binary message classifier: the bot decides whether a text is spam or not. The author did not use ready-made models from Hugging Face and designed the architecture himself, building on an LSTM. The reasoning is straightforward: short Telegram messages depend on context, and a recurrent network combined with an attention mechanism is a lighter, more manageable alternative to large general-purpose models.
Several rules run on top of the neural network. They help detect suspicious messages and also cut down on false bans. The bot separately checks whether the user is in a spammer database and, depending on the result, either deletes the message immediately or leaves the final decision to a moderator. This hybrid approach looks more practical than pure automation, because text classification can still get it wrong, especially in live chats full of conversational language.
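To make the "recurrent network plus attention" idea concrete, here is a minimal sketch of attention pooling over per-token hidden states, followed by a sigmoid that turns the pooled vector into a spam probability. It is not Tab's actual architecture, which the article does not detail. The LSTM itself is omitted: `hidden_states` stands in for its outputs, and `query`, `out_weights`, and `out_bias` are hypothetical parameters that would normally be learned.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states: List[Vector], query: Vector) -> Tuple[Vector, List[float]]:
    """Collapse per-token hidden states into one vector.

    Each token gets a weight from softmax(h . query); the pooled vector
    is the weighted sum of the hidden states.
    """
    weights = softmax([dot(h, query) for h in hidden_states])
    dim = len(hidden_states[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, hidden_states)) for i in range(dim)]
    return pooled, weights


def spam_probability(hidden_states: List[Vector], query: Vector,
                     out_weights: Vector, out_bias: float) -> float:
    """Binary classification head: sigmoid over a linear layer on the pooled vector."""
    pooled, _ = attention_pool(hidden_states, query)
    logit = dot(pooled, out_weights) + out_bias
    return 1.0 / (1.0 + math.exp(-logit))
```

In a trained model, the attention weights would tend to concentrate on the tokens that matter most for the decision, which is one reason this setup suits short messages.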
Data and training
The hardest part of the project turned out to be data preparation, not the bot code. The author found no recent ready-made dataset of Russian-language Telegram spam, so the corpus had to be assembled by hand: parsing public groups, going through chats overrun with spam, and labeling messages one by one. The dataset has since grown to more than 25,000 examples, and classification accuracy depends primarily on it.
The bot also has a built-in feedback mechanism for moderators. If a message is mistakenly flagged as spam, the moderator can confirm that it is legitimate, and the case is added to the dataset as a false positive. This does more than keep the chat clean: the model gradually improves on real borderline examples, the kind that usually degrade anti-spam quality.
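The feedback loop can be pictured as a dataset that records where each label came from. The sketch below is an illustration under assumptions: the class names, the `source` field, and the method names are all hypothetical, and the article does not describe how Tab actually stores its data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LabeledExample:
    text: str
    is_spam: bool
    source: str  # hypothetical tag: "manual" or "moderator_feedback"


@dataclass
class Dataset:
    examples: List[LabeledExample] = field(default_factory=list)

    def add_manual(self, text: str, is_spam: bool) -> LabeledExample:
        """Store an example labeled by hand during corpus collection."""
        example = LabeledExample(text, is_spam, source="manual")
        self.examples.append(example)
        return example

    def add_false_positive(self, text: str) -> LabeledExample:
        """Store a message the model flagged but a moderator confirmed as normal."""
        example = LabeledExample(text, is_spam=False, source="moderator_feedback")
        self.examples.append(example)
        return example

    def hard_negatives(self) -> List[LabeledExample]:
        """Borderline non-spam cases that came from moderator corrections."""
        return [e for e in self.examples
                if e.source == "moderator_feedback" and not e.is_spam]
```

Keeping the moderator-confirmed cases separable makes it possible to give them extra attention during retraining, for example by checking that a new model version no longer flags them.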
"I do not position this solution as a spam killer."
Modes and limitations
Tab currently supports two scenarios: a more cautious standard mode and a stricter automatic mode. In standard mode, the bot first runs the message through the model, then checks additional signals, including whether the user is in the spammer database. If confidence is too low, the decision goes to a human. This reduces the risk of punishing a regular chat member for a borderline message.
- In standard mode, a suspicious message can go to moderator review
- In automatic mode, spam is deleted immediately after the model triggers
- A ban generally requires two factors to coincide: a spam classification and presence in the database
- Users can report messages with the /spam command
- Chat admins can switch the bot's operating mode
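The rules listed above can be summarized in a small decision function. This is one possible reading of the article, not the bot's real code: the `0.5` threshold, the enum names, and the exact order of checks are assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    STANDARD = "standard"
    AUTOMATIC = "automatic"


class Action(Enum):
    ALLOW = "allow"
    REVIEW = "review"              # hand the decision to a moderator
    DELETE = "delete"
    DELETE_AND_BAN = "delete_and_ban"


SPAM_THRESHOLD = 0.5  # hypothetical cut-off, not taken from the article


def decide(spam_score: float, in_spammer_db: bool, mode: Mode,
           threshold: float = SPAM_THRESHOLD) -> Action:
    """Combine the model score with the spammer-database signal."""
    if spam_score < threshold:
        return Action.ALLOW
    # Both factors coincide: spam classification and presence in the database.
    if in_spammer_db:
        return Action.DELETE_AND_BAN
    # The model fired, but the user is not a known spammer.
    if mode is Mode.AUTOMATIC:
        return Action.DELETE
    return Action.REVIEW
```

For example, `decide(0.9, False, Mode.STANDARD)` returns `Action.REVIEW`, while the same score in automatic mode returns `Action.DELETE`.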
The main challenge for such systems is that spam itself evolves. Spammers disguise words with look-alike characters from other alphabets, insert spaces between letters, and change the wording and context of their messages. A model therefore cannot be trained once and left unattended: it needs a steady stream of new examples, retraining, and validation. The author plans a public dashboard with real-time statistics and further automation of labeling, since the manual step is currently the biggest limit on scalability.
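Two of the disguises mentioned here, look-alike letters and spaced-out words, can be partly undone before classification. The sketch below is a generic preprocessing idea, not something the article says Tab does. The homoglyph map is deliberately tiny and hypothetical, and the rule "four or more single letters separated by spaces" is an assumption that will occasionally merge legitimate text.

```python
import re

# Hypothetical, deliberately small map: Latin look-alikes -> Cyrillic letters.
LATIN_TO_CYRILLIC = str.maketrans({
    "a": "\u0430",  # Cyrillic a
    "c": "\u0441",  # Cyrillic es
    "e": "\u0435",  # Cyrillic ie
    "o": "\u043e",  # Cyrillic o
    "p": "\u0440",  # Cyrillic er
    "x": "\u0445",  # Cyrillic ha
    "y": "\u0443",  # Cyrillic u
})

CYRILLIC = re.compile(r"[\u0400-\u04FF]")
# Four or more single characters separated by single spaces.
SPACED_LETTERS = re.compile(r"(?<!\w)(?:\w ){3,}\w(?!\w)")
WORD = re.compile(r"\w+")


def collapse_spaced_letters(text: str) -> str:
    """Join runs of single letters separated by spaces into one word."""
    return SPACED_LETTERS.sub(lambda m: m.group(0).replace(" ", ""), text)


def fix_mixed_script(word: str) -> str:
    """Replace Latin look-alikes only in words that already contain Cyrillic."""
    if CYRILLIC.search(word):
        return word.translate(LATIN_TO_CYRILLIC)
    return word


def normalize(text: str) -> str:
    text = collapse_spaced_letters(text)
    return WORD.sub(lambda m: fix_mixed_script(m.group(0)), text)
```

Restricting the letter swap to mixed-script words leaves ordinary English text untouched. A fixed map like this still goes stale as spammers find new substitutions, which is consistent with the article's point that the model needs ongoing retraining.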
What this means
Tab shows that a working applied AI tool for a specific Telegram pain point can be built without a large team or heavy infrastructure. For the market, it is another signal that niche models paired with careful human moderation often deliver more useful results than trying to solve everything with one large general-purpose neural network.