
Suricata showed how to train ML-based attack detection systems on real traffic

Using Suricata and their own session_analyzer utility, the study's authors tested whether ML-based IDS could be trained not on lab attacks, but on production…

AI-processed from Habr AI; edited by Hamidun News

The signature-based IDS Suricata can serve not only as a detection tool but also as a source of labels for an ML attack-detection model. The authors of the study tested this idea on real corporate traffic and found a working, though not universal, way to train an ML IDS without staging artificial attacks on the protected resource.

How the experiment was set up

The experiment was deployed on a test bench at Ideco. One server received live company traffic and ran it through an NGFW with a modified Suricata IDS and current signatures. A second server analyzed the same traffic stream with the authors' own session_analyzer utility, which collected features for each network session.

The authors deliberately did not build a laboratory infrastructure with synthetic attacks: the goal was to find out whether a model could be trained directly on an already-running network and on real security events. Data collection ran for two weeks, from June 26 to July 10, 2025. After filtering, 55,548,971 network connections remained.

From the 118 original features, the authors kept the address information and the 10 most informative session characteristics, then matched the sessions against Suricata detections and assigned each one a label of Benign or Attack. The result was a binary dataset in which the role of "teacher" for the model was played not by people labeling data manually, but by an already-tuned signature-based IDS.
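The labeling step described above can be sketched as a join between session records and Suricata alerts. This is a minimal illustration, not the authors' code: the article does not describe the session_analyzer output format, so the `FlowKey` fields, the `label_sessions` helper, and the rule of matching on a direction-independent 5-tuple are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical 5-tuple identifying a network session."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str

    def normalized(self) -> "FlowKey":
        # Order the two endpoints so that an alert raised on a reply
        # packet still maps to the same key as the session itself.
        lo, hi = sorted([(self.src_ip, self.src_port),
                         (self.dst_ip, self.dst_port)])
        return FlowKey(lo[0], lo[1], hi[0], hi[1], self.proto)


def label_sessions(sessions, alerts):
    """sessions: list of (FlowKey, features); alerts: list of FlowKey.

    Returns (features, label) pairs, where label is "Attack" if any
    alert shares the session's normalized 5-tuple, else "Benign".
    """
    alerted = {key.normalized() for key in alerts}
    return [
        (features, "Attack" if key.normalized() in alerted else "Benign")
        for key, features in sessions
    ]


# Illustrative data only (documentation address ranges).
sessions = [
    (FlowKey("10.0.0.5", 50000, "203.0.113.7", 443, "tcp"), [1.2, 3.0]),
    (FlowKey("10.0.0.6", 50001, "198.51.100.2", 80, "tcp"), [0.4, 9.0]),
]
alerts = [FlowKey("198.51.100.2", 80, "10.0.0.6", 50001, "tcp")]
labeled = label_sessions(sessions, alerts)
```

Here only the second session receives the Attack label, because the alert was raised on the reverse direction of that flow. A join on addresses alone ignores timing, which is one of the weaknesses the next section discusses.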

Where the scheme breaks

The main problem turned out to lie not in algorithm selection, but in labeling quality. The event time in Suricata does not match the start time of the network connection: a detection can relate to a packet that arrives seconds after the session starts, and for slow attacks the gap exceeded 20 seconds. In addition, the same traffic could be observed both before and after the gateway, so one attack corresponded to two connections with different address information. If such cases are not accounted for, noise enters the dataset, and the model begins learning from contradictory examples. The study also lists further constraints:

  • not all Suricata SIDs are suitable for labeling, especially rules tied only to IP, SNI, or specific URLs;
  • for some attacks, including various types of port scanning, the current set of features is simply insufficient;
  • the training sample must cover at least one week of real traffic, including weekdays and weekends;
  • the model must be retrained when new attack types appear, signatures change, network infrastructure changes, or employee work patterns change.
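Two of the issues above, the lag between a session's start and the Suricata event, and signatures that are unsuitable for labeling, can be handled at the matching step. The sketch below is an illustration under stated assumptions: the `Session` and `Alert` fields, the SIDs in `EXCLUDED_SIDS`, and the 30-second slack are hypothetical (the slack is chosen only because the article reports gaps above 20 seconds). It does not address the duplicate view of traffic before and after the gateway.

```python
from dataclasses import dataclass

# Hypothetical SIDs judged unsuitable for labeling (for example, rules
# that match only on IP, SNI, or a specific URL).
EXCLUDED_SIDS = {2900001, 2900002}

# Hypothetical tolerance for alerts that arrive after the recorded
# end of a session.
SLACK_SECONDS = 30.0


@dataclass
class Session:
    flow: tuple      # direction-normalized 5-tuple
    start: float     # epoch seconds
    end: float       # epoch seconds
    label: str = "Benign"


@dataclass
class Alert:
    flow: tuple
    ts: float        # Suricata event time, epoch seconds
    sid: int


def apply_alerts(sessions, alerts, slack=SLACK_SECONDS):
    """Mark a session as Attack if a non-excluded alert on the same
    flow falls inside [start, end + slack], rather than requiring the
    alert time to equal the session start time."""
    by_flow = {}
    for s in sessions:
        by_flow.setdefault(s.flow, []).append(s)
    for a in alerts:
        if a.sid in EXCLUDED_SIDS:
            continue  # leave the matching sessions in the Benign class
        for s in by_flow.get(a.flow, []):
            if s.start <= a.ts <= s.end + slack:
                s.label = "Attack"
    return sessions


# Illustrative data only.
flow = ("10.0.0.6", 50001, "198.51.100.2", 80, "tcp")
labeled = apply_alerts(
    [Session(flow, 1000.0, 1040.0), Session(flow, 5000.0, 5010.0)],
    [Alert(flow, 1025.0, 2100000), Alert(flow, 5005.0, 2900001)],
)
```

The first session is labeled Attack even though the alert fires 25 seconds after its start. The second stays Benign because its only alert comes from an excluded SID.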

From this emerged the key finding about "bad" vectors: if two connections have matching or nearly matching features but different labels, classification quality drops sharply. Even a strong gradient-boosting model such as CatBoost does not help in this case. Some Suricata events help the model, while others only add false positives. In the end, it makes more sense to exclude some signatures from labeling and return the corresponding connections to the Benign class; otherwise the ML IDS inherits the errors of the underlying signature layer.
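One way to look for such "bad" vectors is to group connections by their feature values and flag groups that carry both labels. The sketch below is an assumption, not the method from the study: "nearly matching" is approximated by rounding each feature to `ndigits` decimals, and the row layout `(features, label, sid)` and the SIDs are hypothetical.

```python
from collections import defaultdict


def _key(features, ndigits):
    # Rounding is a crude stand-in for "nearly matching" features.
    return tuple(round(x, ndigits) for x in features)


def find_conflicts(rows, ndigits=2):
    """rows: iterable of (features, label, sid_or_None).

    Returns the set of rounded feature vectors that appear with more
    than one label."""
    labels_by_key = defaultdict(set)
    for features, label, _sid in rows:
        labels_by_key[_key(features, ndigits)].add(label)
    return {k for k, labels in labels_by_key.items() if len(labels) > 1}


def conflict_rate_by_sid(rows, ndigits=2):
    """For each SID, the share of its labeled connections whose feature
    vector also occurs with a different label. A high share suggests
    the signature is a candidate for exclusion from labeling."""
    conflicts = find_conflicts(rows, ndigits)
    total = defaultdict(int)
    bad = defaultdict(int)
    for features, _label, sid in rows:
        if sid is None:
            continue
        total[sid] += 1
        if _key(features, ndigits) in conflicts:
            bad[sid] += 1
    return {sid: bad[sid] / total[sid] for sid in total}


# Illustrative data only.
rows = [
    ([0.10, 5.0], "Benign", None),
    ([0.10, 5.0], "Attack", 2900001),
    ([0.101, 5.0], "Attack", 2900001),
    ([7.50, 1.0], "Attack", 2100000),
]
rates = conflict_rate_by_sid(rows)
```

In this toy data every connection labeled by SID 2900001 collides with a Benign vector, while SID 2100000 produces none, so the first signature would be the one to review.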

What the results showed

Despite all the limitations, the hypothesis was generally confirmed: a network-level ML IDS can be built on an already-operating network, using Suricata events as the label source. This is convenient because finely tuned signature rules filter out in advance a significant portion of the noisy alerts that operators would not respond to anyway. In this mode, Suricata becomes not just a detection system but also a quality filter for the training set.

The best practical result in the study was an F1-score of 0.98 with correct dataset labeling. But the authors honestly note the limits of the approach.

First, they solved a binary classification problem, but for a real NGFW this is insufficient: the business needs to understand what exact class of attack was detected and how to respond to it. Second, the experiment was conducted on a user company network, not on a specific protected service like a web server, so transferring findings to other networks requires separate verification.
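For reference, the F1-score quoted above is the harmonic mean of precision and recall for the positive class (here, Attack). The helper below is a generic sketch; the counts in the example are hypothetical and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision 0.90, recall 0.75.
example = f1_score(tp=90, fp=10, fn=30)
```

With these counts the score comes out at roughly 0.82, below both the precision and the arithmetic mean of the two, which is why mislabeled connections that hurt either precision or recall pull F1 down quickly.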

What this means

The study shows a practical path from signature-based protection to an ML model without an expensive test range and manual labeling of millions of sessions. But it also reminds us of the main point: in cybersecurity, the quality of ML begins not with algorithm selection, but with how carefully you connect real alerts, network features, and infrastructure context.
