GLiGuard by Fastino Labs: a safety model 16x faster than larger competitors
Fastino Labs released GLiGuard — a compact open model for moderating LLM inputs and outputs that is faster and more accurate than far larger competitors. It has only 300 million parameters, yet its results match those of models 90 times larger.
Four Tasks in One Pass
GLiGuard solves four critically important safety tasks:
- Prompt safety checks — catches potentially dangerous inputs
- Jailbreak detection — finds attempts to bypass model restrictions
- Harm type classification — determines what kind of harm might occur
- Refusal detection — verifies that the model properly refused to respond
All four checks run in a single forward pass, which is the main source of the model's speed advantage.
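The single-pass design can be pictured as one shared encoder feeding four small classification heads. The sketch below is an illustrative toy — keyword features and hand-set weights standing in for a real encoder — and not GLiGuard's actual architecture or API:

```python
# Toy sketch of multi-head safety classification over one shared encoding.
# All names (encode, HEADS, guard) are hypothetical, not GLiGuard's API.

def encode(text: str) -> list[float]:
    """Stand-in for an encoder's pooled embedding (here: keyword features)."""
    lowered = text.lower()
    return [
        float("ignore previous" in lowered),  # jailbreak-style signal
        float("how to make" in lowered),      # harmful-request signal
        float("i can't help" in lowered),     # refusal signal
        len(text) / 100.0,                    # length feature (unused by heads)
    ]

def linear_head(emb: list[float], weights: list[float], bias: float) -> float:
    """One lightweight classification head on top of the shared embedding."""
    return sum(e * w for e, w in zip(emb, weights)) + bias

# Four heads, four tasks — all reuse the same embedding.
HEADS = {
    "prompt_safety": ([0.0, 1.0, 0.0, 0.0], -0.5),
    "jailbreak":     ([1.0, 0.0, 0.0, 0.0], -0.5),
    "harm_type":     ([0.5, 0.5, 0.0, 0.0], -0.5),
    "refusal":       ([0.0, 0.0, 1.0, 0.0], -0.5),
}

def guard(text: str) -> dict[str, bool]:
    emb = encode(text)  # the single "forward pass"
    return {name: linear_head(emb, w, b) > 0.0 for name, (w, b) in HEADS.items()}
```

Because the heads share one encoding, adding a task costs only a small extra head, not another full pass over the input.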
Encoder Instead of Decoder
Most current guardrail models use a decoder-only architecture, like typical LLMs. Fastino Labs took a different approach and built GLiGuard on an encoder foundation. This reduced latency by 16.6x and increased throughput 16-fold, while maintaining accuracy across nine different safety benchmarks.
"An encoder is inherently better suited to classification tasks than a decoder-only architecture," as Fastino Labs' experience demonstrates.
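Part of the intuition is easy to see by counting forward passes: a decoder-only guardrail must generate its verdict autoregressively, one token per pass, while an encoder classifier scores everything in exactly one pass. The numbers below are purely illustrative (not Fastino's benchmarks):

```python
def guardrail_latency_ms(passes: int, ms_per_pass: float) -> float:
    """Total latency = number of forward passes x cost per pass."""
    return passes * ms_per_pass

# Illustrative only: a large decoder spelling out a verdict like
# "unsafe: jailbreak" in ~8 tokens vs. a small encoder's single pass.
decoder_ms = guardrail_latency_ms(passes=8, ms_per_pass=40.0)  # 320.0 ms
encoder_ms = guardrail_latency_ms(passes=1, ms_per_pass=20.0)  # 20.0 ms
speedup = decoder_ms / encoder_ms                              # 16.0x
```

The gap compounds: the decoder pays both for more passes and for a higher per-pass cost at larger parameter counts.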
What This Means
The efficiency gain is substantial. Previously, reliably checking LLM safety required a large model. Now GLiGuard can be deployed locally, on edge devices, and in mobile applications. This changes AI economics: lower compute costs, reduced latency, and better user privacy.