Vector search without brute force: how IVF and HNSW work

Q: Источник материала?

Оригинальная публикация на Habr AI. Hamidun News обрабатывает и адаптирует материалы с помощью AI.

Q: Когда опубликовано?

2026-06-11. Время чтения: 3 мин.

Exact search in high dimensions freezes applications. Two algorithms save the day: IVF partitions space into clusters (k-means), HNSW builds a neighborhood grap

Hamidun News Editorial

AI monitoring · Habr AI

2026-06-11· 3 min

◐ Listen to article

Vector Search Without Brute Force: How IVF and HNSW Work

When a vector database contains millions of embeddings, searching for similar vectors using brute force becomes impossible. Two algorithms — IVF and HNSW — solve this task through approximate search, finding answers in milliseconds instead of minutes. A new detailed article on Habr breaks down both approaches from mathematical theory to Python implementation.

Why Exact Search Doesn't Work

In low dimensions (2D, 3D), you can build spatial indexes — k-d trees, R-trees. They use geometry: if a query is far from one part of space, it can be excluded from the search without losing results. But in high dimensions (768D for BERT, 3072D for GPT-4 Turbo), this idea breaks down. Distances become non-local — almost all points are at almost the same distance from each other. This phenomenon is called the curse of dimensionality. Traditional indexes lose efficiency, stop shrinking the search space, and we return to brute force — comparing the query with all millions of vectors in the database.

IVF: Clustering Comes to the Rescue

IVF (Inverted File Index) solves the task through preliminary clustering. The algorithm uses classical k-means to divide space into conditional "bucket"-clusters. The intuition is simple: if vectors are close in space, they should be in the same cluster, and during search we only need to check a few relevant clusters, not the entire database. The process works in two stages. During indexing, each vector in the database is assigned to its nearest cluster (its centroid). During search, the system first finds K nearest clusters to the query through fast search, then searches for exact matches only within them. This dramatically reduces the number of comparisons — instead of millions of comparisons, we need only a few thousand.

First stage: k-means divides space into hundreds or thousands of clusters
Second stage: each database vector is indexed into its cluster (centroid ID is saved)
During search: find K nearest clusters, then full search only within them

IVF scales well to billions of vectors and requires relatively little memory. Habr walks through the implementation in Python with concrete code examples and performance metrics.

HNSW: Hierarchical Navigable Small World

HNSW (Hierarchical Navigable Small World) uses a completely different approach — building a neighbor graph instead of clusters. Each vector becomes a node in the graph, edges connect neighboring (close) vectors. The graph is built hierarchically: at lower levels there is a dense local neighborhood of each vector, at upper levels there are sparse long-distance connections for fast global approximate descent. During search, the algorithm starts from an entry point at the upper level and descends through the hierarchy, selecting at each step neighbors that are closer to the target query. This is like navigating a large city: first you travel on highways, then switch to local streets until you find the address you need.

"HNSW works like a navigator in an unfamiliar city: you start on main highways, then descend to local streets," explain the authors of the article on

Habr.

In practice, HNSW is usually faster than IVF for low-latency search, but requires more memory to store the graph structure and connections between nodes.

What This Means

Both algorithms are used in modern vector databases: Pinecone, Weaviate, Qdrant, Milvus. For LLM applications — RAG systems, document search, semantic recommendations, intelligent chatbots — the choice of search algorithm directly affects response speed and user experience. IVF is better to choose for huge databases with limited memory, HNSW provides an advantage when minimum latency is required and when working with more compact datasets.

Hamidun News

AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.

Telegram channel RSS hamidun.com

Хотите не читать про ИИ, а внедрить его?

«AI News» — это полезные новости из мира ИИ. Системно научиться работать с нейросетями и применять их в работе — в Hamidun Academy.

🎓 Academy — 7 дней бесплатно Бесплатная консультация