Cognitive memory for an AI agent: how SQLite replaced vector databases
AI-processed from Habr AI; edited by Hamidun News
One of the main unresolved problems of modern AI agents sounds deceptively simple: how to teach them to remember what's important and forget what's obsolete. A developer who published a detailed technical analysis on Habr proposed a solution that goes against the industry mainstream. Instead of popular vector databases, he built a full-fledged cognitive memory on top of a single SQLite file.
The problem he describes is familiar to anyone who has tried to build a long-lived AI agent. The standard recipe looks like this: you take text, chop it into chunks, convert the chunks into vector embeddings, store them in Pinecone or Chroma, and at query time retrieve the nearest chunks by cosine distance. On short timescales this works. But once the agent lives longer, chaos begins: the context window gets cluttered with irrelevant fragments, contradictory facts from different periods coexist with nothing to reconcile them, and there is no forgetting mechanism at all. The agent remembers everything equally well, which in practice means it remembers everything equally poorly.
The proposed architecture borrows principles from cognitive psychology and neuroscience. Memory is organized as a graph with two types of nodes: episodic nodes, which store specific events and interactions, and semantic nodes, which hold generalized knowledge and facts. Typed edges connect the nodes and record the nature of each relationship. Named entities are kept in a separate layer, which lets the agent track mentions of specific people, organizations, and concepts and link scattered fragments of information into a unified picture.
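The graph described above maps naturally onto a handful of SQLite tables. The following is a minimal sketch, not the author's schema: every table, column, and function name here (`nodes`, `edges`, `entities`, `mentions`, `open_memory`, `add_node`, `nodes_mentioning`) is an assumption made for illustration.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the original project.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (
    id      INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL CHECK (kind IN ('episodic', 'semantic')),
    content TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS edges (
    src      INTEGER NOT NULL REFERENCES nodes(id),
    dst      INTEGER NOT NULL REFERENCES nodes(id),
    relation TEXT NOT NULL,  -- the edge "type", e.g. 'supports'
    PRIMARY KEY (src, dst, relation)
);
CREATE TABLE IF NOT EXISTS entities (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS mentions (
    node_id   INTEGER NOT NULL REFERENCES nodes(id),
    entity_id INTEGER NOT NULL REFERENCES entities(id),
    PRIMARY KEY (node_id, entity_id)
);
"""


def open_memory(path=":memory:"):
    """Open (or create) the single-file memory store."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_node(conn, kind, content, entity_names=()):
    """Insert a node and link it to the named entities it mentions."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO nodes (kind, content) VALUES (?, ?)", (kind, content)
        )
        node_id = cur.lastrowid
        for name in entity_names:
            conn.execute(
                "INSERT OR IGNORE INTO entities (name) VALUES (?)", (name,)
            )
            entity_id = conn.execute(
                "SELECT id FROM entities WHERE name = ?", (name,)
            ).fetchone()[0]
            conn.execute(
                "INSERT OR IGNORE INTO mentions (node_id, entity_id) VALUES (?, ?)",
                (node_id, entity_id),
            )
    return node_id


def nodes_mentioning(conn, name):
    """Return the ids of all nodes that mention the given entity."""
    rows = conn.execute(
        "SELECT n.id FROM nodes n "
        "JOIN mentions m ON m.node_id = n.id "
        "JOIN entities e ON e.id = m.entity_id "
        "WHERE e.name = ? ORDER BY n.id",
        (name,),
    )
    return [row[0] for row in rows]
```

Keeping entity mentions in their own table is what allows an episodic record and a later semantic fact about the same person to be found together with one join.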
The search system deserves special attention. Rather than relying on a single retrieval method, the developer implemented a hybrid approach that combines three mechanisms: full-text search via SQLite FTS5 for exact matches and keywords, vector search for semantic proximity, and graph traversal for retrieving associated context. The three result lists are merged using Reciprocal Rank Fusion, an algorithm that fuses ranked lists from different sources using only each item's position in each list, so the absolute scores never need to be calibrated against one another. This lets each method compensate for the weaknesses of the others.
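Reciprocal Rank Fusion itself fits in a few lines. The sketch below assumes each search strategy returns a list of node ids ordered best-first; the function name and the example ids are hypothetical, and the constant `k=60` is the value commonly used in the RRF literature, not something the article specifies.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several best-first rankings into one.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Only positions matter, never raw scores.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], str(item)))


# Hypothetical result lists from the three strategies.
fts_hits = ["n3", "n1", "n7"]    # full-text search
vector_hits = ["n1", "n4", "n3"]  # embedding similarity
graph_hits = ["n1", "n9"]         # graph traversal

fused = reciprocal_rank_fusion([fts_hits, vector_hits, graph_hits])
```

Here `n1` ranks first in the fused list because it appears near the top of all three rankings, even though full-text search alone placed it second.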
But the most interesting part of the architecture concerns forgetting rather than remembering. The developer implemented Ebbinghaus's forgetting curve, a classical model from 19th-century experimental psychology that describes exponential memory decay over time. Each graph node carries a "memory strength" value that gradually decreases. Information that is accessed repeatedly is reinforced, while rarely requested fragments naturally recede into the background. This is fundamentally different from the approach of most systems, where data either exists or it doesn't.
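One common way to write the forgetting curve is R = exp(-t / S), where t is the time since the last access and S is a stability parameter. The sketch below uses that form; the class name, the day-based time unit, and the rule that each access multiplies stability by a fixed `boost` are assumptions for illustration, since the article does not give the exact formula or reinforcement rule.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryTrace:
    """Decay state for one graph node (hypothetical structure)."""

    stability_days: float = 1.0   # S: larger means slower forgetting
    last_access_day: float = 0.0  # time of the most recent access

    def strength(self, now_day):
        """Memory strength R = exp(-t / S), between 0 and 1."""
        elapsed = max(0.0, now_day - self.last_access_day)
        return math.exp(-elapsed / self.stability_days)

    def reinforce(self, now_day, boost=2.0):
        """Record an access: reset the clock and slow future decay.

        The multiplicative boost is an assumed rule, not the author's.
        """
        self.stability_days *= boost
        self.last_access_day = now_day


trace = MemoryTrace()
before = trace.strength(now_day=1.0)  # one day without access
trace.reinforce(now_day=1.0)          # the node is retrieved
after = trace.strength(now_day=2.0)   # one day after reinforcement
```

After reinforcement the same one-day gap costs less strength than before, which is how frequently used memories stay retrievable while untouched ones fade.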
A background LLM consolidation mechanism completes the picture. Much as the human brain processes and summarizes information during sleep, the agent periodically runs a language model over its accumulated episodic memories. The model identifies patterns, resolves contradictions, and creates new semantic nodes, turning disparate episodes into structured knowledge. In effect, this is automatic generation of "wisdom from experience."
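A consolidation pass of this kind could look roughly like the sketch below. It is an illustration under stated assumptions: the schema, the `consolidated` flag, the `derived_from` relation, and the `min_episodes` threshold are all hypothetical, and `summarize` is a caller-supplied function standing in for the LLM call rather than any real model API.

```python
import sqlite3

# Hypothetical minimal schema for the demonstration.
DEMO_SCHEMA = """
CREATE TABLE nodes (
    id           INTEGER PRIMARY KEY,
    kind         TEXT NOT NULL,
    content      TEXT NOT NULL,
    consolidated INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE edges (
    src      INTEGER NOT NULL,
    dst      INTEGER NOT NULL,
    relation TEXT NOT NULL
);
"""


def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(DEMO_SCHEMA)
    return conn


def consolidate(conn, summarize, min_episodes=3):
    """Fold unconsolidated episodes into one new semantic node.

    `summarize` takes a list of episode texts and returns one string;
    in a real agent it would wrap a language-model call.
    Returns the new node id, or None if there is too little to process.
    """
    rows = conn.execute(
        "SELECT id, content FROM nodes "
        "WHERE kind = 'episodic' AND consolidated = 0 ORDER BY id"
    ).fetchall()
    if len(rows) < min_episodes:
        return None
    summary = summarize([content for _, content in rows])
    with conn:  # one transaction: all writes succeed or none do
        cur = conn.execute(
            "INSERT INTO nodes (kind, content) VALUES ('semantic', ?)",
            (summary,),
        )
        semantic_id = cur.lastrowid
        for episode_id, _ in rows:
            conn.execute(
                "INSERT INTO edges (src, dst, relation) "
                "VALUES (?, ?, 'derived_from')",
                (semantic_id, episode_id),
            )
            conn.execute(
                "UPDATE nodes SET consolidated = 1 WHERE id = ?",
                (episode_id,),
            )
    return semantic_id
```

Linking the new semantic node back to its source episodes keeps the derived knowledge traceable, so a later pass can revisit it if the underlying episodes turn out to conflict.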
The solution is also notably pragmatic from an engineering standpoint. The entire system runs on a single SQLite file, with no external services, no Docker containers running vector databases, and no cloud storage subscriptions. For a locally running agent, this means minimal dependencies, simple deployment, and full control over data. SQLite, despite its reputation as "a database for small projects," has long proven its ability to handle serious workloads, and the FTS5 extension adds built-in full-text search with relevance ranking.
This project fits into a growing trend toward more "human-like" memory for AI agents. Large laboratories such as Google DeepMind and OpenAI are actively researching long-term memory mechanisms, but their solutions are typically tied to proprietary platforms. An open architecture that can be deployed locally on any machine democratizes access to these capabilities. If the approach proves viable at scale, it could shift how autonomous agents' memory is designed, from indiscriminate data accumulation to deliberate knowledge management.