Telegram Bot with RAG Without Vector Databases: Example on Cloudflare Workers
How do you build a Telegram bot with knowledge base search capabilities without vector databases and expensive infrastructure? A developer detailed the…
AI-processed from Habr AI; edited by Hamidun News
Building a Telegram bot that can search a knowledge base is a common task. The usual approach involves embeddings, a vector database such as Pinecone or Weaviate, and paid cloud infrastructure. A developer on Habr showed that none of this is strictly necessary: there is a cheaper and simpler way.
Why Vector Databases Aren't the Only Option
RAG (Retrieval-Augmented Generation) doesn't necessarily require vector embeddings. For a small or medium-sized knowledge base, keyword-based text search can be sufficient. The Jaccard index measures the similarity of two texts as the number of words they share divided by the number of distinct words across both. It is simple, fast, and requires no machine learning.
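For reference, the word-overlap measure mentioned above is usually written as the Jaccard index of two word sets. The formula is the standard definition; the sample document in the worked example is made up for illustration and is not taken from the original article.

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad 0 \le J(A, B) \le 1

% Worked example (hypothetical document):
% A = \{\text{how, to, restart, the, device}\}
% B = \{\text{to, restart, the, device, hold, power, button}\}
% |A \cap B| = 4, \quad |A \cup B| = 5 + 7 - 4 = 8, \quad J(A, B) = 4/8 = 0.5
```

A score of 1 means the two texts use exactly the same set of words, and 0 means they share none.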
Here's how it works in practice: if the database contains a customer support FAQ and a user writes "how to restart the device," the bot splits the query into words, looks for matching words in each document, takes the documents with the highest overlap, and passes them to an LLM. If the content is well structured, the author suggests the results can match or even beat embedding-based search. It is also faster, because there is no wait for an embedding API call on each query.
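The retrieval step described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the author's code: the function names (`tokenize`, `jaccard`, `topMatches`), the ASCII-only tokenizer, and the sample FAQ entries are all assumptions made for the example.

```typescript
interface Doc {
  id: string;
  text: string;
}

interface Scored {
  doc: Doc;
  score: number;
}

// Split text into a set of lowercase word tokens.
// Assumption: ASCII letters and digits only; a non-English knowledge base
// would need a Unicode-aware pattern instead.
function tokenize(text: string): Set<string> {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(words);
}

// Jaccard index: shared words divided by distinct words across both sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  for (const word of a) {
    if (b.has(word)) intersection++;
  }
  return intersection / (a.size + b.size - intersection);
}

// Score every document against the query and keep the k best non-zero matches.
function topMatches(query: string, docs: Doc[], k = 3): Scored[] {
  const queryWords = tokenize(query);
  return docs
    .map((doc) => ({ doc, score: jaccard(queryWords, tokenize(doc.text)) }))
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Hypothetical FAQ entries for illustration.
const faq: Doc[] = [
  { id: "restart", text: "To restart the device, hold the power button for ten seconds." },
  { id: "warranty", text: "The warranty covers manufacturing defects for two years." },
  { id: "battery", text: "Charge the battery fully before first use." },
];

const hits = topMatches("how to restart the device", faq, 2);
console.log(hits.map((h) => `${h.doc.id}: ${h.score.toFixed(3)}`));
```

In this sample the "restart" entry ranks first, but the unrelated "battery" entry also gets a small non-zero score because it shares the word "the" with the query. Filtering common stop words before scoring is a typical refinement if that kind of noise becomes a problem.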
The main benefit is that the only external dependency is a single LLM API for answer generation. Conversation history and the knowledge base itself are stored in Cloudflare KV, a key-value store available on the free plan. There is no separate infrastructure to deploy and no monthly bill for vector storage.
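To make the conversation-history part of that storage concrete, here is a minimal sketch of keeping a short, trimmed history per chat in a KV-style store. The `history:<chatId>` key scheme, the ten-message cap, and the one-day expiry are assumptions for illustration. `KVLike` is a locally declared subset of the Workers KV binding (`get` and `put`), so the snippet compiles without Cloudflare's type package.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Subset of the Workers KV binding used here, declared locally for the sketch.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Append a message and keep only the most recent `maxMessages` entries,
// so the stored value and the prompt both stay small.
function appendMessage(
  history: ChatMessage[],
  message: ChatMessage,
  maxMessages = 10,
): ChatMessage[] {
  return [...history, message].slice(-maxMessages);
}

// Read the stored history for a chat, or an empty list if none exists yet.
async function loadHistory(kv: KVLike, chatId: number): Promise<ChatMessage[]> {
  const raw = await kv.get(`history:${chatId}`);
  return raw ? (JSON.parse(raw) as ChatMessage[]) : [];
}

// Write the history back as JSON.
// The one-day expiry for idle conversations is an arbitrary choice for this sketch.
async function saveHistory(kv: KVLike, chatId: number, history: ChatMessage[]): Promise<void> {
  await kv.put(`history:${chatId}`, JSON.stringify(history), { expirationTtl: 86400 });
}
```

Workers KV is eventually consistent, so a write may take a short while to become visible in other locations. For a single user chatting at human speed this is usually acceptable, but it is worth keeping in mind.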
How the Architecture Works
The workflow operates as follows: a user writes a question in Telegram → the bot scores the documents in the knowledge base with the Jaccard index → it takes the top 3 results and passes them along with the question to the Groq API (a hosted LLM inference service with a free tier) → the model generates an answer based on the retrieved documents → the bot sends the result to the chat. Conversation history is saved in KV, which lets the bot take previous questions into account and refine its answers across messages.
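The generation step of this pipeline might look roughly like the sketch below. It assumes Groq's OpenAI-compatible chat completions endpoint. The model name, the prompt wording, and the helper names (`buildMessages`, `askGroq`, `FetchLike`) are illustrative assumptions rather than details from the original article. The fetch function is passed in explicitly so the code does not depend on any particular runtime's type definitions.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Minimal fetch-like signature, declared locally for the sketch.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assemble the prompt: retrieved documents as system context,
// then earlier turns of the conversation, then the new question.
function buildMessages(
  question: string,
  docs: string[],
  history: ChatMessage[],
): ChatMessage[] {
  const context = docs.map((text, i) => `[${i + 1}] ${text}`).join("\n");
  const system: ChatMessage = {
    role: "system",
    content:
      "Answer using only the documents below. " +
      "If they do not contain the answer, say so.\n\n" + context,
  };
  return [system, ...history, { role: "user", content: question }];
}

// Send the messages to Groq and return the generated answer text.
// The default model name is an assumption; substitute one available to your account.
async function askGroq(
  fetchFn: FetchLike,
  apiKey: string,
  messages: ChatMessage[],
  model = "llama-3.1-8b-instant",
): Promise<string> {
  const res = await fetchFn("https://api.groq.com/openai/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  });
  if (!res.ok) throw new Error(`Groq request failed with status ${res.status}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}
```

Inside a Worker, the call would be along the lines of `askGroq(fetch, apiKey, buildMessages(question, topDocs, history))`, with the API key read from a Worker secret.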
Groq was chosen deliberately: it is a fast LLM inference service with generous free-tier limits, a good fit for chatbots and RAG systems where low response latency matters. The knowledge base itself is stored as a collection of documents in KV: the key is the document ID, the value is the text.
When a query arrives, the bot loads all documents, computes the Jaccard score of each against the query, and ranks them by that score. According to the author, this brute-force scan handles knowledge bases of up to thousands of documents without issues.
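Loading the whole knowledge base on each query could be sketched as follows, assuming one KV key per document. The `doc:` key prefix and the `loadDocuments` name are hypothetical. `KVLike` is a locally declared subset of the Workers KV binding (`get` plus paginated `list`), so the snippet stands on its own.

```typescript
interface Doc {
  id: string;
  text: string;
}

// Subset of the Workers KV binding used here, declared locally for the sketch.
interface KVLike {
  get(key: string): Promise<string | null>;
  list(options?: { prefix?: string; cursor?: string }): Promise<{
    keys: { name: string }[];
    list_complete: boolean;
    cursor?: string;
  }>;
}

// Load every document stored under a key prefix, following the list cursor
// until the listing is complete. The "doc:" prefix is a hypothetical naming scheme.
async function loadDocuments(kv: KVLike, prefix = "doc:"): Promise<Doc[]> {
  const docs: Doc[] = [];
  let cursor: string | undefined;
  do {
    const page = await kv.list({ prefix, cursor });
    // Fetch the values for this page of keys in parallel.
    const texts = await Promise.all(page.keys.map((k) => kv.get(k.name)));
    page.keys.forEach((k, i) => {
      const text = texts[i];
      // Skip keys whose value disappeared between list and get.
      if (typeof text === "string") {
        docs.push({ id: k.name.slice(prefix.length), text });
      }
    });
    cursor = page.list_complete ? undefined : page.cursor;
  } while (cursor);
  return docs;
}
```

With this layout every query costs one KV read per document, and those reads count against the plan's KV limits. If that becomes a bottleneck, one possible alternative is to store the whole knowledge base as a single JSON value and read it with one `get`.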
What Stack Do You Need
To implement this, you need a minimal set of components:
- TypeScript: the language used for all code, with static typing
- Telegraf: a popular library for working with the Telegram Bot API
- Cloudflare Workers: a serverless platform for deployment (free plan with generous limits)
- Cloudflare KV: built-in key-value storage for the knowledge base and conversation history
- Groq API: an LLM inference service with a free tier, used to generate answers from the retrieved documents
Deployment takes a single command via Wrangler, the CLI for Cloudflare Workers. There is no server to configure, no hosting of your own, and no Docker containers. If the bot stays within the free Workers plan limits (100,000 requests per day), the cost is zero. For comparison, the article estimates that a typical setup with a vector database costs at least $20-50 per month for storage alone.
What This Means for Developers
This opens the door to simple RAG systems that previously seemed too expensive or complex to build. For small teams, startups, and enthusiasts, it is a way to quickly launch an AI bot with knowledge search at virtually no cost. A classic example is an internal office bot that answers questions based on FAQs, corporate policies, technical guides, and documentation. Previously, such a project meant allocating an infrastructure budget and maintaining a more complex system. Now all of it can live in a single KV namespace and, by the author's account, be up and running in about an hour.