Latest publications

How Token Selection Works in Neural Networks: Logits, Temperature, and Top-p
Understanding the mathematics of LLM text generation: how logits, temperature, and top-p affect the balance between accuracy and creativity in responses.

Context-pruning for long-lived LLM agents: a memory management technique
Agents based on large language models need a new approach to memory management during long sessions. Context pruning removes unnecessary information and saves tokens.

Hybrid Search in RAG: When Semantics Meet Keywords
Hybrid search combines semantic and lexical algorithms—critical for production-ready RAG systems.

Multi-agent Research Assistant in Python with OpenAI SDK
OpenAI introduced the Agents SDK, a framework for building systems of multiple agents that work together to search and analyze information. This opens new possibilities for automating research.

Machine Learning Mastery: Semantic Search with Embeddings Instead of Keywords
Keyword search fails when documents don't contain the exact words users are searching for. Machine Learning Mastery shows how to solve this with LLM embeddings and metadata.

How to choose an AI agent architecture: a decision tree from Machine Learning Mastery
Machine Learning Mastery has published a guide with a decision tree for choosing the optimal AI agent design pattern. The choice depends on the task type, scalability requirements, and the nature of interactions with ext

Machine Learning Mastery explained how to build ML systems without servers and large datasets
Machine Learning Mastery released a practical guide to ML under limited hardware, poor internet, and messy data, with an emphasis on simple models and straightforward solutions.

Machine Learning Mastery explained how vector databases work from simple to complex
Machine Learning Mastery released a detailed guide to vector databases: from embeddings and similarity search to HNSW, IVF, PQ, and the trade-offs between accuracy, memory, and latency.

LlamaCloud added LlamaAgents Builder for building and deploying AI agents in minutes
LlamaCloud now includes LlamaAgents Builder, a beta service that builds a document-processing agent from a text description, deploys it via GitHub, and lets users test it in the interface.

Machine Learning Mastery highlighted 7 itertools functions for feature engineering in Python
Machine Learning Mastery published a practical breakdown of seven Python itertools functions that help build interaction, lag, polynomial, and cumulative features faster without bulky loops.

Machine Learning Mastery identified 7 ML trends that will shape 2026
Machine Learning Mastery highlighted seven machine learning trends for 2026: agentic systems, generative AI as infrastructure, small models, edge computing, and the growing role of MLOps.

Machine Learning Mastery showed how Python decorators make ML services more reliable
Machine Learning Mastery broke down five Python decorators for production ML: they help withstand API failures, validate inputs, save compute resources, and improve service observability.

Machine Learning Mastery explained how to avoid race conditions in multi-agent systems
Machine Learning Mastery published an analysis of race conditions in multi-agent systems: why agents corrupt shared state without errors and which patterns reduce the risk.

Google’s Gemma 4: how to run tool calling locally with Python and Ollama
Machine Learning Mastery showed how to turn Gemma 4 into a local agent with tool calling: using Ollama and Python, the model calls functions, gets data from APIs, and responds without the cloud.

Machine Learning Mastery explained how to build long-context RAG without extra tokens
Machine Learning Mastery broke down five techniques for long-context RAG: reranking, caching, hybrid search, metadata, and query expansion to reduce noise, cost, and latency.

Machine Learning Mastery showed how to run zero-shot text classification without a dataset
Machine Learning Mastery released a practical breakdown of zero-shot text classification: how to define categories, use BART, and get labels without training on your own dataset.

Why memory has become a key element of AI agents: a breakdown across three levels of complexity
A new breakdown of memory in AI agents makes one point clear: without preserved context, a model responds in isolation, while useful agent systems are built on memory of the dialogue, tasks, and past sessions.

Machine Learning Mastery identified five major barriers to scaling agentic AI in 2026
Machine Learning Mastery compiled five problems that keep agentic AI from moving beyond impressive demos into stable production, ranging from orchestration to security and cost control.

Machine Learning Mastery: why one vector store is not enough for AI applications
Machine Learning Mastery explains why production AI cannot live on a vector store alone: a SQL layer is also needed for access control, billing, metadata, and application state.

Machine Learning Mastery showed how to build AI agents in Python with Pydantic AI
Machine Learning Mastery released a practical guide on Pydantic AI: how to get structured responses, connect tools, implement dependencies, and build more reliable AI agents in Python.

Machine Learning Mastery released a guide on context engineering for reliable AI agents
Machine Learning Mastery showed why AI agents fail more often because of poor context management than because of the model, and how to fix it with token budgets, history summarization, and precise retrieval.

OpenAI, Anthropic, and Gemini: How Inference Caching Reduces LLM Cost and Latency
Inference caching lets LLMs avoid recomputing identical portions of a prompt, which reduces token costs and speeds up responses; prefix caching is becoming the primary lever in production.

Scikit-LLM Shows How to Embed Text Summarization Into a scikit-learn ML Pipeline
Scikit-LLM has proposed a scheme where long texts are first briefly summarized by a Hugging Face model, then immediately fed into a scikit-learn pipeline for classification.

Five security patterns without which agentic AI is doomed to fail
Autonomous AI agents are increasingly making decisions without human involvement. But the more freedom a system has, the higher the cost of a mistake. We examine which security architecture patterns are becoming the indu

Comparing LLM Embeddings, TF-IDF, and Bag-of-Words in Scikit-learn
We examine which text vectorization method—from classic TF-IDF to modern embeddings—is best suited for machine learning algorithms in Scikit-learn.

Vector Magic: 7 Ways to Maximize LLM Embeddings
The artificial intelligence industry right now resembles a person who bought a Ferrari only to drive it to the neighboring store for bread.

LLM 2026: What to Read Today So You Don't Wake Up a Dinosaur Tomorrow
The artificial intelligence industry moves faster than most of us manage to finish our morning coffee.

Agentic AI: Seven Reasons Why Your Autonomous Assistant Could Go Insane
The artificial intelligence industry is undergoing an important transition from passive language models to active agents.

LLM Applications: Three Horsemen of the Apocalypse for Your Startup
Let's be honest: today any student with access to OpenAI's API can build a "revolutionary" AI assistant in one evening.

Andrew Ng's Course Is Complete: Where to Go to Avoid Staying a Junior Forever
You've finished the final week of Andrew Ng's Coursera course, earned your coveted digital certificate, and now feel like a master of weights and biases.