Latest publications

ML Red Teaming for LLMs: From Hallucinations to Data Leaks — Testing in Practice
How to attack LLMs to find vulnerabilities before adversaries do: a practical breakdown of attack classes, testing methodologies, and defenses for enterprise AI.

Activation Steering: A tutorial on controlling a language model from within using PyTorch and nnsight
A Habr tutorial explains Activation Steering — how to control LLM behavior by directly intervening in neural network activations without retraining, using PyTorch, nnsight, and pyvene.

AI agents manage HR processes, but HRIS doesn't see who made the decision
Agents screen candidates and approve time off, but the system records only the outcome — the decision-maker and audit trail vanish.

MCP-Agents in Corporate Systems: How SimpleOne and Ainergy Integrated AI into Business Processes
SimpleOne and Ainergy integrated MCP-agents into their corporate platform — now AI doesn't just help with text, but creates tasks, checks statuses, and works directly with business processes.

Nine AI agents, one API quota: how Rate Governor prevents cascading failures
Standard retries and jitter don't work when multiple agents share a common quota — one 429 response turns into an avalanche of requests and crashes the entire system.

How Bitrix24 Built Evaluation and Automated Optimization for the Martha RAG Agent
Bitrix24 engineers shared their methodology for end-to-end RAG system evaluation: expert and synthetic datasets, the gap between retrieval metrics and real-world performance, and an automated optimization cycle.

AI Without Extremes: The Closed Loop of Generative Models and Cognitive Debt
Generative AI can degrade by training on its own texts, while users lose independent thinking skills — we examine real risks and non-obvious opportunities.

LLM Context Window: Why Neural Networks Forget Parts of Your Conversation
Every time you write in a chat with AI, the model rereads the entire conversation from scratch — it has no memory in the conventional sense. This is called a context window, and it has a hard limit.

Archspec investigate: How LLMs catch inter-service conflicts before code is written
Third part of the archspec series: the author tested whether Claude Sonnet 4.6 can catch inter-service conflicts at the planning stage when given machine-readable SERVICE_MAP.yaml contracts.

How a Lawyer Wrote Her First Code with AI and Automated Compliance
A lawyer got tired of waiting for IT and opened an IDE for the first time: in a few weeks with an AI assistant, she wrote a Python script that automates contract compliance checks against internal policies.

Blood and Sweat of AI: Millions of Hidden Workers Behind Every ChatGPT Query
Millions of low-wage annotators from Kenya, Pakistan, and India make ChatGPT possible — and their labor is deliberately not mentioned.

How to Build an AI Scheduler Solo: From Zero Budget to MWP
A developer shares how he built a working AI scheduler solo and with zero budget, going from idea to MVP and then on to MWP, a "minimally impressive product".

Vibe coders arrive on the marketplace: how LLMs stratified the freelance market by 2026
A freelancer whose income grew from 40k to 270k monthly shares how AI-equipped vibe coders transformed the marketplace: price wars, 7k-ruble gigs, and real income numbers.

Program Verification in the AI Era: Why Hallucinations Make Code Verification More Important
Researchers prove: AI accelerates code writing, but hallucinations make formal program verification critically important — especially for business and critical systems.

Local AI agent instead of sysadmin: autonomous server log analysis
A developer replaced monthly manual log review with a local AI agent that continuously monitors physical servers and alerts about failures before they become critical.

Cloud.ru HR's Notes: What ChatGPT Has Done to Hiring and Interviews
Marina Lomadze, Hiring Manager at Cloud.ru, explains how AI has transformed recruitment: why resumes have lost their meaning, how interviews have changed, and who companies are hiring now.

Anthropic launched Claude Mythos for cybersecurity — but first leaked its own drafts
Anthropic announced its cybersecurity AI, Claude Mythos, with 11 partners and $100M, but a month before launch it accidentally made 3000 internal files publicly accessible.

AlphaFold and AI Challenge Alzheimer's — After 20 Years Trapped by a Single Theory
Alzheimer's disease has remained untreatable for three decades — largely due to the monopoly of the 'amyloid hypothesis'. Now AI is finding new molecular targets and changing the entire logic of the search.

I Can't Code, But I'm Running 10 Telegram Bots: My Claude Code Vibecoding Story
An author with no programming skills deployed a dozen working Telegram bots on a VPS using Claude Code — and now they generate real income.

Siemens releases AI agent for TIA Portal that understands your project architecture
The new Siemens agent is built into TIA Portal and generates PLC code based on actual network topology and project structure — without manual adaptation or hallucinations.

Emergence AI launched 5 AI civilizations: Claude built a utopia, Grok died in 4 days
Emergence AI created five virtual cities managed by Claude, Gemini, Grok and GPT, and observed how the AI agents evolved over 15 days.

ChatGPT Marketing Strategy in 20 Minutes: Real Prompts and Error Breakdown
Habr broke down why ChatGPT outputs fluff instead of strategy — and showed a workflow with real prompts, a case study, and an honest list of where AI falls short.

Code agents: subscription vs API — pricing breakdown for custom harnesses
A Coddy Agent developer compared Claude Max, Cursor, Windsurf, and Copilot subscriptions against direct API access: which is more cost-effective, and which works for embedding in your own agent pipeline.

Neuro-punk: why developers must free AI from corporate control
A Habr essay calls on ML researchers and circuit designers to become "neuro-punks" and build AI that is independent of corporations and states while that is still possible.

Claude Fable 5 lasted three days: system prompt breach, degradation, and US directive
Anthropic released Claude Fable 5, but withdrew access three days later — after a system prompt leak, a response degradation scandal, and US government intervention.

MCP Server for Obsidian: How to Connect Your Personal Knowledge Base to Any LLM
A developer created the obsidian-agent MCP server, which connects an Obsidian vault to any LLM client and gives the language model direct access to personal notes without manual copying.

Why ChatGPT Forgets: Explaining the Context Window of Language Models
We explain what a context window is in language models and why ChatGPT begins to 'forget' task details after a long conversation — this is an architectural limitation, not a bug.

Anthropic on AI Agents in Cybersecurity: Capabilities and Pitfalls
Anthropic published research on the application of AI agents in cybersecurity — developer Edgar Sipki analyzed the document and asked difficult questions about their actual reliability.

Claude Code Launched Agent Team Mode: A Team of AI Agents Instead of One
The experimental Agent Team mode in Claude Code launches multiple independent agents in parallel: each picks up tasks from a shared list and communicates with the other agents.

An engineer dictated a diary to an AI agent for four months — and realized memory matters more than the model
A developer built a voice diary system using faster-whisper and Telegram, but the main discovery proved surprising: reliable memory is more important than language model power.