The Russian cultural code as a test for neural networks: Shurik, panelki, and Nano Banana

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

May 2, 2026. Reading time: 3 min.


Hamidun News Editorial


The Russian cultural code as a test for neural networks: Shurik, panelki, and Nano Banana — Source: Habr AI. Collage: Hamidun News.


A developer tested several image-generation neural networks on how well they understand the Russian cultural code: Soviet panel buildings, Shurik, and Dr. Livesey. The test was not academic but done by eye: you look at a picture and immediately see whether the model got the vibe.

The idea: vibe instead of metrics

It all started with Nano Banana. The author asked it to draw a surreal scene against the backdrop of Soviet panel buildings — and the model didn't just depict the buildings, it accurately conveyed the atmosphere. That became the reason for a mini-benchmark: not thousands of prompts, not FID scores, not academic tables. Just a set of visually recognizable images — and a comparison of results in real time.

The Russian cultural code is hard to grasp from the outside. Panel buildings are not just a type of housing; they carry an entire visual narrative: Soviet-era space, courtyards, faded benches, the smell of summer. Shurik is not just a student in glasses but an archetype of Soviet comedy with a particular energy. Dr. Livesey is a meme about walking into any room as if you were its main character. If the model doesn't "know" these images from the inside, the pictures will be technically correct, but the feeling will be off.

Visual benchmarks of this kind are still rare — most tests focus on text, logic and facts. But for models that draw, understanding visual culture is more important than spelling "panelka" correctly.

Prompts from life

For the benchmark, the author used several culturally loaded scenes:

Soviet panel houses: courtyard, benches, garages, summer
Shurik from "Operation Y": a bespectacled student grabbing shawarma "on the go"
Dr. Livesey in his signature walk from the viral meme
Post-Soviet courtyard aesthetics in general

None of the prompts explains the context in detail, and that is exactly what the benchmark tests: how much of the cultural layer the model has absorbed, rather than whether it merely knows the words. A good test is one without hints.
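A minimal sketch of how such a prompt set could be laid out so that every model receives exactly the same hint-free text. The prompt wording is paraphrased from the list above; the second model label, the `Job` class, and `build_jobs` are illustrative assumptions, not the author's actual setup, and the image-generation call itself is deliberately left out.

```python
from dataclasses import dataclass
from itertools import product

# Prompts paraphrased from the list above: short, with no explanatory context.
PROMPTS = {
    "panelki": "Soviet panel houses: courtyard, benches, garages, summer",
    "shurik": "Shurik from Operation Y, a student in glasses, getting shawarma on the go",
    "livesey": "Dr. Livesey in his signature walk",
    "courtyard": "post-Soviet courtyard aesthetics",
}

# Hypothetical labels; the article names only Nano Banana.
MODELS = ["nano-banana", "model-b"]


@dataclass(frozen=True)
class Job:
    """One image to generate: a single prompt sent to a single model."""
    model: str
    prompt_id: str
    prompt: str


def build_jobs(models, prompts):
    """Return one Job per (model, prompt) pair, so all models see identical text."""
    return [Job(m, pid, prompts[pid]) for m, pid in product(models, prompts)]


jobs = build_jobs(MODELS, PROMPTS)
for job in jobs:
    print(f"{job.model:12} {job.prompt_id:10} {job.prompt}")
```

Keeping the grid explicit makes it easy to lay the outputs side by side per prompt, which is what a by-eye comparison needs.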

Where models stumble

Western models, trained primarily on English-language content, reproduce "Soviet" through clichés: too bleak, too industrial, lifeless. Their Shurik is a generic Western student in glasses, without Soviet immediacy. Panel buildings come out looking like dystopia, not nostalgia.

The problem isn't in the quality of the drawing — it's that the model is looking at culture from the outside.

"It didn't just draw panel buildings, didn't just perfectly execute the prompt, it accurately conveyed the vibe and the entire atmosphere," writes the author about Nano Banana.

In this test, Nano Banana came closest to the view "from the inside": the model has been trained on a broad enough body of post-Soviet visual material to reproduce not just the form but the feeling. That is rare among commercial image-generation models.

Why this matters

Most benchmarks evaluate logic, factual knowledge, and language ability. Cultural accuracy remains a blind spot, especially for cultures outside the Latin-script world. Yet it is precisely what determines how useful a model will be for local tasks: design, content, education, marketing. "Folk" tests are a quick and honest way to see the gap that academic metrics don't catch.

If a model doesn't understand why Shurik gets shawarma specifically "on the go," it doesn't understand the culture — even if it writes in Russian without mistakes.

What this means

Cultural code is an underestimated parameter for evaluating neural networks. Understanding a language ≠ understanding a culture. For Russian-speaking users, this means a candidate model is worth checking not just against MMLU or HumanEval but also with "Shurik in shawarma", and then seeing what comes out.
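For readers who want to record what comes out of such a check, here is one possible way to tally by-eye verdicts per model. The yes/no "got the vibe" verdict format, the `vibe_rate` helper, and the model labels are assumptions made for illustration; the verdicts below are placeholders that show the input shape, not results from the article.

```python
from collections import defaultdict


def vibe_rate(verdicts):
    """Share of images per model that a human rater marked as 'got the vibe'.

    verdicts: iterable of (model, prompt_id, got_vibe) tuples, got_vibe being a bool.
    Returns a dict mapping each model to a fraction between 0.0 and 1.0.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for model, _prompt_id, got_vibe in verdicts:
        total[model] += 1
        hits[model] += int(got_vibe)
    return {model: hits[model] / total[model] for model in total}


# Placeholder verdicts (hypothetical): they only demonstrate the input format.
example = [
    ("model-a", "panelki", True),
    ("model-a", "shurik", True),
    ("model-b", "panelki", False),
    ("model-b", "shurik", True),
]
print(vibe_rate(example))
```

A single yes/no per image keeps the test as quick as the "folk" approach intends, while still leaving a number that can be compared between models.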

Hamidun News

AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
