
OpenAI Claims ChatGPT Learned to Count Letters, But Still Fails on Simple Words


AI-processed from 3DNews AI; edited by Hamidun News

OpenAI announced that ChatGPT can finally answer correctly how many times the letter R appears in the word strawberry, one of the best-known common-sense tests for AI chatbots. Almost immediately, however, users showed that on nearly identical tasks the model still makes mistakes, and with the same confidence as before.

What exactly was fixed

The strawberry question has long been a meme about large language models. For a human the task is trivial: just count the letters. Yet for a long time ChatGPT regularly got it wrong and could insist that the word does not contain three Rs.

A similar story surrounded another popular prompt: "I want to wash my car today, but the car wash is only 50 meters away. Should I walk to get there?" Instead of noticing the absurdity in the phrasing (the car itself has to get to the car wash), the bot often advised going on foot.

On April 28, 2026, OpenAI wrote on X that both cases had finally been fixed, presenting this as a small but symbolic victory over the old ChatGPT memes. The message was simple: the model now handles the elementary logic and letter-level analysis it used to stumble on. But the effect of the announcement faded quickly, because users immediately began testing neighboring phrasings to see how far the fix extends.

"Finally"

Where the bot breaks

The most telling example is the word cranberry. Asked how many times the letter R appears in it, ChatGPT, as users noticed, still often answers that it appears once. This is incorrect: there are three (c-r-a-n-b-e-r-r-y). In other words, the model can pass one viral test and fail a nearly identical one a minute later. This is why many suspect that OpenAI did not solve the root problem but simply closed off a few of the most obvious scenarios.
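The ground truth for such questions is easy to check mechanically. Below is a minimal sketch in Python; the helper name count_letter is an illustrative assumption and does not come from OpenAI or from the original report.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# Both viral test words contain the letter "r" three times.
for word in ("strawberry", "cranberry"):
    print(word, count_letter(word, "r"))
```

A deterministic check like this gives the same result for any word, which is exactly the property users were probing for in the chatbot.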

  • In the word strawberry, the model now counts the letters correctly more often
  • The prompt about a car wash 50 meters away is also handled more logically
  • In the word cranberry, the bot can still confidently give an incorrect letter count
  • After being corrected, the model may not acknowledge the error right away and may keep arguing instead

This gave rise to the theory of targeted, possibly hardcoded patches. If the model had learned a general rule, it would apply it equally to similar words and tasks rather than just pass viral tests known in advance. Seen this way, a single correct answer proves almost nothing: what matters is not that a particular meme was fixed but whether the logic transfers to similar cases without manual tuning.
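The transfer argument can be sketched as a small check over neighboring words. Everything here is hypothetical: patched_model is a toy stand-in that memorizes one viral answer and is not a claim about how ChatGPT works, and check_transfer is an illustrative helper, not a real API.

```python
from typing import Callable, Dict, Iterable


def count_letter(word: str, letter: str) -> int:
    """Ground truth: case-insensitive count of a letter in a word."""
    return word.lower().count(letter.lower())


def check_transfer(
    ask_model: Callable[[str, str], int],
    words: Iterable[str],
    letter: str = "r",
) -> Dict[str, bool]:
    """Return, for each word, whether the model's count matches the true count."""
    return {w: ask_model(w, letter) == count_letter(w, letter) for w in words}


def patched_model(word: str, letter: str) -> int:
    """Hypothetical model: one memorized answer, a fixed guess of 1 otherwise."""
    memorized = {("strawberry", "r"): 3}
    return memorized.get((word, letter), 1)


results = check_transfer(
    patched_model, ["strawberry", "cranberry", "raspberry", "blueberry"]
)
for word, ok in results.items():
    print(f"{word}: {'pass' if ok else 'fail'}")
```

In this toy setup only strawberry passes, which is the pattern one would expect from a narrow patch rather than a learned general rule.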

Why this matters

The story is funny only on the surface. A letter-counting error is harmless in itself, but it clearly illustrates a more troubling trait of modern AI systems: they can state falsehoods confidently, with no internal alarm signal. The model doesn't say "I'm not sure" and doesn't always notice a contradiction even after a follow-up. As a result, the user gets not merely an inaccurate answer but one that looks convincing.

From a technical point of view, this is not so surprising. Large language models are very good at predicting a plausible continuation of text, but nothing guarantees that they perform symbolic operations or strict logical checks reliably in every similar case. A failure at the level of a single letter is therefore not an isolated curiosity but a symptom of a more general limitation. The same pattern can show up in document summaries, advice, reports, number comparisons, and any task that calls for precise internal verification of the result rather than fluent prose.

What this means

For OpenAI, this story is a reminder that users now judge the model not by polished announcements but by its stability on neighboring examples. If ChatGPT was fixed only for a couple of meme prompts, that does little to build trust. For the average user, the conclusion is simple: even when the bot confidently handles a known trap, its answers to simple letter, number, and logic tasks still need to be double-checked.
