Marusia and Salute read out unwanted phrases through choices, names, and reminders
As the analysis showed, the voice assistants Marusia and Salute can be bypassed without an API or scripts. In Marusia, a choice-between-two-options scenario…
AI-processed from Habr AI; edited by Hamidun News
The household voice assistants Marusia and Salute can be made to say phrases that they would normally block. This requires no API access, programming skills, or automation: standard features such as choosing between two options, reminders, and saved facts are enough.
How the bypass works
The first scenario concerns Marusia. The author noticed that the assistant readily answers questions of the form "A or B?" by simply picking one of the offered options. According to the experiment description, the system does not assess the two answers together, as a single construct, for acceptability. If both options are unacceptable, the smart speaker still says one of them aloud, whereas a direct request for a similar phrase would likely be refused.
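The gap in the "A or B?" scenario can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `is_allowed`, `naive_choice`, `checked_choice`, and the placeholder blocklist are hypothetical and say nothing about how Marusia is actually implemented.

```python
import random

# Hypothetical placeholder blocklist; a real assistant would rely on a
# proper moderation model rather than substring matching.
BLOCKED_TERMS = {"forbiddenword"}


def is_allowed(text: str) -> bool:
    """Toy check: True if no blocked term appears in the text."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def naive_choice(option_a: str, option_b: str, rng: random.Random) -> str:
    """Pick one option without looking at its content (the described flaw)."""
    return rng.choice([option_a, option_b])


def checked_choice(option_a: str, option_b: str, rng: random.Random):
    """Pick only among options that pass the check; None means 'refuse'."""
    candidates = [opt for opt in (option_a, option_b) if is_allowed(opt)]
    if not candidates:
        return None
    return rng.choice(candidates)
```

In this sketch the refusal is signalled by `None`; a stricter variant could refuse as soon as either option fails the check.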
With Salute, the bypass logic turned out to be different, but no less revealing. Instead of asking directly for something undesirable, the author broke the phrase into parts and saved them as the names of "friends." After that, the assistant can be asked to greet the friends or list them in order, and it reads the saved list aloud in sequence. Individually, the elements look like normal profile data, but spoken one after another they combine into a complete phrase that the filter no longer catches.
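A short Python sketch shows why checking each saved name in isolation misses the combined phrase. The function names, the greeting format, and the placeholder phrase are assumptions for illustration and do not reflect Salute's real code.

```python
import re

# Hypothetical placeholder phrase standing in for something the filter blocks.
BLOCKED_PHRASES = {"forbidden phrase"}


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def is_allowed(text: str) -> bool:
    """Toy check: True if no blocked phrase occurs in the normalized text."""
    normalized = normalize(text)
    return not any(phrase in normalized for phrase in BLOCKED_PHRASES)


def greet_per_item(names):
    """Check every name separately; fragments pass one by one."""
    if all(is_allowed(name) for name in names):
        return "Hello, " + ", ".join(names)
    return None


def greet_checked(names):
    """Check the full utterance that is about to be spoken."""
    utterance = "Hello, " + ", ".join(names)
    return utterance if is_allowed(utterance) else None
```

With the names `["forbidden", "phrase"]`, `greet_per_item` returns the assembled greeting, while `greet_checked` returns `None` because the check runs on the final utterance.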
What scenarios worked
Besides the either-or question and the list of names, the analysis describes several other everyday functions through which unwanted text passes. The general scheme is the same everywhere: the system first accepts the phrase as normal user data, saves it in memory or in a service function, and then reproduces it almost verbatim in a different context, where additional moderation is either weak or does not run at all.
- A question to Marusia in the format "A or B?", where both answers are unwanted, but one will still be voiced.
- Remembering parts of a phrase as names of friends in Salute with subsequent reading of this list aloud.
- Saving "facts" about the user or their surroundings, which can then be invoked with a command like "tell me about me."
- Ordinary reminders where the text is first recorded, and a minute later the assistant simply reproduces it as a service message.
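The store-then-replay pattern common to these scenarios can be modelled in a few lines of Python. This is a sketch under stated assumptions: `StoredItem`, `Memory`, the `user_supplied` flag, and the placeholder blocklist are all hypothetical and are not taken from either assistant.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder blocklist for illustration only.
BLOCKED_TERMS = {"forbiddenword"}


def is_allowed(text: str) -> bool:
    """Toy check: True if no blocked term appears in the text."""
    return not any(term in text.lower() for term in BLOCKED_TERMS)


@dataclass
class StoredItem:
    text: str
    kind: str                   # e.g. "reminder", "fact", "friend_name"
    user_supplied: bool = True  # stored text stays marked as untrusted


@dataclass
class Memory:
    items: list = field(default_factory=list)

    def save(self, text: str, kind: str) -> None:
        """Accept the text as data; nothing is spoken at this point."""
        self.items.append(StoredItem(text, kind))

    def replay(self, kind: str) -> list:
        """Return texts of one kind, re-checking user-supplied items."""
        spoken = []
        for item in self.items:
            if item.kind != kind:
                continue
            if item.user_supplied and not is_allowed(item.text):
                continue  # skip instead of reading it back verbatim
            spoken.append(item.text)
        return spoken
```

The design choice illustrated here is that saved text keeps its "user-supplied" label, so the replay path cannot mistake it for trusted system output.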
From a practical standpoint, this bypass is particularly troubling because it needs no unusual conditions. The user doesn't need access to internal settings, third-party skills, or automation chains. It's enough to phrase the request in a few steps so that the assistant first accepts the disputed text as data and then says it itself in a different context.
For home devices often used by children and families, this is no longer just a curiosity but a concrete risk that the device will say something inappropriate.
Why the filters didn't work
The write-up describes the problem as architectural. Protective mechanisms in such systems are usually placed on direct user input: when a person asks the assistant to say something explicitly forbidden, the model or a rule blocks the response. But when the same phrase is broken into harmless fragments and saved as a name, fact, or reminder, the system starts to treat it as trusted data. At the speech output stage, re-checking is either too weak or absent altogether.
"The problem is that control usually exists on input, but is absent on output."
That's why the author connects the observation to prompt injection and the broader class of attacks on LLM systems. If the model can't distinguish between an instruction and user data, safe individual elements can combine into an unwanted result. For voice platforms, this means not only reputational costs, but also more serious scenarios: from accidental reproduction of toxic phrases to leaks of fragments of saved context through vocalization.
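One way to picture control on output is a single gate in front of the speech synthesizer, so that every utterance is checked regardless of where its text came from. The Python sketch below assumes a hypothetical `tts` callable, a placeholder blocklist, and a made-up refusal message; it is not the API of any real voice platform.

```python
from typing import Callable

# Hypothetical placeholder blocklist and refusal text.
BLOCKED_TERMS = {"forbiddenword"}
REFUSAL = "Sorry, I can't say that."


def is_allowed(text: str) -> bool:
    """Toy check: True if no blocked term appears in the text."""
    return not any(term in text.lower() for term in BLOCKED_TERMS)


def make_guarded_speaker(tts: Callable[[str], None]) -> Callable[[str], bool]:
    """Wrap a text-to-speech callable so all output passes one check."""

    def speak(text: str) -> bool:
        if is_allowed(text):
            tts(text)
            return True
        tts(REFUSAL)  # say a refusal instead of the stored text
        return False

    return speak


# Usage: a list's append method stands in for the speech synthesizer.
spoken_log = []
speak = make_guarded_speaker(spoken_log.append)
speak("Reminder: buy milk")
speak("Reminder: forbiddenword")
```

Because reminders, saved facts, and name lists would all go through the same `speak` function, the check no longer depends on which feature produced the text.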
What this means
The story of Marusia and Salute shows that simple moderation of direct requests is no longer enough for voice assistants. They need to check not only what the user has just said, but also what the system is about to say from memory, reminders, and other "safe" data sources. Otherwise, ordinary household features themselves become a channel for bypassing basic restrictions and a source of new risks.