ServiceNow: AI Agent Leaks Corporate Secrets Through a Chain of Search Queries
ServiceNow researchers found that an AI agent for corporate search leaks confidential data through ordinary search queries: the "mosaic effect"…
AI-processed from Hugging Face Blog; edited by Hamidun News
Researchers at ServiceNow have published MosaicLeaks on Hugging Face — the first systematic analysis of how AI agents for deep research can inadvertently expose confidential corporate data. The culprit is not a hack or a code error, but the very nature of search queries.
The Mosaic Effect: Innocent Details Combine into Secrets
The work's title references the "mosaic effect" from intelligence theory: individual public facts are safe in isolation, but together they form a complete picture. An AI agent working with corporate documents makes a series of search queries. Each looks innocent. But an external observer seeing the entire sequence can reconstruct confidential information — infrastructure migration timelines, security incident details, internal projects. Researchers formalized three types of leaks:
- Intent leakage — queries reveal what exactly the agent is investigating
- Answer leakage — queries allow inference of answers to closed questions
- Full-information leakage — the observer finds private facts themselves, without knowing what to search for
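The mosaic effect can be illustrated with a toy observer that only sees the query log. The sketch below is not the MosaicLeaks methodology: the private fact, the queries, and the keyword-overlap measure `coverage` are all invented for illustration. It shows only that each query alone reveals a fraction of a fact while the full sequence reveals all of it.

```python
# Toy illustration of the mosaic effect. Everything here is hypothetical:
# the "private fact" is modeled as a set of keywords, and leakage as the
# share of those keywords an observer has seen in the query log.
SENSITIVE_FACT = {"acme", "migration", "q3", "postgres"}


def tokens(query: str) -> set:
    """Lowercased whitespace tokens of a single query."""
    return set(query.lower().split())


def coverage(queries: list, fact: set) -> float:
    """Fraction of the fact's keywords seen across all given queries."""
    seen = set()
    for q in queries:
        seen |= tokens(q)
    return len(seen & fact) / len(fact)


# Invented queries an agent might send to an external search engine.
queries = [
    "acme database vendors",
    "postgres migration checklist",
    "q3 maintenance window best practices",
]

per_query = [coverage([q], SENSITIVE_FACT) for q in queries]
combined = coverage(queries, SENSITIVE_FACT)

print(per_query)  # [0.25, 0.5, 0.25]
print(combined)   # 1.0
```

No single query covers more than half of the keywords, yet an observer who sees all three recovers the whole set. A real observer would use far stronger inference than keyword overlap.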
The Paradox: More Accurate = More Dangerous
The most counterintuitive finding: standard training of an agent for maximum search accuracy makes it more dangerous from a privacy perspective. A basic untrained agent solved tasks successfully in 48.7% of cases. After training solely on task metrics, success improved to 59.3%. But leak frequency rose as well, from 34.0% to 51.7%. The mechanism is simple: to find the right document, the agent formulates more precise and informative queries. The same precision that helps the search engine also reveals far more context to an external observer.
"More informative queries help the agent find the right documents, but they also reveal more context to the observer about what is being searched for."
PA-DR: Dual-Reward Learning
ServiceNow proposes the Privacy-Aware Deep Research (PA-DR) architecture — a system where the agent is optimized on two objectives simultaneously.
Contextual rewards. In standard RL, the agent receives a reward only for the final correct answer. In PA-DR, each intermediate step is evaluated based on what the agent knew at that moment. This dramatically improves learning efficiency: 5–6 times fewer examples are needed to achieve comparable quality.
Trained privacy rewards. A separate evaluator model penalizes the agent for queries that create mosaic vulnerabilities — those that reveal intent or allow inference of private facts. Privacy criteria are also learned from data, not hardcoded.
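One way to picture the two objectives is a per-step reward that subtracts a privacy penalty from a task score. This is a minimal sketch under stated assumptions, not the PA-DR implementation: the article does not say how the two signals are combined, so the linear form, the `privacy_weight` parameter, and the `Step` fields with their example scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    query: str
    task_score: float  # hypothetical contextual reward for this step, in [0, 1]
    leak_score: float  # hypothetical output of a learned privacy evaluator, in [0, 1]


def step_reward(step: Step, privacy_weight: float = 1.0) -> float:
    """Assumed linear combination: task usefulness minus weighted leakage."""
    return step.task_score - privacy_weight * step.leak_score


def episode_return(steps: list, privacy_weight: float = 1.0) -> float:
    """Sum of per-step rewards over one research episode."""
    return sum(step_reward(s, privacy_weight) for s in steps)


# Two invented one-step episodes: a precise but revealing query,
# and a slightly less useful but more discreet one.
precise = [Step("acme q3 postgres migration outage", task_score=0.9, leak_score=0.8)]
careful = [Step("database migration outage causes", task_score=0.8, leak_score=0.1)]

# With no privacy term the precise query wins; with the penalty it loses.
print(episode_return(precise, 0.0) > episode_return(careful, 0.0))  # True
print(episode_return(precise, 1.0) < episode_return(careful, 1.0))  # True
```

With `privacy_weight=0.0` the sketch reduces to task-only training, which is the setting the article says raises leak frequency. A positive weight shifts the preferred query toward the more discreet one.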
The result of combining both mechanisms:
- Task chain success: 58.7% (close to the 59.3% of the agent trained on task metrics alone)
- Leak frequency: 9.9% (lower than the baseline untrained agent's 34.0%)
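The trade-off is easier to see as percentage-point differences between the three configurations. The snippet below uses only the figures reported in this article and assumes that the 58.7% task chain success is measured on the same scale as the 48.7% and 59.3% success rates. The labels `untrained`, `task_only`, and `pa_dr` are shorthand introduced here.

```python
# Figures reported in the article, in percent.
results = {
    "untrained": {"success": 48.7, "leak": 34.0},
    "task_only": {"success": 59.3, "leak": 51.7},
    "pa_dr": {"success": 58.7, "leak": 9.9},
}


def delta(before: str, after: str, metric: str) -> float:
    """Percentage-point change in a metric between two configurations."""
    return round(results[after][metric] - results[before][metric], 1)


# Task-only training versus the untrained agent.
print(delta("untrained", "task_only", "success"))  # 10.6
print(delta("untrained", "task_only", "leak"))     # 17.7

# PA-DR versus task-only training.
print(delta("task_only", "pa_dr", "success"))      # -0.6
print(delta("task_only", "pa_dr", "leak"))         # -41.8
```

On these numbers, task-only training buys 10.6 points of success at the cost of 17.7 points of additional leakage, while PA-DR gives back 0.6 points of success and cuts leakage by 41.8 points.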
What This Means
The work identifies a fundamental limitation: corporate deep-research agents are unsafe by default. Running an agent on internal data with internet access creates a leakage channel that users may not suspect. PA-DR shows the problem is technically solvable without sacrificing quality, but it requires deliberate design choices during training rather than the hope that a "good agent will figure it out."