MosaicLeaks: Can Your Research Agent Keep a Secret?


Published June 18, 2026. As enterprises increasingly deploy AI research agents to handle sensitive internal data, a new class of privacy risks has emerged. Researchers from ServiceNow have identified a critical vulnerability, dubbed MosaicLeaks, that can expose confidential information through seemingly harmless queries.


The Growing Threat of Mosaic Attacks


MosaicLeaks exploits a fundamental weakness in how large language models (LLMs) handle context: when an agent is given access to a knowledge base, it may inadvertently piece together fragmentary answers from multiple unrelated queries to reconstruct private data. By 2026, with most Fortune 500 companies using retrieval-augmented generation (RAG) pipelines, the attack surface has expanded dramatically.


In a typical scenario, a user might ask a research agent:

  • "What is the average salary of engineering managers?"
  • "Who is the highest-paid engineering manager?"
  • "Which project did X lead?"

Individually, each answer appears innocuous. Combined, they can pinpoint a specific employee's compensation and role. This mosaic effect is particularly dangerous when the agent has access to proprietary HR, financial, or product launch data.


How MosaicLeaks Works


  1. Granular access control is bypassed: Most RAG systems do not track how separate queries relate to each other. An attacker can iterate through small, targeted questions.
  2. No query memory β€” The agent lacks awareness of previous interactions, so it cannot detect when a user is building a mosaic.
  3. Faithful generation is misinterpreted β€” When the LLM accurately answers each sub-question, it appears correct, but the cumulative outcome violates data confidentiality.
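
The three steps above can be illustrated with a toy example. This is only a sketch under stated assumptions: the employee records, the project names, and the `answer_*` helpers are hypothetical and stand in for a stateless agent that answers aggregate questions faithfully. It shows a differencing-style mosaic, not the specific attack from the ServiceNow research.

```python
# Hypothetical toy knowledge base; all names and numbers are invented.
EMPLOYEES = [
    {"name": "A", "project": "Atlas", "salary": 150_000},
    {"name": "B", "project": "Borealis", "salary": 160_000},
    {"name": "C", "project": "Cirrus", "salary": 200_000},
]


def _rows(exclude_project=None):
    """Engineering managers, optionally leaving out one project's lead."""
    return [e for e in EMPLOYEES if e["project"] != exclude_project]


def answer_average_salary(exclude_project=None):
    """Stateless aggregate query: average salary of the matching managers."""
    rows = _rows(exclude_project)
    return sum(e["salary"] for e in rows) / len(rows)


def answer_headcount(exclude_project=None):
    """Stateless aggregate query: number of matching managers."""
    return len(_rows(exclude_project))


def answer_project_lead(project):
    """Stateless lookup: who leads a given project."""
    return next(e["name"] for e in EMPLOYEES if e["project"] == project)


def mosaic_attack(project):
    """Combine individually harmless answers into one private fact."""
    total_all = answer_average_salary() * answer_headcount()
    total_rest = (answer_average_salary(exclude_project=project)
                  * answer_headcount(exclude_project=project))
    # Payroll of everyone minus payroll of everyone else is the target's salary.
    return answer_project_lead(project), total_all - total_rest


if __name__ == "__main__":
    print(mosaic_attack("Cirrus"))
```

No single call returns an individual salary, and each answer is accurate on its own. Because nothing links the calls together, the arithmetic in `mosaic_attack` goes unnoticed.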

Real-World Impact in 2026


  • Healthcare: A researcher could infer patient diagnoses by combining query results about symptoms, test dates, and prescribed medications, even if each query only returns anonymized aggregates.
  • Finance: An analyst could deduce insider trading patterns by asking about executive trades, board meeting dates, and stock price movements separately.
  • Legal: MosaicLeaks can reconstruct confidential settlement terms or client identities from fragmented discovery requests.

Mitigation Strategies


ServiceNow's research team recommends a multi-layered defense:


  • Differential privacy injection: Add calibrated noise to numeric aggregates so that no single query can reveal exact values, and multiple queries cannot converge on a precise figure.
  • Query correlation detection: Maintain a short-term memory of recent user queries and flag patterns that suggest mosaic reconstruction attempts.
  • Context-aware redaction: Train the model to recognize when a combination of answers could be sensitive, even if each piece separately is not. This can be implemented via classifier gates or fine-tuned safety prompts.
  • Rate-limiting on sensitive topics: Hard-limit the number of queries per session that touch certain high-risk data categories (e.g., employee PII, unreleased financials).
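
As a rough illustration of how three of these layers might fit together, the sketch below combines Laplace noise on numeric aggregates, a short-term query memory that flags probing of a single entity, and a per-session cap on sensitive categories. It is not the ServiceNow toolkit. The thresholds, the category names, and the assumption that an upstream classifier tags each query with a `category`, `entity`, and `facet` are all hypothetical. Independent noise can be averaged away by repeating a query, which is why the sketch pairs it with a query cap.

```python
import random
import time
from collections import defaultdict, deque

# Hypothetical policy values; a real deployment would tune these.
SENSITIVE_CATEGORIES = {"employee_pii", "unreleased_financials"}
MAX_SENSITIVE_QUERIES = 3    # hard cap per session and category
CORRELATION_WINDOW_S = 600   # how long recent queries are remembered
MAX_DISTINCT_FACETS = 2      # facets about one entity before flagging


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_aggregate(true_value, sensitivity, epsilon, rng=random):
    """Laplace mechanism: the noise scale is sensitivity / epsilon."""
    return true_value + laplace_noise(sensitivity / epsilon, rng)


class SessionGuard:
    """Per-session memory used to rate-limit and flag mosaic-style probing."""

    def __init__(self, now=time.monotonic):
        self._now = now
        self._counts = defaultdict(int)  # category -> queries seen so far
        self._recent = deque()           # (timestamp, entity, facet)

    def check(self, category, entity, facet):
        """Return 'allow', 'flag', or 'block' for one incoming query."""
        if category in SENSITIVE_CATEGORIES:
            self._counts[category] += 1
            if self._counts[category] > MAX_SENSITIVE_QUERIES:
                return "block"
        t = self._now()
        self._recent.append((t, entity, facet))
        # Forget queries that have aged out of the short-term window.
        while self._recent and t - self._recent[0][0] > CORRELATION_WINDOW_S:
            self._recent.popleft()
        facets = {f for _, e, f in self._recent if e == entity}
        return "flag" if len(facets) > MAX_DISTINCT_FACETS else "allow"


if __name__ == "__main__":
    guard = SessionGuard()
    for facet in ("salary", "role", "project", "manager"):
        print(facet, guard.check("employee_pii", "emp-1", facet))
    print(noisy_aggregate(170_000, sensitivity=1_000, epsilon=0.5))
```

The guard deliberately separates "flag" from "block", so correlated but permitted queries can be routed to review rather than refused outright. Context-aware redaction is left out because it depends on a trained classifier.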

Looking Ahead


As of mid-2026, no widely adopted standard exists for mosaic attack resilience in enterprise AI agents. ServiceNow has open-sourced its MosaicLeaks detection toolkit, which includes a benchmark dataset and an adversarial simulation framework. The goal is to push the industry toward agents that are not only accurate but also context-aware privacy guardians.


The question remains: can your research agent keep a secret? With MosaicLeaks, the answer depends on how well your system remembers, and forgets.




This article was authored by Alexander Gurung and Rafael Pardinas, researchers at ServiceNow, as part of ongoing work in AI safety and enterprise data protection.

via Hugging Face Blog