Safety

AI Watermarking

AI watermarking is a technique for embedding imperceptible signals into AI-generated content (text, images, audio, or video) so that its machine-made origin can be identified later. It supports provenance verification and content authenticity without degrading perceived quality.

AI watermarking refers to methods that encode identifying information into model outputs in ways that survive normal handling but remain undetectable to human perception. For images, signals are embedded in pixel-value distributions or frequency components; for text, subtle statistical biases are introduced at the token-selection stage; for audio and video, modifications target frequency bands or metadata layers.
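To make the idea of an imperceptible pixel-level signal concrete, here is a minimal sketch using least-significant-bit (LSB) embedding on a flat list of 8-bit grayscale values. LSB embedding is a classic steganographic toy and is swapped in here purely for illustration; it is not how SynthID or other production systems work, and unlike them it does not survive compression or resizing. The function names `embed_bits` and `extract_bits` are hypothetical.

```python
def embed_bits(pixels: list, bits: list) -> list:
    """Overwrite the least-significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("payload is longer than the image")
    marked = list(pixels)  # copy so the original image is left untouched
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_bits(pixels: list, count: int) -> list:
    """Read back the lowest bit of the first `count` pixels."""
    return [p & 1 for p in pixels[:count]]


# Toy 8-pixel "image" (values 0-255) and a 4-bit payload, both made up.
image = [200, 13, 54, 255, 0, 128, 77, 91]
payload = [1, 0, 1, 1]

marked = embed_bits(image, payload)
recovered = extract_bits(marked, len(payload))
max_change = max(abs(a - b) for a, b in zip(image, marked))
```

Each pixel changes by at most one intensity level, which is why the mark is invisible, but the same property makes it trivial to destroy. Practical schemes spread the signal across many pixels or frequency components to gain robustness.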

In text watermarking, one influential approach, published by Kirchenbauer et al. at the University of Maryland in 2023, pseudorandomly partitions a model's vocabulary into "green" and "red" token lists at each generation step, seeding the partition from the preceding token, and biases the model to prefer green tokens. A detector that knows how the partition is seeded can count green tokens in a passage and apply a statistical test: unwatermarked text lands on the green list at roughly the chance rate, while watermarked text exceeds it by a margin that grows with length. Image watermarking uses techniques such as steganographic encoding or learned perturbations, often applied post-generation. Robustness to common transformations (cropping and JPEG compression for images, paraphrasing for text) remains an active research challenge, and the adversarial dynamics between watermark insertion and removal continue to evolve.
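The green/red scheme can be sketched in a few lines. The following is a simplified illustration, not the authors' implementation: a uniform random sampler stands in for the language model, the "hard" variant is used (watermarked generation always picks a green token, whereas the soft variant adds a bias to green-token logits), and `VOCAB_SIZE`, `GAMMA`, `SECRET_KEY`, and all function names are assumptions chosen for the example.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000         # hypothetical toy vocabulary
GAMMA = 0.5               # fraction of the vocabulary on the green list
SECRET_KEY = "demo-key"   # hypothetical key shared by generator and detector


def green_list(prev_token: int) -> set:
    """Derive this step's green list from a keyed hash of the previous token."""
    digest = hashlib.sha256(f"{SECRET_KEY}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(GAMMA * VOCAB_SIZE)))


def generate(length: int, watermark: bool, seed: int = 0) -> list:
    """Stand-in for a model: uniform sampling, restricted to green if watermarked."""
    rng = random.Random(seed)
    tokens = [rng.randrange(VOCAB_SIZE)]  # unscored "prompt" token
    for _ in range(length):
        if watermark:
            tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
        else:
            tokens.append(rng.randrange(VOCAB_SIZE))
    return tokens


def z_score(tokens: list) -> float:
    """One-proportion z-test on the number of green tokens observed."""
    scored = len(tokens) - 1
    hits = sum(tokens[i] in green_list(tokens[i - 1])
               for i in range(1, len(tokens)))
    return (hits - GAMMA * scored) / math.sqrt(scored * GAMMA * (1 - GAMMA))


watermarked_z = z_score(generate(200, watermark=True))
plain_z = z_score(generate(200, watermark=False))
```

With 200 scored tokens that are all green, the z-score is 100 / sqrt(50), about 14, far above any plausible threshold, while unwatermarked tokens should score near zero. Note that the detector needs only the key and the token sequence, not the model itself. Paraphrasing replaces tokens and so lowers the green count, which is why it weakens detection.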

Watermarking is widely regarded as a key technical tool for AI content provenance and disinformation mitigation. In July 2023, major AI labs including OpenAI, Google, Meta, and Anthropic made voluntary commitments, brokered by the White House, to develop watermarking systems. The EU AI Act and proposed U.S. legislation reference watermarking as a transparency mechanism for synthetic media, increasing the regulatory incentive for adoption.

As of 2026, no single standard has achieved universal adoption. Google's SynthID watermarks images from Imagen and text from Gemini models, and has been progressively deployed across Google products. The C2PA (Coalition for Content Provenance and Authenticity), backed by Adobe, Microsoft, and camera manufacturers, provides a complementary standard that attaches cryptographically signed provenance information at the file level. Open-source watermarking implementations exist but can be partially defeated by paraphrasing attacks, and detection accuracy degrades when watermarked content is heavily edited.

Example

A news organization uses Google's SynthID detector to check whether an image submitted by a freelancer was generated by an AI image model, flagging it for editorial review before publication.
