Security

Deepfake

A deepfake is AI-generated or AI-manipulated media (video, audio, or images) in which a person's likeness or voice is fabricated or replaced to create a realistic but false depiction, typically using deep generative models.

The term deepfake refers to synthetic media, produced by deep learning models, in which a person's face, voice, or both are convincingly replaced or fabricated in video, audio, or still images. It is a portmanteau of "deep learning" and "fake" that emerged on Reddit in 2017, when face-swapping software first became accessible to non-specialists. Modern deepfakes include face-swap videos, voice clones synthesized from seconds of reference audio, lip-sync manipulation that inserts fabricated words into existing footage, and fully synthetic talking-head videos generated without any original recording of the person depicted.

Early deepfakes relied on autoencoders and Generative Adversarial Networks (GANs). In a GAN, a generator synthesizes output while a discriminator tries to tell it apart from real samples, and both networks improve through adversarial training. By the mid-2020s, diffusion models and neural rendering techniques such as Neural Radiance Fields (NeRF) and Gaussian splatting produced higher-fidelity results with fewer visual artifacts. Over the same period, voice cloning services achieved near-human naturalness from under 30 seconds of reference audio, lowering the barrier for audio-only deepfakes used in social engineering attacks.
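The adversarial training idea can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the "real" data is a one-dimensional Gaussian, the generator is a linear map `G(z) = a*z + b`, the discriminator is a single logistic unit `D(x) = sigmoid(w*x + c)`, and the names `train_toy_gan` and `sigmoid` are hypothetical. Real deepfake models use deep networks on images or audio, and toy loops like this one can oscillate rather than converge, so it shows only the alternating update pattern.

```python
import math
import random


def sigmoid(t: float) -> float:
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps: int = 500, lr: float = 0.05, seed: int = 0):
    """Alternate discriminator and generator updates on 1-D toy data.

    Returns the final parameters (a, b, w, c). Illustrative only.
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters: G(z) = a*z + b
    w, c = 0.0, 0.0   # discriminator parameters: D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)   # assumed "real" distribution
        z = rng.gauss(0.0, 1.0)        # generator noise input
        x_fake = a * z + b

        # Discriminator step: gradient ascent on
        # log D(x_real) + log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(G(z)),
        # the common "non-saturating" generator objective.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (1.0 - d_fake) * w    # d/dx log D(x) at x = x_fake
        a += lr * grad_x * z
        b += lr * grad_x

    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print(f"generator: a={a:.3f}, b={b:.3f}; discriminator: w={w:.3f}, c={c:.3f}")
```

The discriminator is rewarded for scoring real samples high and generated samples low, while the generator is rewarded for raising the discriminator's score on its own output. That opposing pressure is the adversarial dynamic described above.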

The risks span several domains. Political disinformation campaigns have used fabricated videos of public officials to spread false statements ahead of elections in multiple countries. Voice-cloning fraud, a form of voice phishing (vishing), has cost enterprises tens of millions of dollars in incidents where audio of executives was synthesized to authorize wire transfers. Non-consensual intimate imagery using fabricated likenesses causes documented psychological harm and is a growing category of online abuse. The same underlying technology also has legitimate uses in film post-production, accessibility tools for people who have lost the ability to speak, and interactive entertainment.

As of 2026, detection tools (classifiers trained on facial inconsistencies, unnatural blinking patterns, and spectral artifacts) consistently lag behind generation capabilities, creating a persistent adversarial dynamic. Regulatory responses include the EU AI Act's mandatory disclosure requirements for AI-generated media, US state-level legislation targeting political and non-consensual intimate deepfakes, and major platform policies requiring labeling of synthetic content. The Coalition for Content Provenance and Authenticity (C2PA) develops an open cryptographic provenance standard that attaches tamper-evident origin metadata to media, starting at the point of capture and continuing through subsequent edits.
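To make "tamper-evident origin metadata" concrete, here is a minimal sketch of the general idea: hash the media bytes, bind the hash to an origin claim, and authenticate the claim so that any later change to the media or the claim is detectable. This is not the C2PA manifest format. C2PA relies on certificate-based digital signatures, whereas this sketch substitutes a shared-secret HMAC from the Python standard library purely to stay self-contained. The function names, the `origin` field, and the demo key are all hypothetical.

```python
import hashlib
import hmac
import json


def make_manifest(media: bytes, origin: str, key: bytes) -> dict:
    """Build a toy provenance record bound to the exact media bytes."""
    claim = {"origin": origin, "sha256": hashlib.sha256(media).hexdigest()}
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    # HMAC stands in for a real public-key signature (assumption).
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "tag": tag}


def verify_manifest(media: bytes, manifest: dict, key: bytes) -> bool:
    """Return True only if neither the claim nor the media was altered."""
    claim = manifest["claim"]
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["tag"]):
        return False  # claim was edited, or the wrong key was used
    actual = hashlib.sha256(media).hexdigest()
    return hmac.compare_digest(actual, claim["sha256"])


if __name__ == "__main__":
    key = b"demo-key-not-for-production"
    original = b"\x00\x01frame-bytes"
    manifest = make_manifest(original, "camera-123", key)
    print(verify_manifest(original, manifest, key))         # True
    print(verify_manifest(original + b"x", manifest, key))  # False
```

A record like this shows where a file came from and whether it has changed since. It does not, by itself, say whether the content is synthetic, which is why provenance and detection are usually treated as complementary.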

Example

A financial services firm's fraud team flagged a wire-transfer authorization request after voice-analysis software detected spectral inconsistencies in a call that appeared to come from the CFO. Forensic review confirmed that the audio was a deepfake synthesized from recordings taken from publicly available earnings-call webcasts.
