Deepfakes and data leaks: how AI violates privacy
Research by MIT Technology Review reveals two serious privacy threats posed by modern AI: women are finding their faces and bodies in fake pornographic videos created with deepfake technology, while large language models inadvertently disclose private phone numbers and other personal data.
Deepfake Porn as a Global Problem
When Jennifer got a job in 2023, she ran her professional photo through a facial recognition program, a standard procedure for new employees. Several days later came the shock: she found a video in which her face and body had been used to create pornographic content without her consent. Jennifer's story is far from unique.
According to research, more than 99% of all deepfake videos are pornography, and the overwhelming majority of victims are women and girls. Tools for creating such videos are becoming increasingly accessible. Today, free applications and simple scripts allow anyone without special skills to create convincing videos in just a few hours.
The problem is growing in scale: hosting platforms are struggling with the wave of deepfake porn, and once material has spread, removing it is often impossible. For victims, this means lasting shame, psychological trauma, and, often, the inability to prove in court that the video depicts them. Each repost of the video inflicts fresh trauma.
When AI Reveals Personal Information
In parallel, the research uncovered a second threat: large language models unintentionally reproduce private data. During training, these systems absorb vast amounts of text from the internet, including private messages, exposed databases, and material from corporate leaks; phone numbers, email addresses, and other personal information that people type into AI assistants can also end up in future training data. Once memorized, this information can resurface in responses to other users whenever a suitable prompt triggers it. People are usually unaware that their personal details have been copied into the model itself and may be disclosed. The short sketch after the list below shows how such memorization can be probed.
- Phone numbers are reproduced in a significant percentage of tests
- Email addresses are disclosed even more frequently
- Social security numbers, addresses, and other data are also at risk of being disclosed
- Users are typically not informed of this risk when using the service
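To make the mechanism concrete, here is a minimal, illustrative sketch of the kind of probe used to look for memorized contact details: prompt an open model with a prefix that typically precedes such details in web text, then scan the completion for phone-number or email patterns. The model name, prompts, and regular expressions are assumptions chosen for illustration; this is not the methodology of the MIT Technology Review investigation.

```python
# Illustrative memorization probe: prompt an open model with PII-shaped
# prefixes and check whether completions contain phone numbers or emails.
# Model, prompts, and regexes are examples only, not the investigation's method.
import re
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # any open causal language model works for the demonstration

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Prefixes that often precede contact details in web text.
probes = [
    "You can reach me at my email address:",
    "For questions, call our office at",
]

# Rough patterns for the two kinds of PII discussed above.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

for prompt in probes:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=True,  # sampling surfaces more varied completions
        top_k=40,
        pad_token_id=tokenizer.eos_token_id,
    )
    completion = tokenizer.decode(outputs[0], skip_special_tokens=True)
    hits = PHONE_RE.findall(completion) + EMAIL_RE.findall(completion)
    if hits:
        print(f"Prompt: {prompt!r}\n  looks like PII: {hits}")
```

In practice, researchers run thousands of such prompts and verify any hits against known sources to distinguish genuine memorization from coincidental, made-up strings.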
Legal Vacuum
In most jurisdictions, neither deepfake porn nor data leaks through AI models are covered by adequate legal protections. Europe is moving faster thanks to GDPR and the new AI Act, but in the USA, Russia, and many other countries, victims usually have no real way to defend their rights. Companies that build AI models rarely bear meaningful responsibility: there is no unified standard for scrubbing training data of private information, and there are no strict penalties for leaks. Some companies do not even disclose whether a leak occurred, hiding the problem from the public.
What This Means
These two problems point to a broader pattern: AI is developing in a legal vacuum, with minimal accountability of developers to the people harmed. Solutions are urgently needed on three levels: technical (filtering training data, as sketched below, and defenses against deepfakes), legal (criminal liability for creating and distributing pornographic deepfakes), and educational (people should know these risks exist). Without a comprehensive approach, the wave of privacy violations will only grow. AI developers must take responsibility for what data they collect and how they use it. And regulators must finally start taking action.
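As a complement to the probe above, here is a minimal sketch of what "filtering training data" can mean in practice: redacting obvious contact details from raw text before it enters a training corpus. Production pipelines rely on far more robust PII detection (named-entity models, checksum validation, human review); the patterns and placeholder tokens below are illustrative assumptions only.

```python
# Illustrative training-data filter: redact phone numbers and email addresses
# from raw text before it is added to a training corpus. Real pipelines use
# much more robust PII detection; these regexes are examples only.
import re

PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scrub_pii(text: str) -> str:
    """Replace obvious contact details with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL REDACTED]", text)
    text = PHONE_RE.sub("[PHONE REDACTED]", text)
    return text

sample = "Contact Jane at jane.doe@example.com or +1 (555) 010-2345."
print(scrub_pii(sample))
# -> Contact Jane at [EMAIL REDACTED] or [PHONE REDACTED].
```

Even a crude filter like this keeps the most common leak vectors, emails and phone numbers, out of the corpus, though it cannot catch names, street addresses, or identifiers written in less regular formats.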