Meta suspended work with Mercor after leak of data on AI model training

AI-processed from Wired; edited by Hamidun News
Source: Wired. Collage: Hamidun News.

Mercor, a leading data provider for the AI industry, has found itself at the center of a serious security incident. Several major AI labs have launched internal investigations, and Meta, one of the company's key clients, has announced a suspension of cooperation. Potentially at risk is confidential data about AI model training methods, information that tech companies guard closely as their primary competitive asset.

Mercor is a platform that connects AI companies with thousands of data labeling and annotation specialists worldwide. Labeled data (carefully selected texts, dialogs, and images with quality marks) forms the foundation for training modern language models. Without quality annotation, none of GPT-4, Claude, or Llama would exist.

Mercor served the industry's leading players and over several years became one of the most prominent vendors in this segment. The key question of the incident is what exactly could have been exposed. This is not just a leak of a client database or personal data.

Instructions for annotators, data categories, and preference schemes (the RLHF labels that train models to give desired answers) all indirectly reveal the methodological decisions of a particular company. Developing such processes costs hundreds of millions of dollars and requires many years of accumulated expertise. The compromise of this data is comparable in value to a source code leak.

Meta reacted quickly and preemptively, suspending work with Mercor pending full clarification of the incident. This is standard protocol when a supply chain compromise is suspected: continuing to transmit sensitive data to a vendor with unknown security status is an unjustified risk. The stakes are especially high for Meta, which invests tens of billions of dollars in its own AI systems, including the open Llama model family and the Meta AI assistant.

Other AI labs that worked with Mercor are also conducting their own reviews. It remains unclear what exactly was compromised, in what volume, and whether the incident resulted from an external cyberattack or an internal security error. Neither Mercor nor the companies involved have yet disclosed technical details of what occurred.

The incident exposes a systemic vulnerability in the AI industry. Mass outsourcing of data labeling means that dozens, and sometimes hundreds, of intermediary companies take part in the production chain of each major AI model. Each of them gains access to fragments of its clients' confidential methodology.

Meanwhile, there are no unified industry security standards for such vendors: no mandatory audits, no encryption requirements, no incident notification protocols. For Mercor, this is a reputational crisis. The company's business is entirely based on the trust of AI labs, and that trust is now in question.

Even if the investigation shows that the actual damage was limited, the mere fact of the incident and Meta's public reaction will change the company's negotiating position in the market. This incident should accelerate discussions about mandatory security standards for data providers. Training methodology is a key competitive weapon in the AI race.

Treating data vendors as ordinary contractors is no longer possible: the level of inspection and control should match the level of access to confidential information. The scale of the incident and the full list of affected companies have not yet been disclosed. Details of the investigation are expected to emerge in the coming days.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
