Safety

AI Bias

AI bias refers to systematic, skewed patterns in an AI system's outputs that arise from biased training data, flawed problem framing, or model design choices, causing the system to treat individuals or groups inequitably.

AI bias describes any consistent, non-random error in a model's predictions or decisions that correlates with sensitive attributes such as race, gender, age, religion, or socioeconomic status. It encompasses both statistical bias—where a model's average prediction deviates from the true value—and social bias, where those errors are distributed unequally across demographic groups, often to the detriment of historically marginalized populations.

Bias enters AI systems through multiple pathways. Training data may underrepresent certain groups (a facial recognition dataset composed mostly of light-skinned faces will generalize poorly to darker-skinned subjects), reflect historical discrimination (a hiring model trained on decades of male-dominated hiring decisions will encode those patterns as signal), or carry annotation bias, in which human labelers apply inconsistent standards across demographic groups. Model architecture and objective function also play a role: a system optimizing average accuracy may sacrifice performance on minority subgroups to improve overall metrics, a trade-off that is invisible in aggregate benchmark scores.
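The last point, that an aggregate score can hide a subgroup gap, can be illustrated with a minimal sketch. The data below is synthetic and the helper name `accuracy_by_group` is hypothetical; it only shows the kind of disaggregated check the paragraph implies, not any particular auditing tool.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall_accuracy, {group: accuracy}) for parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


# Synthetic toy data (an assumption for illustration):
# 90 examples from majority group "A", 10 from minority group "B".
groups = ["A"] * 90 + ["B"] * 10
y_true = [1] * 100
# The imagined model is right on 88 of 90 "A" cases but only 4 of 10 "B" cases.
y_pred = [1] * 88 + [0] * 2 + [1] * 4 + [0] * 6

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
print(f"overall: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"group {group}: {acc:.2f}")
```

With these toy numbers the overall accuracy is 0.92 while group B sits at 0.40, so a benchmark that reports only the aggregate would look healthy.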

The practical consequences are significant. A 2018 study by Joy Buolamwini and Timnit Gebru ("Gender Shades") found that commercial gender-classification systems had error rates of up to 34.7% for darker-skinned women, against at most 0.8% for lighter-skinned men. Biased outputs in credit scoring, medical diagnosis, criminal risk assessment, and automated hiring have measurable effects on people's access to loans, treatment, liberty, and employment. Regulatory frameworks including the EU AI Act, which entered into force in August 2024 and phases in its high-risk system requirements over the following years, mandate bias assessments and human oversight for high-risk systems before deployment.

As of 2026, bias mitigation spans pre-processing (dataset curation, resampling, synthetic data augmentation), in-processing (fairness constraints during training, adversarial debiasing), and post-processing (threshold adjustment per demographic group). No single technique eliminates all forms of bias simultaneously, because the common formal fairness definitions (equalized odds, demographic parity, and predictive parity) cannot in general all be satisfied at once when base rates differ across groups, except in degenerate cases such as a perfect predictor. Documentation standards such as Model Cards and Datasheets for Datasets, along with third-party auditing requirements increasingly encoded in law, seek to institutionalize accountability.
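The tension between the three fairness definitions named above can be seen in a small worked example. The confusion-matrix counts below are hypothetical and chosen so that both groups have identical true-positive and false-positive rates (equalized odds holds) while their base rates differ; the function name `group_metrics` is likewise an assumption for this sketch.

```python
def group_metrics(tp, fp, fn, tn):
    """Compute the quantities the three fairness definitions compare."""
    n = tp + fp + fn + tn
    return {
        "base_rate": (tp + fn) / n,       # share of actual positives
        "selection_rate": (tp + fp) / n,  # demographic parity compares this
        "tpr": tp / (tp + fn),            # equalized odds compares TPR...
        "fpr": fp / (fp + tn),            # ...and FPR
        "ppv": tp / (tp + fp),            # predictive parity compares this
    }


# Hypothetical counts, 100 people per group.
# Group A has a base rate of 0.50, group B a base rate of 0.20.
confusion = {
    "A": {"tp": 40, "fp": 10, "fn": 10, "tn": 40},
    "B": {"tp": 16, "fp": 16, "fn": 4, "tn": 64},
}

metrics = {g: group_metrics(**counts) for g, counts in confusion.items()}

for name in ("base_rate", "tpr", "fpr", "ppv", "selection_rate"):
    row = "  ".join(f"{g}={metrics[g][name]:.2f}" for g in sorted(metrics))
    print(f"{name:<15}{row}")
```

Here TPR (0.80) and FPR (0.20) match across groups, yet PPV is 0.80 for A and 0.50 for B, and the selection rate is 0.50 for A and 0.32 for B. Adjusting thresholds to equalize one of the remaining metrics would, with these base rates, break the equality of the error rates.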

Example

A bank's loan-approval model trained on historical data was found to deny applications from qualified borrowers in certain zip codes at rates significantly higher than for comparable borrowers elsewhere, a pattern an external audit traced to historical redlining encoded in the geographic features the model had learned to weight heavily.
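One simple check an audit of this kind might start from is a comparison of approval rates among comparably qualified borrowers in different areas. The sketch below uses made-up decisions and hypothetical area labels; the 0.8 cutoff borrows the "four-fifths" heuristic from US employment-selection guidance and is an assumption here, not a lending standard or the method used in the audit described above.

```python
def approval_rate(decisions):
    """Fraction of approvals in a list of 1 (approved) / 0 (denied) decisions."""
    return sum(decisions) / len(decisions)


def approval_rate_ratios(decisions_by_area, reference):
    """Ratio of each area's approval rate to the reference area's rate."""
    ref_rate = approval_rate(decisions_by_area[reference])
    return {
        area: approval_rate(decisions) / ref_rate
        for area, decisions in decisions_by_area.items()
    }


# Hypothetical decisions for borrowers who all meet the same
# qualification criteria, grouped by (made-up) zip-code cluster.
decisions_by_area = {
    "reference_zips": [1] * 85 + [0] * 15,  # 85% approved
    "flagged_zips": [1] * 55 + [0] * 45,    # 55% approved
}

ratios = approval_rate_ratios(decisions_by_area, reference="reference_zips")

# Assumed illustrative cutoff: flag areas whose ratio falls below 0.8.
flagged = {area: ratio for area, ratio in ratios.items() if ratio < 0.8}

for area, ratio in ratios.items():
    status = "FLAG" if area in flagged else "ok"
    print(f"{area}: ratio={ratio:.2f} [{status}]")
```

A low ratio only signals that a disparity exists; tracing it to a cause such as geographic features acting as a proxy requires further analysis of the model's inputs.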

Related Terms

Training Data, Model Evaluation (Evals), Content Moderation
