
AI Ethics and Governance

Introduction

The widespread deployment of AI systems raises ethical challenges including fairness, bias, accountability, and regulation. Within the AI Safety & Trustworthiness section, this page answers a narrower question: once models affect real people and organizations, how do we define fairness, assign responsibility, and turn governance requirements into concrete processes?

This page focuses on four things:

  • how bias enters the model lifecycle
  • what fairness metrics actually measure, and why they conflict
  • how regulation, model documentation, and impact assessments turn ethics into organizational controls
  • which minimal engineering practices make fairness auditing operational

1. Bias in Machine Learning

1.1 Sources of Bias

| Bias Type | Source | Example |
| --- | --- | --- |
| Historical bias | Real-world inequalities | Hiring data reflecting historical gender discrimination |
| Representation bias | Imbalanced group proportions in training data | Face datasets dominated by lighter skin tones |
| Measurement bias | Proxy variables introducing bias | Using zip codes as proxies for race |
| Aggregation bias | Using one model for different groups | Medical models ignoring racial differences |
| Evaluation bias | Evaluation data not representing actual users | Test sets lacking specific groups |
| Deployment bias | System used differently than designed | Applied to populations outside the training scope |
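Several of these failure modes can be screened for before any training run. A minimal sketch with pandas, assuming a DataFrame with hypothetical group and label columns (illustrative data, not from any real system):

import pandas as pd

# Hypothetical training data; substitute your own dataset.
df = pd.DataFrame({
    "group": ["a"] * 80 + ["b"] * 20,
    "label": [1] * 48 + [0] * 32 + [1] * 4 + [0] * 16,
})

# Representation bias: are the group proportions badly imbalanced?
print(df["group"].value_counts(normalize=True))   # a: 0.8, b: 0.2

# Historical bias: do positive-label base rates differ by group?
print(df.groupby("group")["label"].mean())        # a: 0.6, b: 0.2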

1.2 Notable Cases

| Case | Problem | Cause |
| --- | --- | --- |
| Amazon hiring AI | Discriminated against female applicants | Training data reflected historical preferences |
| COMPAS recidivism | Unfair to Black defendants | Proxy variables and historical bias |
| Facial recognition | Lower accuracy for darker skin tones | Imbalanced training data |
| GPT language models | Gender/racial stereotypes | Bias in internet text |

2. Fairness Metrics

2.1 Group Fairness

Let \(A\) be the protected attribute (e.g., gender, race), \(\hat{Y}\) the model prediction, and \(Y\) the true label.

Demographic Parity:

\[ P(\hat{Y} = 1 | A = 0) = P(\hat{Y} = 1 | A = 1) \]

Positive prediction rates are equal across groups.

Equalized Odds:

\[ P(\hat{Y} = 1 | Y = y, A = 0) = P(\hat{Y} = 1 | Y = y, A = 1), \quad \forall y \in \{0, 1\} \]

Prediction rates are equal across groups for each true label.

Equal Opportunity:

\[ P(\hat{Y} = 1 | Y = 1, A = 0) = P(\hat{Y} = 1 | Y = 1, A = 1) \]

Only requires the true positive rate (TPR) to be equal across groups.

Predictive Parity:

\[ P(Y = 1 | \hat{Y} = 1, A = 0) = P(Y = 1 | \hat{Y} = 1, A = 1) \]

Positive predictive value (Precision) is equal across groups.
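All four definitions compare conditional rates across groups, so they can be computed directly from predictions. A minimal NumPy sketch, assuming binary arrays y_true, y_pred, and protected attribute a (illustrative names, not a library API):

import numpy as np

def rate(y_pred, mask):
    # P(Y_hat = 1) among the rows selected by mask
    return y_pred[mask].mean()

def group_metrics(y_true, y_pred, a):
    out = {}
    for g in (0, 1):
        m = a == g
        out[g] = {
            "selection_rate": rate(y_pred, m),            # demographic parity
            "tpr": rate(y_pred, m & (y_true == 1)),       # equal opportunity
            "fpr": rate(y_pred, m & (y_true == 0)),       # + FPR: equalized odds
            "ppv": y_true[m & (y_pred == 1)].mean(),      # predictive parity
        }
    return out

y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 0])
a      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(group_metrics(y_true, y_pred, a))

Each fairness definition corresponds to one column being equal across the two rows of the output.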

2.2 Impossibility Theorems

Chouldechova (2017) and Kleinberg et al. (2016) proved:

When base rates differ across groups, no classifier short of a perfect one can satisfy equalized odds and predictive parity at the same time.

This means the choice of fairness definition is itself a value judgment.
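To see why, note that precision can be written in terms of the error rates and the group base rate \(p = P(Y = 1)\):

\[ \mathrm{PPV} = \frac{\mathrm{TPR} \cdot p}{\mathrm{TPR} \cdot p + \mathrm{FPR} \cdot (1 - p)} \]

If two groups share the same TPR and FPR (equalized odds) but have different base rates \(p\), their PPVs necessarily differ, so predictive parity fails. A numeric sketch (the rates are illustrative):

# Equalized odds holds: both groups share TPR = 0.8, FPR = 0.1.
tpr, fpr = 0.8, 0.1

def ppv(base_rate):
    # PPV = TPR*p / (TPR*p + FPR*(1 - p))
    return tpr * base_rate / (tpr * base_rate + fpr * (1 - base_rate))

# Base rates differ across groups, so predictive parity fails.
print(ppv(0.3))  # group 0: ~0.774
print(ppv(0.6))  # group 1: ~0.923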

2.3 Why an engineering snippet belongs here

Governance is not just a list of principles. The moment a team claims that it has "evaluated fairness" or "trained under a fairness constraint," those claims have to map to an auditable workflow. The fairlearn snippet below is not here as a library tutorial. It is a minimal example of how governance requirements become engineering actions:

  • MetricFrame: computes metrics by sensitive group instead of relying on one overall average
  • selection_rate: checks whether positive outcomes are distributed evenly across groups
  • false_positive_rate: checks whether one group is disproportionately harmed by false alarms
  • accuracy: reminds us that fairness trade-offs are evaluated alongside task performance
  • ExponentiatedGradient + DemographicParity: shows what it means to encode a fairness constraint into training or post-processing rather than adding an after-the-fact explanation

Without this bridge, governance stays at the slogan level instead of becoming part of model evaluation and release discipline.

2.4 Minimal fairness workflow example

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate, false_positive_rate

# Assumes X_train, y_train, X_test, y_test, and per-row sensitive
# attributes (A_train for training, sensitive_features for the test set)
# are already defined.
estimator = LogisticRegression(max_iter=1000)
estimator.fit(X_train, y_train)
y_pred = estimator.predict(X_test)

# Compute each metric per sensitive group instead of one overall average
metric_frame = MetricFrame(
    metrics={
        "selection_rate": selection_rate,
        "false_positive_rate": false_positive_rate,
        "accuracy": accuracy_score,
    },
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=sensitive_features,
)

print(metric_frame.by_group)

# Fairness-constrained training: the reduction retrains the estimator
# under a demographic parity constraint
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

constraint = DemographicParity()
mitigator = ExponentiatedGradient(estimator, constraints=constraint)
mitigator.fit(X_train, y_train, sensitive_features=A_train)
y_pred_mitigated = mitigator.predict(X_test)

3. Regulatory Frameworks

3.1 EU AI Act

The EU AI Act takes a tiered approach, classifying AI systems by risk level:

| Risk Level | Requirements | Examples |
| --- | --- | --- |
| Unacceptable risk | Prohibited | Social scoring systems, real-time public facial recognition |
| High risk | Strict regulation | Medical AI, hiring AI, credit scoring, judicial applications |
| Limited risk | Transparency requirements | Chatbots must disclose that they are AI |
| Minimal risk | No special requirements | Spam filters, game AI |

High-risk AI system requirements (see the release-gate sketch after this list):

  • Risk management system
  • Data governance and data quality
  • Technical documentation
  • Record keeping and traceability
  • Transparency and user information
  • Human oversight
  • Accuracy, robustness, and cybersecurity
  • Conformity assessment
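One way to make such requirements operational is to encode them as automated pre-release gates. A hypothetical sketch; the check names, artifact keys, and the 0.90 threshold are illustrative, not drawn from the Act:

# Hypothetical pre-release gate for a high-risk system.
RELEASE_CHECKS = {
    "technical_documentation": lambda a: "model_card" in a,
    "record_keeping": lambda a: "training_logs" in a,
    "accuracy_floor": lambda a: a.get("test_accuracy", 0.0) >= 0.90,
}

def release_gate(artifacts: dict) -> list:
    # Return the names of failed checks; an empty list means the gate passes.
    return [name for name, check in RELEASE_CHECKS.items() if not check(artifacts)]

print(release_gate({"model_card": "...", "test_accuracy": 0.87}))
# -> ['record_keeping', 'accuracy_floor']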

3.2 Chinese AI Regulations

| Regulation | Year | Focus |
| --- | --- | --- |
| Deep Synthesis Management Provisions | 2023 | Deepfake labeling, content review |
| Interim Measures for Generative AI | 2023 | Training data quality, content safety |
| Algorithm Recommendation Management Provisions | 2022 | Recommendation transparency, user rights |
| Personal Information Protection Law (PIPL) | 2021 | Data protection, similar to GDPR |

3.3 Other Regions

| Region | Approach | Characteristics |
| --- | --- | --- |
| United States | Industry self-regulation + executive orders | 2023 AI Executive Order; state-level legislation |
| United Kingdom | Principles-based | Building on existing regulatory frameworks |
| Japan | Innovation-promoting | Lighter regulation |

4. Responsible AI Principles

4.1 Major Frameworks

Microsoft Responsible AI Principles:

  1. Fairness
  2. Reliability & Safety
  3. Privacy & Security
  4. Inclusiveness
  5. Transparency
  6. Accountability

Google AI Principles:

  1. Be socially beneficial
  2. Avoid creating or reinforcing unfair bias
  3. Be built and tested for safety
  4. Be accountable to people
  5. Incorporate privacy design principles
  6. Uphold high standards of scientific excellence
  7. Be made available for uses that accord with these principles

Anthropic Core Safety Commitments:

  • Do not pursue advanced AI capabilities at the expense of safety
  • Invest substantial resources in safety research
  • Collaborate with policymakers
  • Transparently share safety research

4.2 Practical Recommendations

| Phase | Practice |
| --- | --- |
| Design | Define use cases and limitations; involve stakeholders |
| Data | Audit training data for bias; data documentation (Datasheets) |
| Development | Fairness-constrained training; multi-dimensional evaluation |
| Testing | Red teaming; group-level evaluation; adversarial testing |
| Deployment | Human oversight; monitor fairness drift (see the sketch after this table); feedback mechanisms |
| Documentation | Model Cards; data statements; impact assessments |
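The "monitor fairness drift" practice in the Deployment row can be as simple as tracking a disparity metric per prediction batch and alerting past a threshold. A minimal sketch; the batch format and the 0.1 threshold are illustrative:

import numpy as np

def parity_gap(y_pred, a):
    # Absolute gap in selection rates between the two groups
    return abs(y_pred[a == 0].mean() - y_pred[a == 1].mean())

def check_drift(batches, threshold=0.1):
    for i, (y_pred, a) in enumerate(batches):
        gap = parity_gap(np.asarray(y_pred), np.asarray(a))
        if gap > threshold:
            print(f"batch {i}: parity gap {gap:.2f} exceeds {threshold}")

# Toy batches: the second one has drifted.
check_drift([
    ([1, 0, 1, 0], [0, 0, 1, 1]),
    ([1, 1, 1, 0], [0, 0, 1, 1]),
])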

5. AI Governance Tools

| Tool | Purpose |
| --- | --- |
| Model Cards | Model documentation standard (Google; see the example after this table) |
| Datasheets for Datasets | Dataset documentation standard |
| AI Impact Assessment | Impact assessment framework |
| Fairlearn | Fairness evaluation and mitigation (Microsoft) |
| AI Verify | AI governance testing framework (Singapore) |
| NIST AI RMF | AI Risk Management Framework (US) |
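As a concrete instance of the first row, a Model Card is just structured documentation. A minimal sketch following the section headings of the Model Cards proposal (Mitchell et al., 2019); all field values are placeholders:

# Minimal model card as structured data; values are placeholders.
model_card = {
    "model_details": {"name": "credit-scorer-v2", "owners": ["risk-ml-team"]},
    "intended_use": "Pre-screening of credit applications; not for final decisions.",
    "factors": ["gender", "age_band", "region"],   # groups evaluated separately
    "metrics": ["accuracy", "selection_rate", "false_positive_rate"],
    "evaluation_data": "Held-out applications, stratified by the factor groups.",
    "ethical_considerations": "Base rates differ by region; see disparity analysis.",
    "caveats": "Not validated for populations outside the training scope.",
}

for section, content in model_card.items():
    print(f"{section}: {content}")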

6. Open Challenges

| Challenge | Description |
| --- | --- |
| Conflicting fairness definitions | Different fairness metrics cannot be simultaneously satisfied |
| Cross-cultural differences | Different cultures have different understandings of fairness and privacy |
| Regulation-innovation balance | Too strict hampers development; too lax causes harm |
| Generative AI | Deepfakes, misinformation, copyright issues |
| Global coordination | Inconsistent regulations across countries/regions |
| Accountability chain | Who is responsible for AI errors? Developers? Deployers? |


References

  • Chouldechova, A. (2017). "Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments"
  • Kleinberg, J., Mullainathan, S., Raghavan, M. (2016). "Inherent Trade-Offs in the Fair Determination of Risk Scores"
  • EU AI Act Full Text
  • Barocas, S., Hardt, M., Narayanan, A. "Fairness and Machine Learning"
  • O'Neil, C. "Weapons of Math Destruction"
  • Google Responsible AI Practices
  • Microsoft Responsible AI Standard
