conference paper

The Importance of Modeling Data Missingness in Algorithmic Fairness: A Causal Perspective

Goel, Naman • Amayuelas, Alfonso • Deshpande, Amit • Sharma, Amit
January 1, 2021
35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence

Training datasets for machine learning often have some form of missingness. For example, to learn a model for deciding whom to give a loan, the available training data includes individuals who were given a loan in the past, but not those who were denied one. If ignored, this missingness nullifies any fairness guarantee of the training procedure when the model is deployed. Using causal graphs, we characterize the missingness mechanisms in different real-world scenarios. We show conditions under which various distributions used in popular fairness algorithms can or cannot be recovered from the training data. Our theoretical results imply that many of these algorithms cannot guarantee fairness in practice. Modeling missingness also helps to identify correct design principles for fair algorithms. For example, in multi-stage settings where decisions are made in multiple screening rounds, we use our framework to derive the minimal distributions required to design a fair algorithm. Our proposed algorithm decentralizes the decision-making process and still achieves performance similar to that of the optimal algorithm, which requires centralization and non-recoverable distributions.
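The loan example in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's method or code. It is a minimal illustration, under made-up assumptions, of how a group statistic measured only on past approvals can differ from the same statistic in the full population. The function names (`simulate`, `repay_rate`), the thresholds, and the data-generating process are all hypothetical.

```python
import math
import random

def simulate(n=50_000, seed=0):
    """Generate a synthetic loan population (all numbers are illustrative)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.randint(0, 1)                 # sensitive attribute (group 0 or 1)
        x = rng.gauss(0.0, 1.0)               # creditworthiness score, same for both groups
        p_repay = 1.0 / (1.0 + math.exp(-x))  # repayment probability depends on x only
        y = 1 if rng.random() < p_repay else 0
        # Assumed past policy: a stricter approval threshold for group 1.
        d = 1 if x > (0.0 if a == 0 else 1.0) else 0
        rows.append((a, x, y, d))
    return rows

def repay_rate(rows, group):
    """Empirical repayment rate P(Y=1 | A=group) over the given rows."""
    ys = [y for a, _, y, _ in rows if a == group]
    return sum(ys) / len(ys)

rows = simulate()
# Outcomes are only recorded for individuals who were approved in the past.
observed = [r for r in rows if r[3] == 1]

gap_population = repay_rate(rows, 0) - repay_rate(rows, 1)
gap_observed = repay_rate(observed, 0) - repay_rate(observed, 1)

print(f"repayment-rate gap, full population: {gap_population:+.3f}")
print(f"repayment-rate gap, observed data:   {gap_observed:+.3f}")
```

In this synthetic setup the two groups repay at nearly the same rate in the full population, while the approved-only sample shows a clear gap, because the assumed past policy filtered the groups differently. A fairness constraint fitted to the observed sample would therefore target the wrong distribution.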

Type: conference paper
DOI: 10.1609/aaai.v35i9.16926
Web of Science ID: WOS:000680423507077
Author(s): Goel, Naman; Amayuelas, Alfonso; Deshpande, Amit; Sharma, Amit
Date Issued: 2021-01-01
Publisher: ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE
Publisher place: Palo Alto
Published in: Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence
ISBN of the book: 978-1-57735-866-4
Series title/Series vol.: AAAI Conference on Artificial Intelligence; 35
Start page: 7564
End page: 7573
Subjects: Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications; Education, Scientific Disciplines; Computer Science; Education & Educational Research
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LIA
Event name: 35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence
Event place: ELECTR NETWORK
Event date: Feb 02-09, 2021
Available on Infoscience: September 11, 2021
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/181291