Big data, Big prejudice:
how algorithms can
discriminate?
Sara Hajian
@eurecat.org
Ph.D. in computer science from Universitat Rovira i Virgili (URV)
Data Scientist @Eurecat. Sara’s research interests are data
mining methods and algorithms, social media and social
network analysis, privacy-preserving data mining and
publishing, and algorithmic bias
http://blog.ness-ses.com/big-data-101-big-data-at-rest
Decision making: humans versus algorithms
• People's decisions include
objective and subjective
elements
• Algorithmic inputs include
only objective elements
Unfortunately, the answer is
“yes”
Google image search: gender stereotypes
Google image query: “Doctor” Google image query: “Nurse”
M. Kay, C. Matuszek, S. Munson (2015): Unequal Representation and Gender Stereotypes in Image Search
Results for Occupations. CHI'15.
Google image search: gender stereotypes
• Google image search
for “C.E.O.” produced
11 percent women,
even though 27 percent
of United States chief
executives are women.
M. Kay, C. Matuszek, S. Munson (2015): Unequal Representation and Gender Stereotypes in Image Search
Results for Occupations. CHI'15.
Gender bias in Google’s ad-targeting system
• In automated experiments,
Google’s ad system showed ads for
high-paying jobs to simulated male
users far more often than to
simulated female users.
A. Datta, M. C. Tschantz, and A. Datta (2015). Automated experiments on ad privacy settings.
Proceedings on Privacy Enhancing Technologies, 2015(1):92–112.
Racism
• Auto-tagging system
tagged black people
as “apes” or “animals”
https://twitter.com/jackyalcine/status/615331869266157568
Racism
• The importance of being Latanya
• Names used predominantly by black
men and women are much more
likely to generate ads related to
arrest records than names used
predominantly by white men and
women.
L. Sweeney (2013). Discrimination in online ad delivery. Queue, 11(3). See also N. Newman (2011) in
Huffington Post.
Geography and race: the "Tiger Mom Tax"
• Asians in the US are nearly
twice as likely as non-Asians
to be quoted a higher price for
The Princeton Review's online
SAT tutoring, due to geographical
price discrimination
J. Angwin and J. Larson (2015). The tiger mom tax. ProPublica.
Judiciary use of COMPAS scores
ProPublica, May 2016. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
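One way to examine risk scores like these is to compare error rates across groups, as the ProPublica analysis linked above does for COMPAS. The sketch below is a minimal illustration on made-up toy records, not the ProPublica data or method. The function name `error_rates_by_group` and the group labels "A" and "B" are hypothetical.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Compute false positive and false negative rates per group.

    records: iterable of (group, predicted_high_risk, reoffended) tuples.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, predicted, actual in records:
        if predicted:
            counts[group]["tp" if actual else "fp"] += 1
        else:
            counts[group]["fn" if actual else "tn"] += 1

    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]  # people who did not reoffend
        positives = c["tp"] + c["fn"]  # people who did reoffend
        rates[group] = {
            "fpr": c["fp"] / negatives if negatives else None,
            "fnr": c["fn"] / positives if positives else None,
        }
    return rates


# Made-up toy records (hypothetical, for illustration only).
toy_records = (
    [("A", True, False)] * 2 + [("A", False, False)] * 2
    + [("A", True, True)] * 3 + [("A", False, True)] * 1
    + [("B", True, False)] * 1 + [("B", False, False)] * 3
    + [("B", True, True)] * 2 + [("B", False, True)] * 2
)
rates = error_rates_by_group(toy_records)
print(rates)
```

In this toy data the score is correct for 5 of 8 people in each group, yet group A has twice the false positive rate of group B. Equal overall accuracy does not imply equal treatment.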
Data-driven decision-making process
S. Hajian, F. Bonchi and C. Castillo (2016). Algorithmic Bias: From Discrimination
Discovery to Fairness-aware Data Mining. In KDD, pp. 2125-2126.
Sources of algorithmic bias: Data
• Data as a social mirror
• Sample size disparity
• Cultural differences
• Incomplete, incorrect, or
outdated data
M. Hardt (2014): "How big data is unfair". Medium.
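Sample size disparity can be illustrated with a small deterministic sketch. It assumes a hypothetical one-feature classifier that picks the single threshold with the fewest total errors. The data, the cutoffs, and the helper names (`make_group`, `fit_threshold`) are invented for illustration.

```python
def make_group(cutoff, copies):
    """Return (feature, label) pairs where the label is 1 when feature >= cutoff."""
    return [(x, int(x >= cutoff)) for x in range(10) for _ in range(copies)]


def error_count(data, threshold):
    """Count points misclassified by the rule 'predict 1 when feature >= threshold'."""
    return sum(int(x >= threshold) != y for x, y in data)


def fit_threshold(data):
    """Pick the threshold in 0..10 that minimises the total number of errors."""
    return min(range(11), key=lambda t: error_count(data, t))


# Hypothetical toy data: the feature means something slightly different
# in each group (true cutoff 5 for the majority, 3 for the minority).
majority = make_group(cutoff=5, copies=9)  # 90 points
minority = make_group(cutoff=3, copies=1)  # 10 points

threshold = fit_threshold(majority + minority)
majority_error = error_count(majority, threshold) / len(majority)
minority_error = error_count(minority, threshold) / len(minority)
print(threshold, majority_error, minority_error)
```

Because the majority contributes nine times as many points, the fitted threshold matches the majority's cutoff exactly and the minority group absorbs all of the error.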
Sources of algorithmic bias: Algorithm and model
• Undesired complexity
• Noise and meaning of
5% error
M. Hardt (2014): "How big data is unfair". Medium.
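The point about the meaning of a 5% error can be made with simple arithmetic: the overall error rate is a population-weighted average of per-group error rates, so very different distributions of error produce the same headline number. The following sketch uses hypothetical group shares and error rates chosen only to illustrate this.

```python
def overall_error(groups):
    """groups: list of (population_share, error_rate) pairs; shares sum to 1."""
    return sum(share * error for share, error in groups)


# Hypothetical scenario 1: errors spread evenly over both groups.
even = [(0.95, 0.05), (0.05, 0.05)]

# Hypothetical scenario 2: the model is always right for the majority
# and always wrong for a minority that makes up 5% of the population.
concentrated = [(0.95, 0.0), (0.05, 1.0)]

even_error = overall_error(even)
concentrated_error = overall_error(concentrated)
print(even_error, concentrated_error)
```

Both scenarios report roughly 5% overall error, but in the second one every error falls on the minority group.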
What should we do with this
algorithmic bias?
Algorithmic bias: solutions
• Legal:
• Anti-discrimination regulations
• Give us the rules of the
game: definitions,
objective functions,
constraints
• General Data Protection
Regulation (2018): Right
to explanation
B. Goodman and S. Flaxman (2016): EU regulations on algorithmic decision-making and a "right
to explanation". arXiv preprint arXiv:1606.08813.
Algorithmic bias: solutions
• Technical:
• Tools for discrimination
risk evaluation
• Tools for discrimination
risk mitigation
• Tools for algorithmic
auditing
• Explainable models and
user interfaces
Anti-discrimination by design
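As a rough idea of what a discrimination risk evaluation tool might compute, the sketch below measures two common group-fairness quantities on a set of decisions: the ratio and the difference of selection rates between two groups. The decisions are made-up toy data, and the function names are hypothetical rather than taken from any particular library.

```python
def selection_rates(decisions):
    """decisions: iterable of (group, selected) pairs; returns selection rate per group."""
    totals, selected = {}, {}
    for group, was_selected in decisions:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates, protected, reference):
    """Selection rate of the protected group divided by that of the reference group."""
    return rates[protected] / rates[reference]


def statistical_parity_difference(rates, protected, reference):
    """Selection rate of the reference group minus that of the protected group."""
    return rates[reference] - rates[protected]


# Made-up toy decisions (hypothetical): 6 of 10 men and 3 of 10 women selected.
toy_decisions = (
    [("men", True)] * 6 + [("men", False)] * 4
    + [("women", True)] * 3 + [("women", False)] * 7
)
group_rates = selection_rates(toy_decisions)
ratio = disparate_impact_ratio(group_rates, "women", "men")
difference = statistical_parity_difference(group_rates, "women", "men")
print(group_rates, ratio, difference)
```

A ratio well below 1 signals that the protected group is selected less often. A ratio under 0.8 is sometimes treated as a warning sign, following the "four-fifths" rule of thumb from US employment guidelines, although no single threshold settles whether a system is fair.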
Anti-discrimination by design
S. Hajian, F. Bonchi and C. Castillo (2016). Algorithmic Bias: From Discrimination
Discovery to Fairness-aware Data Mining. In KDD, pp. 2125-2126.
Conclusions
• Bad news: algorithms and big
data do not just mirror
existing bias; they also
reinforce it and
amplify inequality
• Good news: despite its
challenges, algorithmic
discrimination also offers many
opportunities for machine
learning researchers to build tools
that address the problem
http://money.cnn.com/2016/09/06/technology/weapons-of-math-destruction/index.html
Thank you
Sara.hajian@eurecat.org
