NEUTRALISING BIAS ON WORD EMBEDDINGS
–Wilder Rodrigues
Wilder Rodrigues
• Machine Learning Engineer at Quby;
• Coursera Mentor;
• City.AI Ambassador;
• School of AI Dean [Utrecht];
• IBM Watson AI XPRIZE contestant;
• Kaggler;
• Public speaker;
• Family man and father of 3.
@wilderrodrigues
https://medium.com/@wilder.rodrigues
How do you see racism?
• Before you proceed, please watch this video: https://www.youtube.com/watch?v=5F_atkP3pqs
• The audio is in Portuguese, but in the next slide you will find translations of what people said in the interviews.
Source: Canal de TV da FAP (Astrojildo Pereira Foundation)
Translations
• Group I
• He is late;
• She is a fashion designer;
• Holds an executive position in either the HR or Finance area;
• Taking care of his garden. Doesn’t look like a gardener;
• She is cleaning her own house; the countertop;
• Graffiti artist; it’s an art, it’s not vandalism.
• Group II
• Vandalising the wall; she is a tagger;
• She is a housekeeper; cleaning the house;
• He is a gardener;
• He looks like a security guard or a chauffeur;
• Seamstress; saleswoman;
• He is running away; he is a thief.
Unconscious bias
• Blue is for boys, pink for girls.
• Boys are better at maths and science.
• Tall people make better leaders.
• New mothers are more absent from work than new fathers.
• People with tattoos are rebellious.
• Younger people are better with technology than older people.
“AI is just an extension of our existing culture.”
–Joanna Bryson, University of Bath and Princeton University
Racialized code & Unregulated algorithms
Source: https://www.theguardian.com/technology/2017/dec/04/racist-facial-recognition-white-coders-black-people-police
Joy Buolamwini, Code4Rights and MIT Media Lab Researcher.
How white engineers built racist code – and why it's dangerous for black people
Source: https://www.theguardian.com/technology/2017/dec/04/racist-facial-recognition-white-coders-black-people-police
Implicit Association Test
Both black and white Americans, for example, are faster at associating names like “Brad” and “Courtney” with words like “happy” and “sunrise,” and names like “Leroy” and “Latisha” with words like “hatred” and “vomit” than vice versa.
Source: http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender
W.E.A.T. (Word-Embedding Association Test)
Embeddings for names like “Brett” and “Allison” were more similar to those for positive words such as “love” and “laughter,” and embeddings for names like “Alonzo” and “Shaniqua” were more similar to negative words like “cancer” and “failure.”
Source: http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender
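For reference (not on the slide), the association measure behind the test, paraphrased from Caliskan et al. (2017): for a target word w and attribute word sets A and B,

$$s(w, A, B) = \operatorname*{mean}_{a \in A} \cos(\vec{w}, \vec{a}) - \operatorname*{mean}_{b \in B} \cos(\vec{w}, \vec{b})$$

and the full test statistic compares two sets of target words X and Y:

$$s(X, Y, A, B) = \sum_{x \in X} s(x, A, B) - \sum_{y \in Y} s(y, A, B)$$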
W.E.F.A.T. (Word-Embedding Factual Association Test)
The test measured how closely related the embeddings for words like “hygienist” and “librarian” were to those for words like “female” and “woman,” then compared this computer-generated gender-association measure to the actual percentage of women in each occupation.
Source: http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender
Word Embeddings
$$\cos(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}$$
Source: https://medium.com/cityai/deep-learning-for-natural-language-processing-part-i-8369895ffb98
Father (L2 norm): 5.31
Mother (L2 norm): 5.63
d (dot product): 26.67
p (product of norms): 29.89
Similarity: d / p = 0.89

Car (L2 norm): 5.73
Bird (L2 norm): 4.83
d (dot product): 5.96
p (product of norms): 27.67
Similarity: d / p = 0.21
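A minimal Python sketch of the computation above, assuming 50-dimensional GloVe vectors in the standard text format (the file name is illustrative; the notebook in the references also uses GloVe):

```python
import numpy as np

def load_glove(path):
    """Load GloVe vectors (one 'word v1 v2 ...' entry per line) into a dict."""
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float64)
    return embeddings

def cosine_similarity(a, b):
    """Dot product (d on the slide) over the product of L2 norms (p)."""
    d = np.dot(a, b)
    p = np.linalg.norm(a) * np.linalg.norm(b)
    return d / p

# Illustrative usage:
# embeddings = load_glove('glove.6B.50d.txt')
# cosine_similarity(embeddings['father'], embeddings['mother'])  # ~0.89
# cosine_similarity(embeddings['car'], embeddings['bird'])       # ~0.21
```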
Identifying gender
[woman] − [man] ≈ [female]: the difference vector isolates a gender direction, g
What about other words?
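Continuing the sketch above, the slide's difference vector becomes a reusable gender direction g (the choice of a single pair is a simplification; Bolukbasi et al. average several such pairs):

```python
# Difference of a gendered pair approximates the gender axis of the space.
g = embeddings['woman'] - embeddings['man']

# Gender-specific words project strongly onto g; supposedly neutral words
# such as 'receptionist' should not, yet they do -- that is the bias:
# cosine_similarity(embeddings['mother'], g)        # clearly positive
# cosine_similarity(embeddings['receptionist'], g)  # ~0.33 (next slides)
```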
Neutralising bias from non-gender-specific words
$$e_{\text{bias\_comp}} = \frac{e \cdot g}{\|g\|_2^2} \, g$$

$$e_{\text{debiased}} = e - e_{\text{bias\_comp}}$$
Source: Bolukbasi et al., 2016, https://arxiv.org/pdf/1607.06520.pdf
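A sketch of the neutralising step, a direct transcription of the projection formula above (the function name is mine, not from the notebook):

```python
import numpy as np

def neutralize(e, g):
    """Remove the component of embedding e that lies along the bias axis g.

    e_bias_comp = ((e . g) / ||g||_2^2) * g   # projection of e onto g
    e_debiased  = e - e_bias_comp
    """
    e_bias_comp = (np.dot(e, g) / np.dot(g, g)) * g
    return e - e_bias_comp
```

Applied to “receptionist”, this is what produces the before/after numbers on the next slide.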
Does it work?
• Cosine similarity between receptionist and gender, before neutralising:
• 0.3307794175059373
• Cosine similarity between receptionist and gender, after neutralising:
• 5.2021694209043796e-17 (numerically zero)
Equalising gender-specific words
Tricky parts!
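The tricky part is that pairs like (actor, actress) should stay gender-specific but become exactly symmetric about g. A sketch following the equalisation equations in Bolukbasi et al. (2016), with variable names of my own; the debiased pair ends up on the unit sphere with equal and opposite projections onto g, which is why the next slide reports ±0.8797:

```python
import numpy as np

def equalize(e_w1, e_w2, g):
    """Make a gendered word pair symmetric about the bias direction g."""
    mu = (e_w1 + e_w2) / 2                      # midpoint of the pair
    mu_b = (np.dot(mu, g) / np.dot(g, g)) * g   # midpoint component along g
    mu_orth = mu - mu_b                         # shared gender-neutral part

    # Each word's component along g, re-centred on mu_b and rescaled so both
    # equalised vectors have unit length.
    e1_b = (np.dot(e_w1, g) / np.dot(g, g)) * g
    e2_b = (np.dot(e_w2, g) / np.dot(g, g)) * g
    scale = np.sqrt(np.abs(1 - np.dot(mu_orth, mu_orth)))
    e1_b = scale * (e1_b - mu_b) / np.linalg.norm(e1_b - mu_b)
    e2_b = scale * (e2_b - mu_b) / np.linalg.norm(e2_b - mu_b)

    return mu_orth + e1_b, mu_orth + e2_b
```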
Equalising gender-specific words
• Cosine similarity between actor and gender, before equalising:
• -0.08387555382505694
• Cosine similarity between actress and gender, before equalising:
• 0.33422494897899785
• Cosine similarity between actor and gender, after equalising:
• -0.8796563888581831
• Cosine similarity between actress and gender, after equalising:
• 0.879656388858183
How far is actor from babysitter?
• Cosine similarity between actor and babysitter, before neutralising:
• 0.2766562472128601
• Cosine similarity between actress and babysitter, before neutralising:
• 0.3378475317457311
• Cosine similarity between actor and babysitter, after neutralising:
• 0.1408988327631711
• Cosine similarity between actress and babysitter, after neutralising:
• 0.14089883276317122
References
• https://www.youtube.com/watch?v=5F_atkP3pqs
• https://www.theguardian.com/technology/2017/dec/04/racist-facial-recognition-white-coders-black-people-police
• http://www.sciencemag.org/news/2017/04/even-artificial-intelligence-can-acquire-biases-against-race-and-gender
• https://medium.com/cityai/deep-learning-for-natural-language-processing-part-i-8369895ffb98
• Bolukbasi et al., 2016, Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, https://arxiv.org/pdf/1607.06520.pdf
• Jeffrey Pennington, Richard Socher, and Christopher D. Manning, GloVe: Global Vectors for Word Representation, https://nlp.stanford.edu/projects/glove/
• https://github.com/ekholabs/DLinK/blob/master/notebooks/nlp/neutralising-equalising-word-embeddings.ipynb