Future directions of
Fairness-aware Data Mining
Recommendation, Causality, and Theoretical Aspects
Toshihiro Kamishima*1 and Kazuto Fukuchi*2
joint work with Shotaro Akaho*1, Hideki Asoh*1, and Jun Sakuma*2,3
*1National Institute of Advanced Industrial Science and Technology (AIST), Japan
*2University of Tsukuba, and *3JST CREST
Workshop on Fairness, Accountability, and Transparency in Machine Learning
In conjunction with the ICML 2015 @ Lille, France, Jul. 11, 2015
1
Outline
New Applications of Fairness-Aware Data Mining
Applications of FADM techniques, other than anti-discrimination,
especially in a recommendation context
New Directions of Fairness
Relations of existing formal fairness with causal inference and
information theory
Introducing an idea of a fair division problem and avoiding unfair
treatments
Generalization Bound in terms of Fairness
Theoretical aspects of fairness not on training data, but on test data
✤ We use the term “fairness-aware” instead of “discrimination-aware,” because the word “discrimination”
means classification in an ML context, and this technique is applicable to tasks other than avoiding
discriminatory decisions
2
Food for discussion about new directions of fairness in DM / ML
PART Ⅰ
Applications of
Fairness-Aware Data Mining
3
Fairness-Aware Data Mining
4
Fairness-aware Data mining (FADM)
data analysis taking into account potential issues of fairness
Unfairness Prevention
S: sensitive feature, representing information that should not influence outcomes
Other factors: Y: target variable, X: non-sensitive feature
Learning a statistical model from potentially unfair data sets so
that the sensitive feature does not influence the model’s outcomes
Two major tasks of FADM
Unfairness Detection: Finding unfair treatments in database
Unfairness Prevention: Building a model to provide fair outcomes
[Romei+ 2014]
Anti-Discrimination
5
[Sweeney 13]
obtaining socially and legally anti-discriminative outcomes
[Example search ads: “Arrested?” shown for African-descent names vs. a neutral “Located:” ad for European-descent names]
Advertisements indicating arrest records were more frequently
displayed for names that are more popular among individuals of
African descent than among those of European descent
sensitive feature = users’ socially sensitive demographic information
anti-discriminative outcomes
Unbiased Information
6
[Pariser 2011, TED Talk by Eli Pariser, http://www.filterbubble.com, Kamishima+ 13]
To fit Pariser’s preferences, conservative people were eliminated from
his friend recommendation list on a social networking service
avoiding biased information that doesn’t meet a user’s intention
Filter Bubble: a concern that personalization technologies narrow
and bias the topics of information provided to people
sensitive feature = political conviction of a friend candidate
unbiased information in terms of candidates’ political conviction
In recommendations at the retail store, the items sold by the site
owner are consistently ranked higher than those sold by tenants
Tenants will complain about this unfair treatment
Fair Trading
7
equal treatment of content providers
Online retail store
The site owner directly sells items
The site is rented to tenants, and the tenants also sell items
[Kamishima+ 12, Kamishima+ 13]
sensitive feature = a content provider of a candidate item
site owner and its tenants are equally treated in recommendation
Ignoring Uninteresting Information
8
[Gondek+ 04]
ignore information unwanted by a user
A simple clustering method finds two
clusters: one contains only faces, and the
other contains faces with shoulders
A data analyst considers this clustering
useless and uninteresting
By ignoring this uninteresting information,
more meaningful female- and male-like
clusters could be obtained
non-redundant clustering: find clusters that are as independent
of a given uninteresting partition as possible
clustering facial images
sensitive feature = uninteresting information
ignore the influence of uninteresting information
Part Ⅰ: Summary
9
A brief introduction to FADM and the unfairness prevention task
Learning a statistical model from potentially unfair data sets so that
the sensitive feature does not influence the model’s outcomes
FADM techniques are widely applicable
There are many FADM applications other than anti-discrimination,
such as providing unbiased information, fair trading, and ignoring
uninteresting information
PART Ⅱ
New Directions of Fairness
10
Discussion about formal definitions and treatments of fairness
in data mining and machine learning contexts
Related Topics of a Current Formal Fairness
connection between formal fairness and causal inference
interpretation in view of information theory
New Definitions of Formal Fairness
Why statistical independence can be used as fairness
Introducing an idea of a fair division problem
New Treatments of Formal Fairness
methods for avoiding unfair treatments instead of enhancing
fairness
PART Ⅱ: Outline
11
Causality
12
[Žliobaitė+ 11, Calders+ 13]
sensitive feature: S
gender
male / female
target variable: Y
acceptance
accept / not accept
Fair determination: the gender does not influence the acceptance
statistical independence: Y ⊥⊥ S
An example of university admission in [Žliobaitė+ 11]
Unfairness Prevention task
optimization of accuracy under causality constraints
Information Theoretic Interpretation
13
statistical independence between S and Y implies
zero mutual information: I(S; Y) = 0
the degree of influence of S on Y can be measured by I(S; Y)
[Venn diagram: H(S) and H(Y) overlap in I(S; Y); the remaining regions are H(S | Y) and H(Y | S)]
Information theoretical view of a fairness condition
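As a concrete illustration (an addition, not from the original slides), a minimal Python sketch of estimating I(S; Y) empirically for discrete S and Y; the function name mutual_information and the toy arrays are hypothetical.

import numpy as np

def mutual_information(s, y):
    # Empirical mutual information I(S; Y) in nats for two discrete arrays
    s, y = np.asarray(s), np.asarray(y)
    mi = 0.0
    for sv in np.unique(s):
        for yv in np.unique(y):
            p_sy = np.mean((s == sv) & (y == yv))
            p_s, p_y = np.mean(s == sv), np.mean(y == yv)
            if p_sy > 0:
                mi += p_sy * np.log(p_sy / (p_s * p_y))
    return mi

# Toy data: S = gender (0/1), Y = predicted acceptance (0/1)
s = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 0, 1, 0, 0, 1, 0, 1]
print(mutual_information(s, y))  # 0.0 here: Y appears independent of S

A value of zero corresponds to the independence condition above; larger values indicate a stronger influence of S on the outcomes.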
Causality with Explainable Features
14
[Žliobaitė+ 11, Calders+ 13]
sensitive feature: S
gender
male / female
target variable: Y
acceptance
accept / not accept
Removing the pure influence of S on Y, excluding the effect of E
conditional statistical independence: Y ⊥⊥ S | E
explainable feature: E (confounding feature)
program: medicine / computer
medicine → acceptance = low; computer → acceptance = high
female → medicine = high; male → computer = high
An example of fair determination even if S and Y are not independent
Information Theoretic Interpretation
15
the degree of conditional independence between Y and S given E:
conditional mutual information: I(S; Y | E)
We can exploit the additional information I(S; Y; E) to obtain outcomes
[Venn diagram over H(S), H(Y), and H(E), with regions I(S; Y | E), I(S; E | Y), I(Y; E | S), I(S; Y; E), H(S | Y, E), H(Y | S, E), and H(E | S, Y)]
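As a companion sketch (again an addition, with hypothetical data mirroring the program/gender/acceptance example), the conditional mutual information I(S; Y | E) can be estimated in the same spirit:

import numpy as np

def conditional_mutual_information(s, y, e):
    # Empirical conditional mutual information I(S; Y | E) in nats
    s, y, e = map(np.asarray, (s, y, e))
    cmi = 0.0
    for ev in np.unique(e):
        mask = (e == ev)
        p_e = np.mean(mask)
        s_e, y_e = s[mask], y[mask]
        for sv in np.unique(s_e):
            for yv in np.unique(y_e):
                p_sy = np.mean((s_e == sv) & (y_e == yv))
                p_s, p_y = np.mean(s_e == sv), np.mean(y_e == yv)
                if p_sy > 0:
                    cmi += p_e * p_sy * np.log(p_sy / (p_s * p_y))
    return cmi

# Toy admission data: E = program (0 medicine, 1 computer),
# S = gender (0 female, 1 male), Y = acceptance;
# women mostly apply to medicine, men to computer, and acceptance
# depends only on the program
e = [0, 0, 0, 1, 0, 1, 1, 1]
s = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
print(conditional_mutual_information(s, y, e))  # 0.0: Y ⊥⊥ S | E, although Y and S are dependent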
Why are outcomes assumed
to be fair?
16
Why are outcomes assumed to be fair
if a sensitive feature does not influence them?
All parties agree to use this criterion,
perhaps because it is objective and reasonable
Is there any other way of reaching an agreement?
To further examine new directions, we introduce a fair division problem
In this view, [Berendt+ 12]’s approach can be
regarded as a way of reaching agreement
in a wisdom-of-crowds manner
[Figure from Berendt+ 12: the size and color of the circles indicate the
sample size and the risk of discrimination, respectively]
Fair Division
17
Alice and Bob want to divide this swiss-roll FAIRLY
The total length of this swiss-roll is 20cm
Alice and Bob each get half based on an agreed common measure:
divide the swiss-roll into two 10cm pieces
This approach is adopted in current FADM techniques
Fair Division
18
Unfortunately, Alice and Bob don’t have a scale
Alice cuts the swiss-roll into what she feels are exact halves
Bob picks the piece he feels is larger
envy-free division: Alice and Bob each get an equal or larger piece
based on their own measure
Fair Division
19
Fairness in a fair division context
There are n parties, and every party i has its own measure mi(Pj) for each piece Pj
Envy-Free Division: every party gets a piece at least as large as any other
party’s piece based on its own measure: mi(Pi) ≥ mi(Pj), ∀i, j
Proportional Division: every party gets a piece of size at least 1/n based on
its own measure: mi(Pi) ≥ 1/n, ∀i (an envy-free division is also proportional)
Exact Division: every party gets an equal-sized piece: mi(Pj) = 1/n, ∀i, j
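A small sketch (an illustration added here, with hypothetical numbers) that checks these three conditions when each party’s measure mi(Pj) is given as a matrix:

EPS = 1e-9

def is_exact(m):
    # Exact division: m_i(P_j) = 1/n for all parties i and pieces j
    n = len(m)
    return all(abs(v - 1.0 / n) < EPS for row in m for v in row)

def is_proportional(m):
    # Proportional division: m_i(P_i) >= 1/n for every party i
    n = len(m)
    return all(m[i][i] >= 1.0 / n - EPS for i in range(n))

def is_envy_free(m):
    # Envy-free division: m_i(P_i) >= m_i(P_j) for all i, j
    n = len(m)
    return all(m[i][i] >= m[i][j] - EPS for i in range(n) for j in range(n))

# m[i][j] = m_i(P_j): party i's subjective value of piece P_j (each row sums to 1)
m = [[0.55, 0.45],   # Alice feels her piece is slightly larger
     [0.40, 0.60]]   # Bob also prefers the piece he picked
print(is_envy_free(m), is_proportional(m), is_exact(m))  # True True False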
Envy-Free in a FADM Context
20
Current FADM techniques adopt a commonly agreed measure
Can we develop FADM techniques using an envy-free approach?
Such a technique would be applicable without agreement on a fairness criterion
FADM under envy-free fairness
Maximize the utility of analysis, such as prediction accuracy,
under the envy-free fairness constraints
A naïve method for classification (sketched below)
Among n candidates, k are classified as positive
Among all C(n, k) classifications, enumerate those satisfying the envy-free
conditions based on the parties’ own utility measures
ex. fair classifiers with different sets of explainable features
Pick the classification whose accuracy is maximum
Open Problem: Can we develop a more efficient algorithm?
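Below is a brute-force sketch of this naïve method under stated assumptions: each party’s “utility measure” is taken here to be the share of positive labels its members receive, and the ground-truth labels are hypothetical; the slides do not prescribe an implementation.

from itertools import combinations

def naive_envy_free_classification(n, k, utilities, accuracy, parties):
    # Enumerate all C(n, k) ways to label k of n candidates positive, keep the
    # assignments that are envy-free under each party's own measure, and return
    # the one with maximum accuracy
    best, best_acc = None, float("-inf")
    for positives in combinations(range(n), k):
        pos = frozenset(positives)
        # envy-free: party p values its own treatment at least as much as it
        # values any other party's treatment, under p's OWN measure
        envy_free = all(utilities[p](pos, p) >= utilities[p](pos, q)
                        for p in parties for q in parties)
        if envy_free and accuracy(pos) > best_acc:
            best, best_acc = pos, accuracy(pos)
    return best, best_acc

# Hypothetical setting: candidates 0-3 belong to party 0, candidates 4-7 to party 1
members = {0: set(range(0, 4)), 1: set(range(4, 8))}
# Both parties happen to measure their treatment by the share of positives received;
# in general, each party could plug in a different measure
share = lambda pos, q: len(pos & members[q]) / max(len(pos), 1)
utilities = {0: share, 1: share}
truth = {1, 2, 5, 6}                                   # hypothetical ground truth
accuracy = lambda pos: len(pos & truth) / len(truth)
print(naive_envy_free_classification(8, 4, utilities, accuracy, parties=[0, 1]))
# -> (frozenset({1, 2, 5, 6}), 1.0)

The cost grows as C(n, k), which is why a more efficient algorithm remains the open problem.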
Fairness Guardian
21
Current unfairness prevention methods are designed so as
to be fair
Example: Logistic Regression + Prejudice Remover [Kamishima+ 12]
The objective function is composed of
classification loss, regularization, and fairness constraint terms:
−Σ_D ln Pr[Y | X, S; Θ] + (λ/2) ‖Θ‖₂² + η I(Y; S)
Fairness Guardian Approach
Unfairness is prevented by enhancing fairness of outcomes
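For concreteness, a sketch of such a fairness-guardian objective for logistic regression, assuming the fairness term approximates I(Y; S) from the model’s predicted probabilities; this is a paraphrase of the idea above, not the implementation of [Kamishima+ 12].

import numpy as np

def guardian_objective(theta, X, S, y, lam=1.0, eta=1.0):
    # classification loss + L2 regularizer + eta * (approximate) I(Y; S)
    Z = np.column_stack([X, S])            # theta has one weight per column of [X, S]
    p = 1.0 / (1.0 + np.exp(-Z @ theta))   # Pr[Y=1 | X, S; theta]
    p = np.clip(p, 1e-12, 1 - 1e-12)
    nll = -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    reg = 0.5 * lam * (theta @ theta)

    # approximate mutual information I(Y; S) from predicted probabilities
    mi, p_y1 = 0.0, np.mean(p)
    for sv in np.unique(S):
        mask = (S == sv)
        p_s, p_y1_s = np.mean(mask), np.mean(p[mask])
        for p_y, p_y_s in [(p_y1, p_y1_s), (1 - p_y1, 1 - p_y1_s)]:
            if p_y_s > 0:
                mi += p_s * p_y_s * np.log(p_y_s / p_y)
    return nll + reg + eta * mi

The objective could be minimized with a generic optimizer (e.g. scipy.optimize.minimize); a larger eta trades classification accuracy for independence between Y and S, which is exactly the fairness-guardian trade-off.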
Fair Is Not Unfair?
22
A reverse treatment of fairness:
not to be unfair
One possible formulation of an unfair classifier:
outcomes are determined ONLY by the sensitive feature
Pr[Y | S; Λ]
Ex. Your paper is rejected just because you are not handsome
Penalty term to maximize the KL divergence between
a pre-trained unfair classifier and the target classifier:
D_KL[ Pr[Y | S; Λ] ‖ Pr[Y | X, S; Θ] ]
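A minimal sketch of this unfairness-hater penalty, assuming both classifiers output per-example Bernoulli probabilities; the slides only give the KL-divergence form, so the representation here is an assumption.

import numpy as np

def unfairness_hater_penalty(p_unfair, p_target, eps=1e-12):
    # Average KL divergence D_KL[ Pr[Y|S; Lambda] || Pr[Y|X,S; Theta] ] over the data,
    # returned with a minus sign so that minimizing the total objective pushes the
    # target classifier AWAY from the pre-trained unfair classifier
    p_unfair = np.clip(np.asarray(p_unfair, dtype=float), eps, 1 - eps)
    p_target = np.clip(np.asarray(p_target, dtype=float), eps, 1 - eps)
    kl = (p_unfair * np.log(p_unfair / p_target)
          + (1 - p_unfair) * np.log((1 - p_unfair) / (1 - p_target)))
    return -np.mean(kl)

# Usage sketch: total_loss = classification_loss + (lam / 2) * l2
#               + gamma * unfairness_hater_penalty(p_unfair, p_target)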
Unfairness Hater
23
Unfairness Hater Approach
Unfairness is prevented by avoiding unfair outcomes
This approach was almost useless for obtaining fair outcomes, but…
Better Optimization
The fairness-enhanced objective function tends to be non-convex;
thus, adding an unfairness hater may help to avoid poor local minima
Avoiding Unfair Situations
There are unfair situations that should be avoided;
Ex. photos of humans were mistakenly labeled as “gorilla” by an
auto-tagging system [Barr 2015]
There may be many options between being fair and not being unfair
that should be examined
Part Ⅱ: Summary
Relation of fairness with causal inference and information theory
We reviewed a current formal definition of fairness by relating it to
Rubin’s causal inference, and gave an interpretation based on information
theory
New Directions of formal fairness without agreements
We showed the possibility of formal fairness that does not presume a
common criterion agreed between concerned parties
New Directions of treatment of fairness by avoiding unfairness
We discussed FADM techniques that avoid unfairness, instead
of enhancing fairness
24
PART Ⅲ
Generalization Bound
in terms of Fairness
25
Part Ⅲ: Introduction
There are many technical problems to solve in the FADM literature,
because tools for excluding the influence of specific information have not
been actively developed.
Types of Sensitive Features
Non-binary sensitive feature
Analysis Techniques
Analysis methods other than classification or regression
Optimization
Constraint terms make objective functions non-convex
Fairness measure
Interpretable to humans and having convenient properties
Learning Theory
Generalization ability in terms of fairness
26
Kazuto Fukuchi’s Talk
27
Conclusion
Applications of Fairness-Aware Data Mining
Applications other than anti-discrimination: providing unbiased
information, fair trading, and excluding unwanted information
New Directions of Fairness
Relation of fairness with causal inference and information theory
Formal fairness introducing an idea of a fair division problem
Avoiding unfair treatment, instead of enhancing fairness
Generalization bound in terms of fairness
Generalization bound in terms of fairness based on f-divergence
Additional information and code
http://www.kamishima.net/fadm
Acknowledgments: This work is supported by MEXT/JSPS KAKENHI Grant Number 24500194, 24680015,
25540094, 25540094, and 15K00327
28
Bibliography I
A. Barr.
Google mistakenly tags black people as ‘gorillas,’ showing limits of algorithms.
The Wall Street Journal, 2015.
⟨http://on.wsj.com/1CaCNlb⟩.
B. Berendt and S. Preibusch.
Exploring discrimination: A user-centric evaluation of discrimination-aware data mining.
In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining,
pages 344–351, 2012.
T. Calders, A. Karim, F. Kamiran, W. Ali, and X. Zhang.
Controlling attribute effect in linear regression.
In Proc. of the 13th IEEE Int’l Conf. on Data Mining, pages 71–80, 2013.
T. Calders and S. Verwer.
Three naive Bayes approaches for discrimination-free classification.
Data Mining and Knowledge Discovery, 21:277–292, 2010.
C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel.
Fairness through awareness.
In Proc. of the 3rd Innovations in Theoretical Computer Science Conf., pages 214–226,
2012.
1 / 4
Bibliography II
K. Fukuchi and J. Sakuma.
Fairness-aware learning with restriction of universal dependency using f-divergences.
arXiv:1104.3913 [cs.CC], 2015.
D. Gondek and T. Hofmann.
Non-redundant data clustering.
In Proc. of the 4th IEEE Int’l Conf. on Data Mining, pages 75–82, 2004.
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Considerations on fairness-aware data mining.
In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining,
pages 378–385, 2012.
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Enhancement of the neutrality in recommendation.
In Proc. of the 2nd Workshop on Human Decision Making in Recommender Systems,
pages 8–14, 2012.
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Fairness-aware classifier with prejudice remover regularizer.
In Proc. of the ECML PKDD 2012, Part II, pages 35–50, 2012.
[LNCS 7524].
2 / 4
Bibliography III
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Efficiency improvement of neutrality-enhanced recommendation.
In Proc. of the 3rd Workshop on Human Decision Making in Recommender Systems,
pages 1–8, 2013.
E. Pariser.
The filter bubble.
⟨http://www.thefilterbubble.com/⟩.
E. Pariser.
The Filter Bubble: What The Internet Is Hiding From You.
Viking, 2011.
A. Romei and S. Ruggieri.
A multidisciplinary survey on discrimination analysis.
The Knowledge Engineering Review, 29(5):582–638, 2014.
L. Sweeney.
Discrimination in online ad delivery.
Communications of the ACM, 56(5):44–54, 2013.
I. Žliobaitė, F. Kamiran, and T. Calders.
Handling conditional discrimination.
In Proc. of the 11th IEEE Int’l Conf. on Data Mining, 2011.
3 / 4
Bibliography IV
R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork.
Learning fair representations.
In Proc. of the 30th Int’l Conf. on Machine Learning, 2013.
4 / 4