A Fairness-aware Machine Learning Interface for End-to-end Discrimination Discovery and Mitigation
Niels Bantilan
New York, NY
https://arxiv.org/abs/1710.06921 (2017)
Seminar: Fortgeschrittene Themen in Data Mining (Advanced Topics in Data Mining)
Student: Waqar Alamgir / TU Braunschweig / 4850580 / wajrcs@gmail.com
09 March 2018
Source: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Problem
Machine learning models optimized only for prediction accuracy reflect and amplify real-world social biases.
Bias is an amoral concept
“The preference for or against something”
Bias in Machine Learning
Biased Decisions → Biased Data → Biased Algorithm → Biased Predictions
Solution #1
Biased Decisions → Biased Data → Preprocessing → Machine Learning Algorithm → Biased Predictions
Solution #2
Biased Decisions → Data → Bias Machine Learning Algorithm → Postprocessing → Biased Predictions
Themis-ml
(thee-mus em-el)
An open source Python library built on top of pandas and scikit-learn that implements fairness-aware machine learning interfaces to measure and reduce social bias in machine learning.
Available on GitHub, pip, and conda:
https://github.com/cosmicBboy/themis-ml
Fairness-aware Machine Learning
Given a set of records {(X, y)} ∈ D, a measure of social bias b with respect to a protected class s, and a measure of performance p, train a machine learning model that makes fair predictions while preserving the accuracy of decisions.
Machine Learning Pipeline Recap
Stages: Preprocessing → Training → Evaluation → Prediction
Artifacts: Raw Data, Model Specifications, Instantiated Models, Deployed Model, Predictions on New Data
Themis-ml Functions
● Preprocessing
● Fairness-aware models
● Postprocessing
● Metrics
Let's see some new conventions
y: Class/target labels, e.g. give credit.
y+: Positive target labels, e.g. credit given.
y-: Negative target labels, e.g. credit not given.
yTrain: Target variable for the training set, e.g. give credit, restricted to the training set.
s: Protected class (a binary variable), e.g. female, age below 25.
Xd: Members of the disadvantaged group, e.g. immigrants.
Xa: Members of the advantaged group, e.g. citizens.
Xd,y-: Negatively labeled members of the disadvantaged group.
Xd,y+: Positively labeled members of the disadvantaged group.
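As a rough illustration of the notation above, the subsets can be written as boolean masks. This is only a sketch: the toy arrays and the encodings (y = 1 for the positive label, s = 1 for the disadvantaged group) are assumptions made for the example, not something the slides specify.

```python
import numpy as np

# Toy data. Assumed encodings: y == 1 is the positive label (credit given),
# s == 1 marks the disadvantaged group.
y = np.array([1, 0, 1, 1, 0, 0])
s = np.array([1, 1, 0, 0, 1, 0])
X = np.arange(12).reshape(6, 2)  # six records, two made-up features

X_d = X[s == 1]                    # Xd: disadvantaged group
X_a = X[s == 0]                    # Xa: advantaged group
X_d_pos = X[(s == 1) & (y == 1)]   # Xd,y+: positively labeled, disadvantaged
X_d_neg = X[(s == 1) & (y == 0)]   # Xd,y-: negatively labeled, disadvantaged
```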
1. Preprocessing / Transformer API / Relabeling
Description
Generates a new yTrain variable by relabelling the target variable.
Parameters
ranker: An instance of a binary classifier, e.g. DecisionTreeClassifier.
Return Values
yTrain: DataFrame of modified target values that replaces the old training targets.
Example
massager = Relabeller(ranker=DecisionTreeClassifier())
newYTrain = massager.fit(x, yTrain, s).transform(x)
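The relabelling idea can be sketched in a few lines of NumPy. This is a hedged illustration of one common scheme (often called massaging), not themis-ml's implementation: the function name `relabel`, the `scores` argument (positive-class scores from any ranker), and the encoding s = 1 for the disadvantaged group are all assumptions made for the example.

```python
import numpy as np

def relabel(y, s, scores):
    """Swap labels so both groups end up with the same positive rate.

    y: binary labels, s: 1 for the disadvantaged group (assumed encoding),
    scores: positive-class scores from some ranker (hypothetical input).
    """
    y = np.asarray(y).copy()
    s = np.asarray(s)
    scores = np.asarray(scores, dtype=float)
    d, a = s == 1, s == 0

    # Number of swaps that closes the gap in positive rates.
    gap = y[a].mean() - y[d].mean()
    n_swaps = int(round(gap * d.sum() * a.sum() / len(y)))

    promote = np.flatnonzero(d & (y == 0))  # disadvantaged, labeled negative
    demote = np.flatnonzero(a & (y == 1))   # advantaged, labeled positive
    n_swaps = max(0, min(n_swaps, len(promote), len(demote)))

    # Promote the highest-scored candidates, demote the lowest-scored ones.
    promote = promote[np.argsort(-scores[promote])][:n_swaps]
    demote = demote[np.argsort(scores[demote])][:n_swaps]
    y[promote] = 1
    y[demote] = 0
    return y

y = np.array([0, 0, 0, 1, 1, 1, 1, 0])
s = np.array([1, 1, 1, 1, 0, 0, 0, 0])
scores = np.array([0.2, 0.6, 0.4, 0.9, 0.8, 0.55, 0.7, 0.3])
new_y = relabel(y, s, scores)
```

Each swap promotes one record and demotes another, so the overall number of positive labels stays the same while the gap between the groups shrinks.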
1. Preprocessing / Transformer API / Relabeling
[Figure: Original Data vs. Relabeled Data; axes: income, is homeowner; legend: Good Credit Risk, Bad Credit Risk, Woman, Man]
2. Fairness-aware model / Training / Estimator API
Prejudice Remover Regularizer
Description
Measures the degree to which the predictions y and the protected class s depend on each other.
Parameters
penalty: A string selecting the regularization applied to the model weights, e.g. L1 or L2 regularization [5].
discrimination_penalty: A string selecting the discrimination penalizer, e.g. the prejudice index (PI).
Return Values
y_pred: DataFrame of predicted class labels, obtained by chaining fit and predict.
Example
model = LogisticRegressionPRR(penalty="L2", discrimination_penalty="PI")
y_pred = model.fit(x_train, y_train, s_train).predict(x_test, s_test)
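To make the regularizer more concrete, here is a minimal sketch of a logistic-regression objective with a prejudice-index term, in the spirit of the prejudice remover of Kamishima et al. It is not themis-ml's code: the function names, the `l2` and `eta` weights, and the toy data are assumptions made for the example, and only the objective value is computed (no optimizer).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def prejudice_index(p, s):
    """Mutual-information-style dependence between predictions p and group s."""
    eps = 1e-12
    p = np.clip(p, eps, 1 - eps)
    p_pos = p.mean()                      # overall predicted positive rate
    total = 0.0
    for group in np.unique(s):
        mask = s == group
        p_pos_s = p[mask].mean()          # predicted positive rate within the group
        total += np.sum(
            p[mask] * np.log(p_pos_s / p_pos)
            + (1 - p[mask]) * np.log((1 - p_pos_s) / (1 - p_pos))
        )
    return total

def objective(theta, X, y, s, l2=1.0, eta=1.0):
    """Log loss + eta * prejudice index + L2 penalty (hypothetical weights)."""
    eps = 1e-12
    p = sigmoid(X @ theta)
    log_loss = -np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return log_loss + eta * prejudice_index(p, s) + 0.5 * l2 * theta @ theta

s = np.array([0, 0, 1, 1])
pi_fair = prejudice_index(np.array([0.5, 0.5, 0.5, 0.5]), s)    # no dependence on s
pi_biased = prejudice_index(np.array([0.9, 0.9, 0.1, 0.1]), s)  # strong dependence on s
```

Raising `eta` makes dependence between predictions and s more expensive, which is where the fairness-utility tradeoff enters the objective.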
2. Fairness-aware model / Training / Estimator API
Prejudice Remover Regularizer
[Figure: Cost plotted against the value of weight Θ; curves: Fairness-unaware Objective, Fairness-aware Model Objective; annotation: Fairness-utility tradeoff]
3. Fairness-aware model / Training / Estimator API
Additive Counterfactually Fair
Description
Computes the residuals between the predicted and the original feature values; these residuals are then used to train the model.
Parameters
target_estimator
continuous_estimator
binary_estimator
binary_residual_type
Return Values
y_pred: DataFrame of predicted class labels, obtained by chaining fit and predict.
Example
model = LinearACFClassifier()
y_pred = model.fit(x_train, y_train, s_train).predict(x_test, s_test)
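The residual step can be sketched with ordinary least squares. This is an illustration under stated assumptions, not themis-ml's implementation: the helper name `residualize`, the use of a plain linear regression of each feature on s, and the synthetic data are all made up for the example.

```python
import numpy as np

def residualize(X, s):
    """Regress each feature on s (plus an intercept) and return X - X_hat."""
    S = np.column_stack([np.ones(len(s)), s])     # design matrix: intercept and s
    coef, *_ = np.linalg.lstsq(S, X, rcond=None)  # one regression per feature column
    return X - S @ coef

rng = np.random.default_rng(0)
s = np.tile([0, 1], 100)                           # 200 records, two groups
X = np.column_stack([
    2.0 * s + rng.normal(size=200),                # feature that depends on s
    rng.normal(size=200),                          # feature that does not
])
X_res = residualize(X, s)
```

After this step the residual features have the same mean in both groups, so a downstream classifier trained on `X_res` cannot pick up the linear effect of s on the features.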
3. Fairness-aware model / Training / Estimator API
Additive Counterfactually Fair
[Diagram labels: protected classes s, features X, predicted features X̂, residual features X - X̂ (E), labels y, ŷ; one model produces the predicted features and a second model is trained on the residual features]
4. Postprocessing / Predictor API
Reject Option Classification
Description
Generates predicted probabilities on the training set and computes the proximity of each prediction to the decision boundary learned by the classifier.
Parameters
estimator
theta
demote
Return Values
y_pred: DataFrame of predicted class labels, obtained by chaining fit and predict.
Example
clf = SingleROClassifier(estimator=DecisionTreeClassifier())
y_pred = clf.fit(x_train, y_train).predict(x_test, s_test)
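A minimal sketch of the reject-option rule follows. It is not themis-ml's implementation: the function name, the interpretation of `theta` as a confidence threshold between 0.5 and 1, the `demote` flag, and the encoding s = 1 for the disadvantaged group are assumptions made for the example.

```python
import numpy as np

def reject_option_predict(proba, s, theta=0.7, demote=True):
    """Relabel predictions that fall close to the decision boundary.

    proba: predicted probability of the positive class,
    s: 1 for the disadvantaged group (assumed encoding),
    theta: confidence below which a prediction counts as uncertain.
    """
    proba = np.asarray(proba, dtype=float)
    s = np.asarray(s)
    y_pred = (proba >= 0.5).astype(int)             # ordinary decision
    uncertain = np.maximum(proba, 1 - proba) < theta
    y_pred[uncertain & (s == 1)] = 1                # promote disadvantaged group
    if demote:
        y_pred[uncertain & (s == 0)] = 0            # demote advantaged group
    return y_pred

proba = np.array([0.95, 0.55, 0.45, 0.10, 0.60, 0.40])
s = np.array([1, 1, 1, 1, 0, 0])
y_pred = reject_option_predict(proba, s)
y_pred_promote_only = reject_option_predict(proba, s, demote=False)
```

Confident predictions (here 0.95 and 0.10) are left untouched; only the records inside the uncertain band change label.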
4. Postprocessing / Predictor API
Reject Option Classification
[Figure: Original Prediction vs. Relabeled Data; axes: income, is homeowner; legend: Good Credit Risk, Bad Credit Risk, Woman, Man]
5. Metrics / Scorer
Mean Difference
Description
Calculates the difference between p(y+ | a) and p(y+ | d), i.e. the positive rates of the advantaged and disadvantaged groups; the result lies between -1 and +1.
Parameters
y
s
d
Return Values
Array of floats: the mean difference between the advantaged group and the disadvantaged group, with its error margin.
Example
md_y_true = mean_difference(y_train, s_train)[0]
md_y_pred = mean_difference(y_pred, s_test)[0]
diff = md_y_pred - md_y_true
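The metric itself is short enough to write out. The sketch below is not the library function: the name `mean_diff`, the encoding s = 1 for the disadvantaged group, and the toy labels are assumptions, and it returns only the point estimate without an error margin.

```python
import numpy as np

def mean_diff(y, s):
    """Positive rate of the advantaged group minus that of the disadvantaged group."""
    y = np.asarray(y, dtype=float)
    s = np.asarray(s)
    return y[s == 0].mean() - y[s == 1].mean()

y = np.array([1, 1, 1, 0, 1, 0, 0, 0])
s = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # assumed: 1 marks the disadvantaged group
md = mean_diff(y, s)                      # 0.75 - 0.25
```

A value of 0 means both groups receive positive labels at the same rate; positive values favor the advantaged group and negative values favor the disadvantaged group.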
Experiment with Themis-ml
Available at
https://github.com/waqar-alamgir/Fairness-aware-Machine-Learning
Case Study: German Credit Data
1000 loan application records
1 binary target variable y
● 700 “good” credit_risk
● 300 “bad” credit_risk
~20 input variables X, including
● housing
● credit_history
● purpose
● foreign_worker
● personal_status_and_sex
● age_in_years
3 binary protected classes s
● is_foreign
● is_female
● age_below_25
German Credit Data Results
Does the baseline make socially biased predictions?
Legend: Baseline (B), Remove Protected Attribute (RPA), Relabel Target Variable (RTV), Counterfactually Fair Model (CFM), Reject-option Classification (ROC)
Related Work
fairml
Author: Julius Adebayo
Version: 0.1.1.5
Development: Active
1. Measures fairness at the data level.
2. Good visualisation of features to validate discrimination.
[Figure: Attribute variable significance (from fairml)]
Fair-classification
Author: Muhammad Bilal Zafar
Version: Not available
Development: Active
1. Fair classification.
2. Classification without disparate impact.
3. Classification without disparate mistreatment.
[Figure: Loss in accuracy to achieve fairness (from Fair-classification)]
Live Demo
From the Jupyter notebook available at
http://nbviewer.jupyter.org/github/waqar-alamgir/Fairness-aware-Machine-Learning/blob/master/experiment-german-credit.ipynb
Conclusion
● Compared with the other libraries reviewed (fairml, Fair-classification), themis-ml is the better library.
● It has a well-defined interface and methods for both discovering and mitigating discrimination.
● Model flexibility: it can be applied to a number of existing machine learning models.
● Fairness as performance: beyond fairness itself, it includes tools to optimize for accuracy.
● Transparency of the fairness-utility tradeoff.
Having said that,
● It is poorly documented.
● Its specification is wrong in places / incompatible with the paper.
References
1. Themis-ml: A Fairness-aware Machine Learning Interface for End-to-end Discrimination Discovery and Mitigation (2017): Niels Bantilan, [online] https://arxiv.org/abs/1710.06921 [01.11.2017]
2. Themis-ml (2017): Niels Bantilan, [online] https://github.com/cosmicBboy/themis-ml [02.12.2017]
3. Scikit-learn (2010): David Cournapeau, [online] https://github.com/scikit-learn/scikit-learn [15.06.2017]
4. Themis-ml installation (2017): Niels Bantilan, [online] https://github.com/cosmicBboy/themis-ml#installation [02.12.2017]
5. Objective function: [online] https://en.wikipedia.org/wiki/Loss_function [18.02.2018]
6. Regularization: Simple Definition, L1 & L2 Penalties, [online] http://www.statisticshowto.com/regularization/ [18.02.2018]
7. German-Credit Data (1994): [online] https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data) [02.12.2017]
8. Census-Income Data (2000): [online] https://archive.ics.uci.edu/ml/datasets/Census-Income+%28KDD%29 [02.12.2017]
9. Fairness-aware Machine Learning (2018): Waqar Alamgir, [online] https://github.com/waqar-alamgir/Fairness-aware-Machine-Learning [02.02.2018]
10. FairML: Auditing Black-Box Predictive Models (2017): Julius Adebayo, [online] https://github.com/adebayoj/fairml [10.01.2018]
11. Fairness in Classification (2017): Muhammad Bilal Zafar, [online] https://github.com/mbilalzafar/fair-classification [13.01.2018]
12. Decision Theory for Discrimination-Aware Classification (2011): F. Kamiran, A. Karim & Xiangliang Zhang, [online] http://ieeexplore.ieee.org/document/6413831/ [02.03.2018]
13. Scikit-learn: Machine Learning in Python (2011): Pedregosa et al., JMLR 12, pp. 2825-2830
14. API design for machine learning software: experiences from the scikit-learn project (2013): Buitinck et al.
15. A survey on measuring indirect discrimination in machine learning (2015): [online] https://www.researchgate.net/publication/283471618_A_survey_on_measuring_indirect_discrimination_in_machine_learning
16. Themis-ml experiment / Jupyter notebook (2018): [online] http://nbviewer.jupyter.org/github/waqar-alamgir/Fairness-aware-Machine-Learning/blob/master/experiment-german-credit.ipynb
Thank You!
Any Questions?

