Complex Adaptive Systems 2012 – Washington DC USA, November 14-16




Towards A Differential Privacy and Utility Preserving Machine Learning Classifier

Kato Mivule, Claude Turner, and Soo-Yeon Ji
Computer Science Department
Bowie State University

Outline

- Introduction
- Related work
- Essential terms
- Methodology
- Results
- Conclusion




Introduction

- Entities transact in ‘big data’ containing personally identifiable information (PII).

- Organizations are bound by federal and state law to ensure data privacy.

- In the process of achieving privacy, the utility of the privatized dataset diminishes.

- Achieving a balance between privacy and utility is an ongoing problem.

- Therefore, we investigate a differential privacy preserving machine learning classification approach that seeks an acceptable level of utility.


Related Work

There is growing interest in privacy-preserving data mining solutions that provide a balance between data privacy and utility.

- Kifer and Gehrke (2006) conducted a broad study of enhancing data utility in privacy-preserving data publishing using statistical approaches.

- Wong (2007) described how achieving globally optimal privacy while maintaining utility is an NP-hard problem.

- Krause and Horvitz (2010) noted that finding trade-offs between privacy and utility remains an NP-hard problem.

- Muralidhar and Sarathy (2011) showed that differential privacy provides strong privacy guarantees, but that utility remains a problem due to noise levels.

- Finding the optimal balance between privacy and utility remains a challenge, even with differential privacy.
Data Utility versus Privacy

- Data utility is the extent to which a published dataset remains useful to the consumers of that dataset.

- In the course of a data privacy process, the original data loses statistical value despite the privacy guarantees gained.

(Image source: Kenneth Corbin/Internet News.)




Objective

- Achieving an optimal balance between data privacy and utility remains an ongoing challenge.

- Such optimality is highly desired and remains the goal of our investigation.

(Image source: Wikipedia, on Confidentiality.)




Ensemble Classification

- Ensemble classification is a machine learning process in which a collection of several independently trained classifiers is combined to achieve better predictions.

- An example is joining individually trained decision trees to make more accurate predictions.
AdaBoost Ensemble – Adaptive Boosting

- Proposed by Freund and Schapire (1995), AdaBoost runs several iterations, adding weak learners to build a strong learner and adjusting weights to focus on data misclassified in earlier iterations.

- The classification error of the AdaBoost ensemble is computed as shown below:




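For reference, the standard AdaBoost weighted classification error at boosting round t, assuming the usual formulation with example weights w_i^(t) and weak hypothesis h_t:

\[
\varepsilon_t \;=\; \frac{\sum_{i=1}^{N} w_i^{(t)}\,\mathbf{1}\!\left[\,h_t(x_i) \neq y_i\,\right]}{\sum_{i=1}^{N} w_i^{(t)}}
\]

that is, the fraction of the total weight carried by the examples that h_t misclassifies.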
AdaBoost Ensemble (Cont’d)

- The AdaBoost ensemble is then computed as follows:




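For reference, the standard AdaBoost update rules of Freund and Schapire, assuming binary labels y_i in {-1, +1} and T boosting rounds:

\[
\alpha_t \;=\; \tfrac{1}{2}\,\ln\!\frac{1-\varepsilon_t}{\varepsilon_t},
\qquad
w_i^{(t+1)} \;=\; \frac{w_i^{(t)}\,\exp\!\big(-\alpha_t\, y_i\, h_t(x_i)\big)}{Z_t},
\qquad
H(x) \;=\; \operatorname{sign}\!\Big(\sum_{t=1}^{T} \alpha_t\, h_t(x)\Big)
\]

where Z_t normalizes the weights to sum to one, so misclassified examples gain weight in the next round.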
Differential Privacy




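For reference, the standard definition: a randomized mechanism K gives ε-differential privacy (Dwork, 2006) if, for all datasets D_1 and D_2 differing in at most one record and for all S ⊆ Range(K),

\[
\Pr\big[K(D_1) \in S\big] \;\leq\; e^{\varepsilon}\,\Pr\big[K(D_2) \in S\big]
\]

so the presence or absence of any single individual changes the output distribution by at most a factor of e^ε.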
Differential Privacy (Cont’d)




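The Laplace mechanism is the usual way of achieving ε-differential privacy for numeric queries: noise is drawn from a Laplace distribution calibrated to the global sensitivity Δf of the query f,

\[
K(D) \;=\; f(D) + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon}\right),
\qquad
\Delta f \;=\; \max_{D_1, D_2}\,\big\lVert f(D_1) - f(D_2) \big\rVert_1
\]

so a smaller ε (stronger privacy) or a larger sensitivity means more noise, and hence lower utility.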
Methodology (Cont’d)

- We utilized a publicly available Barack Obama 2008 campaign donations dataset.

- The dataset contained 17,695 records of original, unperturbed data.

- Two attributes, the donation amount and income status, are used to classify the data into three classes.

- The three classes are low income, middle income, and high income, for donations of $1 to $49, $50 to $80, and $81 and above, respectively.

- To validate our approach, 50 percent of the dataset was used for training and the remainder for testing, on both the original and the privatized datasets.

- An Oracle database is queried via the MATLAB ODBC connector; MATLAB is used for the differential privacy and machine learning classification steps (a sketch of this pipeline is given below).


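The study implements this pipeline in MATLAB against an Oracle database. Purely as an illustration, a minimal Python sketch of the same steps is given below, with synthetic donation amounts standing in for the real records and an assumed sensitivity, privacy budget ε, and learner count:

# Illustrative sketch only: synthetic data, assumed epsilon/sensitivity, and
# scikit-learn in place of the MATLAB toolchain described on the slide.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
amounts = rng.uniform(1, 100, size=17695)        # stand-in for the 17,695 donation records
labels = np.digitize(amounts, bins=[50, 81])     # 0: low ($1-49), 1: middle ($50-80), 2: high ($81+)

# Laplace perturbation for differential privacy (assumed sensitivity and epsilon).
sensitivity, epsilon = 1.0, 0.1
noisy = amounts + rng.laplace(scale=sensitivity / epsilon, size=amounts.shape)

# 50 percent of the records for training, the remainder for testing,
# on both the original and the privatized data.
half = len(amounts) // 2
X_tr, X_te = amounts[:half, None], amounts[half:, None]
Xp_tr, Xp_te = noisy[:half, None], noisy[half:, None]
y_tr, y_te = labels[:half], labels[half:]

# AdaBoost ensemble of weak decision-tree (stump) learners.
ada = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=100)
ada.fit(X_tr, y_tr)
print("original test error:  ", 1 - ada.score(X_te, y_te))

ada_dp = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=100)
ada_dp.fit(Xp_tr, y_tr)
print("privatized test error:", 1 - ada_dp.score(Xp_te, y_te))

The exact error values depend on the assumed noise scale and number of learners; the sketch only mirrors the structure of the experiment.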
Results

- Essential statistical traits of the original dataset are preserved in the differentially private dataset, a necessary requirement for publishing privatized datasets.

- As depicted, the mean, standard deviation, and variance of the original and differentially private datasets remained the same.




Results (Cont’d)

- There is a strong positive covariance of 1060.8 between the two datasets, which means that they tend to increase together, as illustrated below:




Results (Cont’d)

- There is almost no correlation (0.0054) between the original and differentially privatized datasets.

- This indicates some privacy assurance: an attacker working only with the privatized dataset would have difficulty correctly inferring the alterations (a sketch of how these statistics can be computed follows below).




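A brief sketch of how the reported covariance and Pearson correlation can be computed between the original and privatized attribute. Synthetic stand-in data is used again; the slide's values of 1060.8 and 0.0054 depend on the actual noise level the study applied.

import numpy as np

rng = np.random.default_rng(0)
original = rng.uniform(1, 100, size=17695)                      # stand-in for the original amounts
privatized = original + rng.laplace(scale=10.0, size=17695)     # Laplace-perturbed copy (assumed scale)

print("covariance: ", np.cov(original, privatized)[0, 1])       # off-diagonal entry of the 2x2 matrix
print("correlation:", np.corrcoef(original, privatized)[0, 1])  # Pearson correlation coefficient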
Results (Cont’d)

- After applying differential privacy, the AdaBoost ensemble classifier is run.

- The classification outcome for the donors’ dataset was Low, Middle, and High income, for donations of 0 to 50, 51 to 80, and 81 to 100, respectively.

- The same classification outcome is used for the perturbed dataset to investigate whether the classifier would categorize the perturbed data correctly.




Results (Cont’d)

- On the training set from the original data, the classification error dropped from 0.25 to 0 as the number of weak decision tree learners increased.

- The results changed for the training set of the differentially private data, where the classification error dropped only from 0.588 to 0.58.




Results (Cont’d)

- When the same procedure is applied to the test set of the original data, the classification error dropped from 0.03 to 0.

- However, when the procedure was performed on the differentially private data, the error rate did not change even with an increased number of weak decision tree learners.




Conclusion

- In this study, we found that while differential privacy may guarantee strong confidentiality, providing data utility remains a challenge.

- However, the study is instructive in a variety of ways:

  - The level of Laplace noise does affect the classification error.

  - Increasing the number of weak learners is not very significant.

  - Adjusting the Laplace noise parameter, ε, is essential for further study.

  - However, accurate classification means a loss of privacy.

  - Trade-offs must be made between privacy and utility.

  - We plan to investigate optimization approaches for such trade-offs.
Questions?




Contact:
Kato Mivule: kmivule@gmail.com



                            Thank You.




