Sharing Confidential
Data
George Alter
University of Michigan
Disclosure: Risk & Harm
• What do we promise when we conduct
research about people?
– That benefits (usually to society) outweigh risk
of harm (usually to individual)
– That we will protect confidentiality
• Why is confidentiality so important?
– Because people may reveal information to us
that could cause them harm.
– Examples: criminal activity, antisocial activity,
medical conditions...
Who are We Afraid of?
• Parents trying to find out if their child had an
abortion or uses drugs
• Spouse seeking hidden income or infidelity in
a divorce
• Insurance companies seeking to eliminate
risky individuals
• Other criminals and nuisances
• NSA, CIA, FBI, KGB, SABOT, SBL, SMERSH,
KAOS, etc...
What are We Afraid of...
• Direct Identifiers
– Inadvertent release of unnecessary
information (Name, phone number, SSN…)
– Direct identifiers required for analysis
(location, genetic characteristics,…)
• Indirect Identifiers
– Characteristics that identify a subject when
combined (sex, race, age, education,
occupation)
Deductive Disclosure
• A combination of characteristics could
allow an intruder to re-identify an
individual in a survey “deductively,” even
if direct identifiers are removed.
• Dependent on
– Knowing someone in the survey
– Matching cases to a database
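The second route can be illustrated with a small, purely hypothetical sketch: count how often each combination of indirect identifiers occurs in the file, and flag respondents whose combination is unique. The records and attribute values below are invented for illustration.

```python
from collections import Counter

# Invented survey records: (sex, race, age, education, occupation)
records = [
    ("F", "white", 34, "BA", "teacher"),
    ("F", "white", 34, "BA", "teacher"),
    ("M", "black", 52, "HS", "driver"),
    ("M", "black", 52, "HS", "driver"),
    ("M", "asian", 71, "PhD", "retired"),  # a unique combination
]

# Count how often each combination of indirect identifiers occurs.
counts = Counter(records)

# A respondent whose combination appears only once can be re-identified
# "deductively" by anyone who already knows those attributes about them.
at_risk = [rec for rec, n in counts.items() if n == 1]
print(at_risk)  # [('M', 'asian', 71, 'PhD', 'retired')]
```

Real disclosure review uses the same logic over many more attributes and external databases; uniqueness on even a handful of indirect identifiers is often enough.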
Deductive Disclosure
Contextual data increases the risk of disclosure
– Some attributes can be known by an outsider (age,
race)
– Individuals are more identifiable in smaller populations
• The more specific the geography, the more
attention must be paid to disclosure risk.
Contextual data in social science
research
Geographic context
• Neighborhood characteristics, economic
conditions, health services, distance to
resources, etc.
Institutional context
• School
• Hospital
• Prison
Current Survey Designs Increase the
Risks of Disclosing Subjects’ Identities
• Geographically referenced data
• Longitudinal data
• Multi-level data:
– Student, teacher, school, school district
– Patient, clinic, community
Protecting Confidential Data
• Safe data: Modify the data to reduce the risk
of re-identification
• Safe projects: Reviewing research designs
• Safe settings: Physical isolation and secure
technologies
• Safe people: Training and Data use
agreements
• Safe outputs: Results are reviewed before
being released to researchers
Safe data
Disclosure risks can be reduced by:
• Multiple sites rather than single locations
• Keeping sampling locations secret
– Releasing characteristics of contexts without
providing locations
• Oversampling rare characteristics
Safe Data
Data masking
• Grouping values
• Top-coding
• Aggregating geographic areas
• Swapping values
• Suppressing unique cases
• Sampling within a larger data collection
• Adding “noise”
• Replacing real data with synthetic data
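Three of these masking techniques can be sketched in a few lines of Python. The function names, the age cap of 85, and the noise scale are illustrative choices, not ICPSR specifications.

```python
import random

random.seed(0)  # deterministic for illustration

def top_code(age, cap=85):
    # Top-coding: values above the cap are all reported as the cap,
    # hiding extreme (and therefore identifiable) values.
    return min(age, cap)

def group_values(age, width=10):
    # Grouping: report an age band instead of the exact value.
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def add_noise(amount, sd=1000.0):
    # Adding "noise": perturb a quantity so it cannot be matched
    # exactly against an outside source.
    return amount + random.gauss(0.0, sd)

print(top_code(95))       # 85
print(group_values(47))   # 40-49
print(add_noise(50000.0))
```

Each technique trades some analytic precision for a lower re-identification risk, which is why masking decisions belong with disclosure review rather than with individual analysts.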
Safe Projects
• Research plans are reviewed before access is
approved
• Levels of project review
1. Does the research plan require confidential
data?
2. Would the research plan identify individual
subjects?
3. Is the research scientifically sound? Does it
“serve the public good”?
• Scientific review requires standards and expertise
Safe Settings
• Data protection plans
• Remote submission and execution
• Virtual data enclave
• Physical enclave
Data Protection Plans should address risks:
• unauthorized use of account on computer
• computer break-in by exploiting vulnerability
• hijacking of computer by malware or botware
• interception of network traffic between computers
• loss of computer or media
• theft of computer or media
• eavesdropping of electronic output on computer screen
• unauthorized viewing of paper output
We often focus too much on technology and not enough on
risk.
Safe Settings
Improving Data Security Plans
• Problems
– PIs lack technical expertise
– Requirements are inconsistent and confusing
– Monitoring compliance is expensive
• An alternative: Institution-level data security
protocols
– Tiered guidelines for different levels of risk
– Focus on mitigating risks not specifying technologies
– Certification of researchers
– Institutional oversight
Safe Settings
• Remote submission and execution
– User submits program code or scripts, which
are executed in a controlled environment
• Virtual data enclave
– Remote desktop technology prevents moving
data to user’s local computer
– Requires a data use agreement
• Physical enclave
– Users must travel to the data
Virtual Data Enclave
The Virtual Data Enclave (VDE) provides remote
access to quantitative data in a secure environment.
Safe people
• Data use agreements
• Training
Safe people
• Parts of a data use agreement at ICPSR
– Research plan
– IRB approval
– Data protection plan
– Behavior rules
– Security pledge
– Institutional signature
Dataflow
[Diagram: The data producer obtains informed consent and conducts the interview, then deposits the data with the data archive under a data dissemination agreement. The researcher and their institution sign a data use agreement with the archive, supported by a research plan, IRB approval, and a data protection plan.]
Data Use Agreement: Behavior rules
To avoid inadvertent disclosure of persons, families, households,
neighborhoods, schools, or health services, the following guidelines
apply to the release of statistics derived from the dataset.
1. In no table should all cases in any row or column be found in a
single cell.
2. In no case should the total for a row or column of a
cross-tabulation be fewer than ten.
3. In no case should a quantity figure be based on fewer than ten
cases.
4. In no case should a quantity figure be published if one case
contributes more than 60 percent of the amount.
5. In no case should data on an identifiable case, or any of the kinds
of data listed in preceding items 1-3, be derivable through subtraction
or other calculation from the combination of tables released.
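Rules 1-3 lend themselves to an automated pre-release check. A minimal sketch follows; the function name and table layout are assumptions, and rules 4-5, which concern quantity figures and subtraction across released tables, are not covered here.

```python
def violates_release_rules(table):
    """Check a cross-tabulation (a list of rows of counts) against
    behavior rules 1-3: no row or column may sit entirely in one
    cell, and no row or column total may be under ten."""
    cols = list(zip(*table))
    for line in list(table) + cols:
        total = sum(line)
        if total < 10:          # rules 2-3: totals under ten
            return True
        if max(line) == total:  # rule 1: all cases in a single cell
            return True
    return False

safe  = [[12, 15], [20, 11]]
risky = [[12, 0], [20, 11]]          # first row sits in a single cell
print(violates_release_rules(safe))   # False
print(violates_release_rules(risky))  # True
```

An automated screen like this catches the mechanical violations; rule 5 still requires a human reviewer who knows what has already been released.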
Data Use Agreement
The Recipient Institution will treat allegations, by
NAHDAP/ICPSR or other parties, of violations of
this agreement as allegations of violations of its
policies and procedures on scientific integrity and
misconduct. If the allegations are confirmed, the
Recipient Institution will treat the violations as it
would violations of the explicit terms of its
policies on scientific integrity and misconduct.
Problems with DUAs
• DUAs are issued by project.
– Every PI gets a new DUA, even if the Institution
has already signed the DUA for someone else
• Language and conditions in DUAs are not
standard
– Frequent negotiations and lawyering
Reducing the costs of DUAs
• Institution-wide agreements
– One agreement per institution, not per project
– A designated “data steward” adds qualified
researchers to the agreement
– Example: Databrary Agreement
• Covers informed consent, data sharing, data use
• Researcher certification covering multiple
datasets
Disclosure: Graph with extreme values example
[Box plots of age, one for those arrested in the last year and one for those who were not; number labels are case numbers in the dataset.]
Data were collected for a sample of 104 people in a county. Among the variables collected were age, gender, and whether the person was arrested within the last year. The box plots show the distribution of age, one plot for those arrested and one for those who were not.
The potential identifiability represented by outlying values is compounded here by an unusual combination that could probably be identified using public records for a county in the U.S.: someone approximately 90 years old was arrested in the sample. Including extreme values is a disclosure risk for identifiability when combined with other variables in the dataset.
Summary statistics: N = 104; min age = 12; max age = 95; mean age = 51; std dev = 15; % female = 5.2; % arrested = 5.8
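A minimal sketch of how such cases might be flagged automatically: within each subgroup (arrested vs. not), mark cases whose age is far from the subgroup mean. The records and the distance threshold below are invented for illustration.

```python
# Hypothetical records: (case_id, age, arrested)
records = [(i, age, False) for i, age in enumerate([30, 41, 52, 38, 45], start=1)]
records += [(101, 29, True), (102, 34, True), (103, 90, True)]

def flag_extremes(records, span=30):
    """Flag cases whose age is far from the mean of their subgroup
    (arrested vs. not arrested); candidates for suppression,
    top-coding, or other masking before release."""
    flagged = []
    for status in (True, False):
        group = [(cid, age) for cid, age, arr in records if arr == status]
        mean = sum(a for _, a in group) / len(group)
        flagged += [cid for cid, a in group if abs(a - mean) > span]
    return flagged

print(flag_extremes(records))  # [103]: the ~90-year-old arrestee stands out
```

An automated screen like this only nominates candidates; deciding whether an outlier is actually identifiable still requires judgment about what outsiders could know.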
Safe People: Disclosure risk online tutorial
Safe outputs
• Controlled environments allow review of
outputs
o Remote submission and execution
o Virtual data enclaves
o Physical enclaves
• Disclosure checks may be automated, but
manual review is usually necessary
Weighing Costs and Benefits
• Data protection has costs
– Modifying data affects analysis
– Access restrictions impose burdens on researchers
• Protection measures should be proportional to
risks
– Probability that an individual can be (re-)identified
– Severity of harm resulting from re-identification
Gradient of Risk & Restriction
[Chart: severity of harm on the vertical axis, probability of disclosure on the horizontal axis.]
• Tiny risk – Web access: simple data with minimal harm and very low chance of disclosure
• Some risk – Data use agreement: complex data with low harm and low probability of disclosure
• Moderate risk – Strong DUA and technology rules: complex data with moderate harm, re-identifiable with difficulty
• High risk – Enclosed data center: high severity of harm and highly identifiable
Thank you
George Alter
University of Michigan
altergc@umich.edu
Secure Multi-Party Computing
What if databases could send data to a trusted
third party, who would compute statistics?
[Diagram: Database 1 and Database 2 send encrypted data to a third party.]
MPC does this without the third party, using encryption.
Average Income
Three people with true salaries S1, S2, S3, which
they never reveal.
Each computes random numbers Rij (sent from i to
j).
Each reports their salary plus the random numbers
they sent, minus those they received:
X1 = S1 + (R12 + R13) – (R21 + R31)
X2 = S2 + (R21 + R23) – (R12 + R32)
X3 = S3 + (R31 + R32) – (R13 + R23)
X1 + X2 + X3 = S1 + S2 + S3, because each Rij appears
once with a plus sign (added by the sender) and once
with a minus sign (subtracted by the receiver), so all
the random numbers cancel.
Example from Daniel Goroff, Alfred P. Sloan Foundation
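The three-party trick above generalizes to any number of parties. A small simulation follows; it is illustrative only, with ordinary pseudo-random shares standing in for the cryptographic machinery a real MPC protocol would use.

```python
import random

def mpc_sum(salaries, rng=random.Random(42)):
    """Simulate additive secret sharing: each party i sends a random
    R[i][j] to every other party j, then reports
    X_i = S_i + (randoms sent) - (randoms received).
    The X_i sum to the true total, but no single X_i reveals S_i."""
    n = len(salaries)
    R = [[rng.randint(0, 10**6) if i != j else 0 for j in range(n)]
         for i in range(n)]
    reports = []
    for i, s in enumerate(salaries):
        sent = sum(R[i])                           # R[i][j] sent by party i
        received = sum(R[j][i] for j in range(n))  # R[j][i] received by i
        reports.append(s + sent - received)
    return sum(reports)  # every R cancels: once sent (+), once received (-)

salaries = [50_000, 72_000, 64_000]
total = mpc_sum(salaries)
print(total, total / len(salaries))  # 186000 62000.0
```

The total is exact no matter what random numbers are drawn, which is the point: accuracy for the aggregate, secrecy for the individual values.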
Homomorphic Encryption

Editor's Notes
  • #7: Type I disclosure: when an intruder has knowledge that a given person (or organization) is included in a survey and the intruder attempts to find this record. Type II disclosure occurs when an intruder does not know the identity ahead of time and uses externally available resources (linking databases) to attempt to find survey respondents.
  • #12: Or, how do we Protect Waldo?
  • #18: The visual.