Privacy-preserving
Metrics
Internet Security Research Group (ISRG): nonprofit founded in 2013 as a home for
public-benefit digital infrastructure projects,
including Let’s Encrypt, Divvi Up, and Prossimo.
Launched in October 2020, Divvi Up is a system
for privacy-preserving metrics collection based
on the Distributed Aggregation Protocol (DAP),
which is being standardized in the IETF.
Presenting:
Brandon Pitman, Technical
Lead
Sarah Gran, VP of Brand &
Donor Development
The need for privacy-preserving
metrics
● Privacy policies are insufficient as a privacy safeguard
● Hacks happen
● Mere presence of PII is a liability
But…
● Data provides valuable insight
● Data enables improved user experiences
● Data identifies problem areas to fix
Introducing Divvi Up
A service allowing private aggregation of sensitive data.
● No one but the client ever sees the original measurement.
● No one but the collector ever sees the aggregate.
The benefits:
● Users like a technically-enforced guarantee that their data
can’t be mishandled.
● Organizations like a technically-enforced guarantee that they
are not exposed to sensitive user data.
What can be aggregated?
Technically speaking:
● Numerical data: sums, mean, variance, most statistical functions
● Vectors of numbers
● Histograms, or vectors with a constrained number of nonzero
elements
● Extensible: new aggregation functions can be added
Common applications:
● Metrics/telemetry
● Survey results
● Machine-learning training data
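As a rough illustration of why sums and vectors are enough for many statistics, the sketch below derives a mean, a variance, and a histogram purely from element-wise sums of per-client vectors. The function names and the bucket boundaries are hypothetical, and the encoding is one common approach rather than a description of how Divvi Up encodes measurements.

```python
from typing import List, Sequence, Tuple


def encode_for_moments(x: int) -> Tuple[int, int]:
    """Encode a measurement as (x, x^2) so that sums yield mean and variance."""
    return (x, x * x)


def encode_histogram(x: int, bucket_edges: List[int]) -> List[int]:
    """One-hot vector: bucket i holds values below bucket_edges[i];
    the final bucket catches everything else."""
    vec = [0] * (len(bucket_edges) + 1)
    for i, edge in enumerate(bucket_edges):
        if x < edge:
            vec[i] = 1
            return vec
    vec[-1] = 1
    return vec


def vector_sum(vectors: Sequence[Sequence[int]]) -> List[int]:
    """Element-wise sum, the only operation the aggregation step needs here."""
    return [sum(column) for column in zip(*vectors)]


def mean_and_variance(sum_x: int, sum_x2: int, count: int) -> Tuple[float, float]:
    """Population mean and variance recovered from aggregate sums alone."""
    mean = sum_x / count
    variance = sum_x2 / count - mean * mean
    return mean, variance


measurements = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data
sum_x, sum_x2 = vector_sum([encode_for_moments(x) for x in measurements])
print(mean_and_variance(sum_x, sum_x2, len(measurements)))
print(vector_sum([encode_histogram(x, [3, 6]) for x in measurements]))
```

Only the summed vectors are needed to compute the final statistics, which is what makes these functions a good fit for private aggregation.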
How does it work?
[Diagram: three Clients, a Leader Aggregator, a Helper Aggregator, and a Collector]
Protocol actors
There are several different protocol actors in Divvi Up:
● Client: generates measurements & uploads them to the Aggregators.
● Aggregator: receives report shares from Clients, verifies & aggregates
them, and provides aggregates to Collector. Every deployment involves
a Leader & Helper Aggregator.
○ Leader: directly receives reports from the Clients, drives
aggregation with the Helper, and provides aggregated batches to
the Collector.
○ Helper: driven by the Leader to perform aggregation & collection.
● Collector: retrieves batches of aggregated reports from the
Aggregators.
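The division of labor between these actors can be sketched with plain additive secret sharing of a sum. This is a toy model under stated assumptions: the modulus and all function names are illustrative, and it leaves out the encryption of shares and the validity checks that the Aggregators perform in the real protocol.

```python
import secrets
from typing import List, Tuple

# Illustrative modulus (2**61 - 1); an assumption, not a parameter taken from DAP.
MODULUS = 2**61 - 1


def split_into_shares(measurement: int) -> Tuple[int, int]:
    """Client: split a measurement into two shares that sum to it mod MODULUS.
    Each share on its own is a uniformly random value."""
    leader_share = secrets.randbelow(MODULUS)
    helper_share = (measurement - leader_share) % MODULUS
    return leader_share, helper_share


def aggregate_shares(shares: List[int]) -> int:
    """Aggregator: sum the shares it holds, never seeing a raw measurement."""
    return sum(shares) % MODULUS


def collect(leader_aggregate: int, helper_aggregate: int) -> int:
    """Collector: combine the two aggregate shares into the final aggregate."""
    return (leader_aggregate + helper_aggregate) % MODULUS


def run_toy_protocol(measurements: List[int]) -> int:
    """Run all three roles in one process, purely for illustration."""
    leader_inbox: List[int] = []
    helper_inbox: List[int] = []
    for measurement in measurements:
        leader_share, helper_share = split_into_shares(measurement)
        leader_inbox.append(leader_share)
        helper_inbox.append(helper_share)
    return collect(aggregate_shares(leader_inbox), aggregate_shares(helper_inbox))


print(run_toy_protocol([3, 5, 10]))  # the sum of the inputs: 18
```

In this toy version neither Aggregator can recover a measurement unless it obtains the other Aggregator's share, which is why the two Aggregators must not collude.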
Subscribers: ENPA
Current status: turned down.
Use case: private analytics over COVID-19 exposure rates.
Apple & Google deployed the clients; ISRG & NIH operated the
aggregators; MITRE operated the collector.
The initial use-case for the technology behind Divvi Up was to
permit private analytics over COVID-19 exposure rates, operated as
part of an exposure notification system implemented by Apple &
Google during the pandemic.
Subscribers: Mozilla
Current status: in production.
Use case: sensitive telemetry.
Mozilla deploys the clients, and operates the helper aggregator and
the collector; the ISRG operates the leader aggregator.
Mozilla’s initial deployment targets sensitive metrics for their Firefox
web browser, such as determining which domains trigger a browser
crash. Mozilla’s use is interesting as they compose Divvi
Up with Oblivious HTTP to fully remove Divvi Up’s ability to see
metadata (e.g. IP address) associated with each report.
Subscribers: Horizontal
Current status: in production.
Use case: sensitive telemetry & survey results.
Horizontal deploys the clients, and operates the helper aggregator
and the collector. The ISRG operates the leader aggregator.
Horizontal has deployed private survey result collection in their
Shira product, and has deployed telemetry into their Tella product.
Horizontal is interesting in that they are the only subscriber who has
deployed our Android client.
Divvi Up & Oblivious HTTP
● Very high-level: OHTTP is an encrypted HTTP proxy requiring two
non-colluding servers, which hides all request metadata (e.g. IP)
from the server.
● OHTTP, when composed with a classic telemetry/aggregation system,
would hide the source of each measurement, but not the
measurement itself.
● Divvi Up hides the individual measurements, but may reveal which
clients contribute to the aggregates.
● Composing OHTTP with Divvi Up allows for private aggregation,
while preventing info leaks from the client exposing metadata to
the aggregators.
Divvi Up & Differential Privacy
● Very high-level: DP is a method to hide whether an individual
contributed to an aggregate via statistical noise.
● DP says nothing about how the aggregate is generated.
● Divvi Up can be composed with DP: Divvi Up protects the individual
measurements from being leaked while producing aggregates; DP
protects the aggregate from leaking information about individual
measurements.
● We are investigating “central” DP (noise added by aggregators) and
“client” DP (noise added by clients).
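The two noise placements can be illustrated with textbook mechanisms: Laplace noise added to a finished aggregate (the "central" case) and randomized response applied by each client before upload (the "client" case). This is only a sketch under assumptions. The function names and parameters are hypothetical, the floating-point sampling is for illustration, and nothing here describes the mechanisms Divvi Up is actually investigating.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def central_dp_sum(true_sum: float, sensitivity: float, epsilon: float,
                   rng: random.Random) -> float:
    """'Central' DP: noise is added once, to the aggregate.
    `sensitivity` is the most a single client can change the sum."""
    return true_sum + laplace_noise(sensitivity / epsilon, rng)


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """'Client' DP: each client reports the truth with probability p_truth,
    otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def debias_count(noisy_yes: int, n: int, p_truth: float) -> float:
    """Estimate the true number of 'yes' answers from n randomized responses.
    Uses E[noisy_yes] = p_truth * true_yes + (1 - p_truth) * n / 2."""
    return (noisy_yes - (1 - p_truth) * n / 2) / p_truth


rng = random.Random(2024)  # fixed seed only so the demo is repeatable
print(central_dp_sum(true_sum=1000, sensitivity=1, epsilon=0.5, rng=rng))
answers = [randomized_response(i < 700, 0.5, rng) for i in range(1000)]
print(debias_count(sum(answers), len(answers), 0.5))
```

In both cases the collector sees only a noisy aggregate. The difference is whether the noise is added once to the total or spread across every individual report.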
What’s next
● Discover more about how subscribers will use Divvi Up, as well as
further applications of the underlying DAP technology
● Gain production deployment experience with partners at scale
● Improve efficiency & lower cost to operate
● Continue to refine the Divvi Up subscriber web portal
● Publish DAP as an IETF RFC
Questions now? Ask!
Questions later? contact@divviup.org
Standardization Work
The Distributed Aggregation Protocol (DAP) is used by Divvi Up to
perform private aggregation. DAP inherently requires interoperation
by two non-colluding “aggregator” servers; therefore, DAP is being
standardized at the IETF.
Verifiable Distributed Aggregation Functions (VDAFs) provide the
cryptographic primitives used by the higher-level Distributed
Aggregation Protocol to perform aggregation. VDAF is being
standardized at the CFRG (Crypto Forum Research Group).
Internet Engineering Task Force (IETF) Specification

Open Source Privacy-Preserving Metrics - Sarah Gran & Brandon Pitman
