#DevTO @mirocupak
Miro Cupak
VP Engineering, DNAstack
30/04/2018
Building a global search engine for genetic data
What and why?
• Beacon Network (https://beacon-network.org/)
• Beacon Project from the Global Alliance for Genomics and Health (GA4GH)
• largest search and discovery engine of human genomic variation
Beacon
• experiment to test the willingness of international sites to share genetic data in the simplest of all technical contexts
• design principles
  • A beacon has to be technically simple.
  • A beacon has to minimize risks associated with genomic data sharing.
  • It has to be possible to make a beacon publicly available.
• simple web service allowing users to query an institution’s databases to determine whether they contain a genetic variant of interest
• started in March 2014, quickly gained traction
• receives questions of the form "Do you have information about this mutation?"
• responds with yes or no, optionally with additional information about the mutation
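The question-and-answer exchange described above can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the Beacon API itself: the `BeaconQuery` fields, the `BeaconAnswer` and `ToyBeacon` classes, and the sample variant with its annotation are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BeaconQuery:
    """A question of the form 'do you have this mutation?' (hypothetical fields)."""
    chromosome: str
    position: int
    allele: str


@dataclass(frozen=True)
class BeaconAnswer:
    exists: bool                 # the yes/no answer
    info: Optional[str] = None   # optional additional information


class ToyBeacon:
    """In-memory stand-in for an institution's variant database."""

    def __init__(self, variants):
        # Map each known variant (chromosome, position, allele) to an annotation.
        self._variants = dict(variants)

    def ask(self, query: BeaconQuery) -> BeaconAnswer:
        key = (query.chromosome, query.position, query.allele)
        if key in self._variants:
            return BeaconAnswer(True, self._variants[key])
        return BeaconAnswer(False)


# Made-up data, for illustration only.
beacon = ToyBeacon({("13", 32936732, "C"): "made-up annotation"})
hit = beacon.ask(BeaconQuery("13", 32936732, "C"))
miss = beacon.ask(BeaconQuery("1", 100, "A"))
print(hit.exists, miss.exists)
```

The beacon reveals only whether a variant is present, plus whatever annotation the institution chooses to attach, which is what keeps the service simple and limits the data exposed.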
https://beacon-network.org
Standard
• 0.1 (2014): simple, Apache Avro
• 0.2 (2015): complex, datasets, self-description, Apache Avro
• 0.3 (2016): simplified, improved, modular and extensible, tooling, moving towards Protocol Buffers
• 0.4 (2018): flexible, complex variants, data use conditions, developer-friendly, OpenAPI
• now working on 0.4.1, 0.5 coming soon, 1.0 in 2018
Beacon Network
• federation of queries across beacons
• de facto registry
• programmatically accessible, unified beacon API
• aggregation
• participant resolution
• flexible, dynamic and easily extensible query execution pipeline: query parameter translation, request construction, response fetching over the network, parsing
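The pipeline stages listed above (parameter translation, request construction, fetching, parsing, aggregation) could be sketched roughly as follows. This is an illustrative sketch, not the Beacon Network implementation: the function names, the beacon identifiers, the "chr" prefix rule, and the response formats are assumptions, and the network call is replaced by an injected `fetch` callable so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Optional

# A "fetcher" stands in for the network call to one beacon (hypothetical).
Fetcher = Callable[[dict], str]


def translate(query: dict, beacon_id: str) -> dict:
    """Query parameter translation: adapt the unified query to one beacon."""
    chrom = query["chromosome"]
    # Assumption for the example: this one beacon expects a "chr" prefix.
    if beacon_id == "beacon-b":
        chrom = "chr" + chrom
    return {"chrom": chrom, "pos": query["position"], "allele": query["allele"]}


def parse(raw: str) -> Optional[bool]:
    """Parsing: normalize heterogeneous raw responses to True/False/None."""
    text = raw.strip().lower()
    if text in ("yes", "true"):
        return True
    if text in ("no", "false"):
        return False
    return None  # unrecognized response


def query_beacon(beacon_id: str, fetch: Fetcher, query: dict) -> Optional[bool]:
    request = translate(query, beacon_id)  # translation and request construction
    try:
        raw = fetch(request)               # response fetching (a network call in reality)
    except Exception:
        return None                        # one failing beacon must not break the search
    return parse(raw)


def federated_search(beacons: Dict[str, Fetcher], query: dict) -> dict:
    """Fan the query out to all beacons in parallel and aggregate the answers."""
    with ThreadPoolExecutor(max_workers=max(1, len(beacons))) as pool:
        futures = {bid: pool.submit(query_beacon, bid, fetch, query)
                   for bid, fetch in beacons.items()}
        responses = {bid: fut.result() for bid, fut in futures.items()}
    return {"exists": any(r is True for r in responses.values()),
            "responses": responses}


def broken(request: dict) -> str:
    raise TimeoutError("simulated network failure")


# Fake beacons with different parameter conventions and response formats.
beacons = {
    "beacon-a": lambda req: "YES" if req["chrom"] == "13" else "no",
    "beacon-b": lambda req: "true" if req["chrom"] == "chr13" else "false",
    "beacon-c": broken,
}
result = federated_search(
    beacons, {"chromosome": "13", "position": 32936732, "allele": "C"})
print(result)
```

Keeping each stage a separate function is one way to get the extensibility the slide mentions: supporting a beacon with a different dialect means swapping its translator or parser without touching the fan-out and aggregation logic.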
Search execution
Size
• ~100 installations, 40 institutions, 18 countries, 6 continents
Users
• 13k users, 136 countries
Searches
Other fun stats
• popular parameter values
• variants
• deleteriousness
• rarity
• genes
• disorders and clinical abnormalities
Questions?
https://github.com/mcupak/beacon-of-beacons
https://github.com/ga4gh-beacon/
https://dnastack.com/#/team/careers
https://mirocupak.com
