Scaling up an openEHR CDR
Christian Chevalley, Khon-Kaen, Thailand
christian@adoc.co.th
– Born in Geneva, Switzerland
– Studied Physics and Computer Science at Geneva University
– Worked for several blue-chip companies (HP, Sun Microsystems)
– Developed 5 commercial enterprise systems for Finance and Healthcare
– Founded ADOC Software in 2009: a Thailand-based operation, BOI supported
– Wrote EtherCIS in 2011
– Migrated EtherCIS to EHRbase in 2019
● Governance
  – Hanover Medical School (https://www.mhh.de/en/)
  – Vitasystems GmbH (https://www.vitagroup.ag/de_DE/Ueber-uns/vitasystems)
  – HiGHmed Medical Informatics (https://highmed.org/), sponsored by:
    ● German Ministry of Education and Research (https://www.bmbf.de/en/index.html)
    ● Medical Informatics Initiative Germany (https://www.medizininformatik-initiative.de/en/about-initiative)
  – Open Source!
EHRbase: What Is It?
● openEHR CDR: Reference Model (RM 1.0.4), ADL 1.4
● Transactional, DB-centric application (PostgreSQL 11+)
● openEHR REST API incl. AQL
● Development:
  – Java 11, jOOQ, Archie, SQL
  – Test Automation, Continuous Integration: Robot Framework, CircleCI
  – Load Testing: JMeter
  – Quality Checking: Sonar analysis (sonarcloud.io)
Scalability: Some Numbers
● Deal with tens of millions of EHRs
● Avg nnn compositions/EHR
● > nnn TB of data (even PB!)
● > nnnn concurrent users
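A back-of-the-envelope sizing sketch for figures of this kind. The EHR and composition counts are the ones quoted on the Benchmark slide of this deck; the 20 KB average stored size per composition is purely an assumed placeholder, not a measured EHRbase value, and the function name is hypothetical.

```python
def estimate_storage_tb(ehr_count: int, compositions_per_ehr: float,
                        avg_composition_kb: float) -> float:
    """Raw composition storage in TB (1 TB = 10**9 KB), before indexes and replication."""
    total_kb = ehr_count * compositions_per_ehr * avg_composition_kb
    return total_kb / 1e9


# Counts quoted on the Benchmark slide.
ehr_count = 650_000
composition_count = 130_000_000
compositions_per_ehr = composition_count / ehr_count

# ASSUMPTION: placeholder average size, to be replaced by a measured value.
ASSUMED_AVG_COMPOSITION_KB = 20.0

storage_tb = estimate_storage_tb(ehr_count, compositions_per_ehr,
                                 ASSUMED_AVG_COMPOSITION_KB)
print(f"{compositions_per_ehr:.0f} compositions/EHR, ~{storage_tb:.1f} TB raw")
```

Replication multiplies the raw figure by the number of copies kept, which is why storage shows up as a cost in the cluster options discussed later.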
Many Challenges
● Multiple Levels of Technical Limitations
  – Storage I/O
  – DB (even when advertised as limitless...)
  – Network latency
  – Middleware latency (!), in particular format transformations
● Overlapping NFRs (non-functional requirements)
  – Multi-tenancy
  – Secondary use (analytics)
  – Availability
  – Security
  – Administration: maintenance, disaster management, monitoring
My Observations
● Two areas of concern
  – CRUD
  – Querying (AQL)
● Query/transaction has to be really fast (~ 1 ms or less)
  – Minimize middleware/DB transactions
    ● ONE query to the DB
    ● Resolve containments and paths before launching the query
  – Optimize DB model
    ● Deal with limitations (denormalization of ITEM_STRUCTURE)
    ● Indexing
    ● Monitor query execution (query planner)
    ● Keep SQL translations as short as possible
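A minimal sketch of "resolve paths before launching the query", assuming the entry content is stored as a JSON document addressed by a flat key path. The table name `observation_entry`, its columns, and the key layout are hypothetical and do not reflect EHRbase's actual storage encoding; only the idea (resolve once, cache, then send ONE parameterized statement) comes from the slide.

```python
import re
from functools import lru_cache

# One AQL path segment: an attribute name plus an optional [node_id] predicate.
_SEGMENT = re.compile(r"^(?P<attr>[a-z_]+)(?:\[(?P<node>[^\]]+)\])?$")


@lru_cache(maxsize=1024)
def resolve_path(aql_path: str) -> tuple:
    """Turn an AQL path into a flat tuple of JSON keys; cached per distinct path."""
    keys = []
    for segment in aql_path.strip("/").split("/"):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"unsupported path segment: {segment!r}")
        keys.append(match["attr"])
        if match["node"]:
            keys.append(match["node"])
    return tuple(keys)


def to_sql(aql_path: str, threshold: float, limit: int):
    """Build ONE parameterized statement (hypothetical table and columns)."""
    json_path = "{" + ",".join(resolve_path(aql_path)) + "}"
    sql = (
        "SELECT ehr_id, composition_id, content #>> %s::text[] "
        "FROM observation_entry "
        "WHERE (content #>> %s::text[])::numeric > %s "
        "LIMIT %s"
    )
    return sql, (json_path, json_path, threshold, limit)


path = "data[at0001]/events[at0002]/data[at0003]/items[at0004]/value/magnitude"
sql, params = to_sql(path, 20, 50)
print(sql)
print(params[0])
```

Because the path is resolved in the middleware and cached, the database receives a single short statement and never has to walk the archetype structure itself.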
Observation/Optimization
● DB CRUD should be performed in ONE transaction
● Query (AQL) is accelerated by pre-calculating value point paths, then executed in ONE transaction
● The (many) format transformations in the openEHR middleware remain costly!
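The "ONE transaction" rule for CRUD can be sketched as follows. SQLite from the Python standard library stands in for PostgreSQL so the example runs anywhere; the two-table schema and the function name are hypothetical simplifications, not the EHRbase schema.

```python
import sqlite3
import uuid

# Hypothetical, heavily simplified schema.
SCHEMA = """
CREATE TABLE composition (
    id TEXT PRIMARY KEY,
    ehr_id TEXT NOT NULL,
    template_id TEXT NOT NULL
);
CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    composition_id TEXT NOT NULL REFERENCES composition(id),
    path TEXT NOT NULL,
    magnitude REAL NOT NULL
);
"""


def store_composition(conn, ehr_id, template_id, entries):
    """Insert a composition and all its entries atomically; return the new id."""
    composition_id = str(uuid.uuid4())
    # The first INSERT implicitly opens a transaction. Used as a context
    # manager, the connection commits on success and rolls back on exception.
    with conn:
        conn.execute(
            "INSERT INTO composition (id, ehr_id, template_id) VALUES (?, ?, ?)",
            (composition_id, ehr_id, template_id),
        )
        conn.executemany(
            "INSERT INTO entry (composition_id, path, magnitude) VALUES (?, ?, ?)",
            [(composition_id, path, magnitude) for path, magnitude in entries],
        )
    return composition_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
store_composition(conn, "ehr-1", "sample_encounter.v1",
                  [("systolic", 120.0), ("diastolic", 80.0)])

try:
    # A NULL magnitude violates NOT NULL, so the whole unit of work fails.
    store_composition(conn, "ehr-1", "sample_encounter.v1", [("systolic", None)])
except sqlite3.IntegrityError:
    pass  # the composition row of the failed call was rolled back as well

composition_rows = conn.execute("SELECT COUNT(*) FROM composition").fetchone()[0]
entry_rows = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
print(composition_rows, entry_rows)
```

Grouping the writes this way means a failed entry never leaves a half-stored composition behind, and the middleware pays for one commit instead of one per row.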
Benchmark
650 000 EHRs, 130 000 000 compositions
PostgreSQL cluster with 5 nodes (12 vCPU, 8 GB RAM, 3 TB disk)
select e/ehr_id/value, a/uid,
  o/data[at0001]/events[at0002]/data[at0003]/items[at0004]/value
from EHR e
  contains COMPOSITION a[openEHR-EHR-COMPOSITION.sample_encounter.v1]
  contains OBSERVATION o[openEHR-EHR-OBSERVATION.sample_blood_pressure.v1]
where o/data[at0001]/events[at0002]/data[at0003]/items[at0004]/value/magnitude > 20
limit 50
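For readers who want to replay the benchmark query, the sketch below prepares it as an ad-hoc AQL request for the openEHR REST API (`POST {base}/query/aql` with a JSON body whose `q` field holds the query). The base URL is an assumption about a local EHRbase deployment and the helper name is hypothetical; the request is only built here, not sent.

```python
import json
import urllib.request

# The benchmark query from the slide, joined into one string.
AQL = (
    "select e/ehr_id/value, a/uid, "
    "o/data[at0001]/events[at0002]/data[at0003]/items[at0004]/value "
    "from EHR e "
    "contains COMPOSITION a[openEHR-EHR-COMPOSITION.sample_encounter.v1] "
    "contains OBSERVATION o[openEHR-EHR-OBSERVATION.sample_blood_pressure.v1] "
    "where o/data[at0001]/events[at0002]/data[at0003]/items[at0004]"
    "/value/magnitude > 20 "
    "limit 50"
)


def build_aql_request(base_url: str, aql: str) -> urllib.request.Request:
    """Prepare (but do not send) an ad-hoc AQL query request."""
    body = json.dumps({"q": aql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/query/aql",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# ASSUMPTION: base URL of a locally running server; adjust to the deployment.
BASE_URL = "http://localhost:8080/ehrbase/rest/openehr/v1"
request = build_aql_request(BASE_URL, AQL)
print(request.get_method(), request.full_url)
# To execute against a live server: urllib.request.urlopen(request)
```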
Distributing Transaction Load
● Deploy DB as a “dumb” cluster
● Deploy DB as a hyperscale cluster
● Deploy the middleware as a distributed cluster with a distributed AQL optimizer
DB Dumb Cluster (1)
Pros
- Easy to deploy (at the beginning)
Cons
- DB maintenance: schema, migration, backup/recovery
- Storage (replication!)
- No parallelization
- No node failover
- Heavy procedure to add nodes
- Expensive in a Cloud environment
- Security
- No easy secondary usage
- Has an impact on code logic!
DB Dumb Cluster (2)
Pros
- Somewhat easy to deploy (at the beginning)
Cons
- DB maintenance: schema, migration, backup/recovery
- Storage (replication!)
- No parallelization
- Heavy procedure to add nodes
- Expensive in a Cloud environment
- Security
- May eventually reach DB limits...
Hyperscale DB (Citus, YugabyteDB, etc.)
Pros
- Transparent DB maintenance (single master for admin)
- Distributed storage
- Parallelism
- Automated failover
- Tools to maintain nodes
- Distributed security policy
Cons
- Can be tricky to deploy (DB system settings, driver, may require an additional sharding key...)
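To illustrate what a sharding key does, here is a minimal sketch that routes each EHR to a shard by hashing its id, so all compositions of one EHR land on the same node. Choosing `ehr_id` as the key and the simple modulo placement are assumptions made for illustration; this is not the actual placement logic of Citus or YugabyteDB.

```python
import hashlib
from collections import Counter


def shard_for(ehr_id: str, shard_count: int) -> int:
    """Map an EHR id to a shard number using a stable hash.

    Python's built-in hash() is salted per process, so a digest is used
    to get the same answer on every middleware node.
    """
    digest = hashlib.sha256(ehr_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# ASSUMPTION: 5 shards, mirroring the 5-node benchmark cluster.
SHARD_COUNT = 5

ehr_ids = [f"ehr-{n}" for n in range(10_000)]
load = Counter(shard_for(ehr_id, SHARD_COUNT) for ehr_id in ehr_ids)
for shard in sorted(load):
    print(shard, load[shard])
```

A query that filters on the sharding key touches a single node, while a population-wide AQL query such as the benchmark one fans out to every shard, which is where the parallelism listed under Pros comes from.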
EHRbase Cluster + Hyperscale DB (Citus, YugabyteDB, etc.)
Pros
- Distributed middleware processing
Cons
- Can be tricky to deploy
Conclusion
● Assuming the right topology (cluster + DB sharding), operation involves capacity planning: monitoring, thresholds, orchestration tools, etc.
● Other infrastructure aspects must be factored in:
  – Network latency between nodes
  – Storage technology (SSD, write-ahead logging, caching)
  – Significant operating concepts and administration
  – Requires skills to be administered properly
