Patient-Like-Mine
A Real Time, Visual Analytics Tool for Clinical Decision Support
Peter Li¹, Ph.D., Simon N. Yates², Jenna K. Lovely³, Pharm.D. R.Ph. BCPS, David W. Larson⁴, M.D. MBA
Department of Surgery, Mayo Clinic, Rochester, MN USA
¹ jenli.peter@gmail.com, ² yates.simon@mayo.edu, ³ lovely.jenna@mayo.edu, ⁴ larson.david2@mayo.edu
Abstract— We developed a real-time, visual analytics tool for
clinical decision support. The system expands the “recall of
past experience” approach that a provider (physician) uses to
formulate a course of action for a given patient. By utilizing
Big-Data techniques, we enable the provider to recall all
similar patients from an institution’s electronic medical record
(EMR) repository, to explore “what-if” scenarios, and to
collect these evidence-based cohorts for future statistical
validation and pattern mining.
Keywords- electronic medical record, clinical decision
support, real-time analytics, visual analytics, data mining.
I. INTRODUCTION
When determining the course of action for a given patient,
the physician has to integrate clinical knowledge, state of the
patient, and his/her own personal experience. Clinical
knowledge is composed of models of diseases and their
progression and interventions. However, many patients have
comorbidities and medical histories that cause the disease
progression and the optimal treatment plan to deviate from
the known clinical models. As a result, the physician (or, in
general, the provider team) has to draw on their own
experience of similar past cases to develop the care plan.
Unfortunately, this approach is often subjective and limited
by the team’s ability to recall all relevant cases. We
developed an interactive, visual analytics tool, “Patient-Like-
Mine”, in the Mayo Clinic’s Enhanced Analytics to Surgical
Excellence (EASE) program that focuses on improving
Clinical Pathway compliance [1] as well as recognizing
patterns early to assist this clinical decision making process.
We remove the limits of subjectivity by accessing the
institutional Electronic Medical Record (EMR) database so
that the collective experience from ALL past and present
patients with a similar background can be used in
real-time care planning.
Operational requirements of “Patient-Like-Mine”
include:
• Search a large (>1 billion facts), complex (>1 thousand
properties), and continuously updated (>1 thousand
data points/second) dataset in real time.
• Align the resulting patients’ (i.e. the cohort’s) medical
histories to provider-chosen “landmark” events, e.g.
time of surgery or date of admission.
• Restrict the value and the relative time of ANY clinically
relevant parameter based on the provider’s judgment,
e.g. “systolic blood pressure between 50 and 90, 3 days
post-operation”.
• Provide an interactive, graphical user interface to
compare the given patient to “similar patients” and to
explore “what-if” scenarios in a transparent fashion.
II. APPROACH
The architecture for this project is based on the following
four major components:
• Automated deployment for Azure public (and private)
cloud nodes with Health Insurance Portability and
Accountability Act (HIPAA)-compliant disk encryption,
strict firewall configurations, and secure login. The
cloud is necessary for scalable performance, but we also
need to secure the protected health information (PHI)
that will now be placed onto the cloud nodes.
• A scalable search engine that can expand and shrink
based on current load. For this project, we chose
ElasticSearch (http://elastic.co) for model simplicity
(JSON-based), query expressiveness (structured and
text), inherent performance (subsecond response), and
secure transport protocol (SSL/TLS). We developed
data transform and import tools for Fast Healthcare
Interoperability Resources (FHIR, http://hl7.org/fhir),
Reference Information Model (RIM,
http://hl7.org/implement/standards/rim.cfm), and SQL-based
sources that capture EMR data at Mayo Clinic.
• A schema-driven abstraction layer for ElasticSearch
query building and data export. This generates
consistent queries from declarative API calls and
converts the return results from nested JSON to a tabular
format for UI ingestion and downstream statistical
analysis.
• An intuitive, graphical UI for exploring “Patient-Like-
Mine”. A single patient’s structured clinical parameters
can be plotted via a “strip-chart” graphic with a
zoomable and scrollable interface. By extension, a set of
“similar patients” in a cohort would paint a distribution
contour over the same chart. By changing the constraints
over clinical outcome and intervention parameters, we
can analyze “what-if” scenarios to facilitate clinical
decision making.
The project is divided into three phases. The first phase
establishes architectural and implementation feasibility. The
second phase is clinical practice validation. The third phase is
distribution and expansion into additional clinical practices.
III. RESULTS
We present the results from the recently completed first
phase, demonstrating architectural and implementation
feasibility.
A. Cloud Deployment
We developed an Azure-cloud deployment tool based on
Linux shell scripts for ease of portability and extensibility. A
given cloud deployment is defined by a set of configuration
files that describe the system (IT) and application parameters
(Figure 1). System configuration includes disk encryption
via LUKS (for encryption-at-rest), firewall settings, and
logins via public-key-based SSH only. Passwords, certificates,
and keys are generated on the fly so that each deployment
has its own set of security tokens.
Figure 1. System architecture for cloud deployment.
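As an illustration of the "generated on the fly" idea, the following is a minimal sketch in Python rather than the shell scripts used by the actual deployment tool. The helper name, the dictionary layout, and the service names are hypothetical; the sketch only shows that each deployment receives its own independent random tokens.

```python
import secrets


def generate_deployment_secrets(deployment_name, service_names):
    """Return a fresh set of random credentials for one deployment.

    Hypothetical helper: the names and structure are illustrative only
    and are not taken from the deployment tool described in the paper.
    """
    return {
        "deployment": deployment_name,
        # One independent random password per service.
        "passwords": {name: secrets.token_urlsafe(32) for name in service_names},
        # 32 random bytes, hex encoded (64 hex characters).
        "disk_key": secrets.token_hex(32),
    }


# Two deployments built from the same configuration get different tokens.
first = generate_deployment_secrets("demo-a", ["elasticsearch", "nodejs"])
second = generate_deployment_secrets("demo-b", ["elasticsearch", "nodejs"])
```

Because nothing is shared between `first` and `second`, a leaked token from one deployment would not expose another.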
The applications currently supported are ElasticSearch (ES)
and nodejs (Figure 2). ES can be deployed with the SHIELD
module for secure internode and client communication (for
encryption-in-flight). Other ES modules, such as Kibana, are
also deployed, but they are not considered secure at this
time and are limited to non-production environments.
Nodejs is deployed on an intranet/on-premise VM that has
credentials to connect to the local LDAP server for
authentication and authorization. ES only allows
connections from this nodejs server via https.
Figure 2. Application architecture for cloud deployment.
B. ElasticSearch Engine
ES is an open-source, commercial product that serves as
the search engine for many well-known internet sites (see
http://elastic.co). It is built on top of the Apache Lucene
indexing engine (http://lucene.apache.org) and processes
JSON-based documents, with distributed, multi-node scalability
as a core feature. It offers a very large set of query
commands, flexible aggregations, and parent-child
relationships. In addition, it has a large ecosystem of tools
and language support.
A typical EMR captures clinical data from a messaging
format (HL7 v2 messages). However, with the industry
acceptance of FHIR and RIM messaging objects, we are
seeing more complex and polymorphic schemas. These
complex datasets require transformation to an
analytics-friendly fact or event schema. We also took advantage
of ES-supported parent-child relationships and JSON-based
nested array elements to provide additional properties for
landmark event alignment and relative times (Figure 3).
Figure 3. Mapping of HL7 schema to Event-based schema.
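The mapping to an event schema with relative times can be sketched as follows. This is a minimal illustration, not the mappers described above: the input record layout and the function name are hypothetical, and only the output field names (Id, Type, ClinDate, Display, value, RefEvent, RelTime) are borrowed from the labels in Figure 3. The relative time is assumed here to be a signed offset in days from each landmark event.

```python
from datetime import datetime


def to_event_document(observation, landmarks):
    """Map one source observation to an event with nested relative times.

    `observation` and `landmarks` are hypothetical, simplified inputs.
    """
    clin_date = datetime.fromisoformat(observation["effective"])
    relative_times = []
    for name, when in landmarks.items():
        delta = clin_date - datetime.fromisoformat(when)
        relative_times.append({
            "RefEvent": name,
            # Signed offset in days; negative means before the landmark.
            "RelTime": delta.total_seconds() / 86400.0,
        })
    return {
        "Id": observation["id"],
        "Type": "Observation",
        "ClinDate": clin_date.isoformat(),
        "Display": observation["display"],
        "value": observation["value"],
        # Nested array: one entry per landmark event for this patient.
        "RelativeTime": relative_times,
    }


event = to_event_document(
    {"id": "obs-1", "display": "Systolic blood pressure", "value": 88,
     "effective": "2015-05-04T08:00:00"},
    {"surgery": "2015-05-01T08:00:00", "admission": "2015-04-30T20:00:00"},
)
```

Storing the offsets alongside each event means a constraint such as "3 days post-operation" becomes a simple range filter on RelTime rather than a date computation at query time.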
C. Schema-based Abstraction Module
While the ElasticSearch query language is very expressive
and powerful, it is also very easy to create queries that return
non-intuitive answers because of the different contexts
involving nested elements, parent-child relationships, and
aggregates. Constraint clauses associated with different
parent-child/nested elements need to be repeated in different
places of the query construct to ensure proper object
selection and projection (Figure 4). We built an abstraction
module that takes a set of schema-based constraints and
generates a consistent query for execution.
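The clause repetition can be illustrated with a small generator. This is a sketch under stated assumptions, written in Python rather than the JavaScript of the actual module: the function name is hypothetical, and the keys `query`, `child`, and `inner` mirror the pseudoquery of Figure 4, not actual ElasticSearch DSL keywords.

```python
def build_pseudoquery(levels):
    """Build a nested pseudoquery from an ordered list of (type, clause).

    levels[0] is the parent (e.g. Patient); each later entry is a child
    of the one before it. The output keys follow the paper's pseudoquery
    and are NOT actual ElasticSearch DSL.
    """
    def selection(i):
        # Clause for level i plus, recursively, clauses of all descendants.
        node = {"clause": levels[i][1]}
        if i + 1 < len(levels):
            node["child"] = {levels[i + 1][0]: selection(i + 1)}
        return node

    def projection(i):
        # Repeat the same clauses so the returned children are filtered too.
        if i >= len(levels):
            return None
        body = {"query": selection(i)}
        deeper = projection(i + 1)
        if deeper is not None:
            body["inner"] = deeper
        return {levels[i][0]: body}

    result = {"query": selection(0)}
    inner = projection(1)
    if inner is not None:
        result["inner"] = inner
    return result


query = build_pseudoquery([
    ("Patient", "<clause A>"),
    ("Event", "<clause B>"),
    ("RelativeTime", "<clause C>"),
])
```

With three levels, clause A appears once, clause B twice, and clause C three times. Generating the structure from one declarative list is what keeps the copies from drifting apart.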
To simplify downstream processing of the results returned
by an ElasticSearch query, this abstraction module also
includes a schema-based specification for transforming
JSON-based data into a flatter table structure. This
also simplifies export to any statistical analysis that may
need to be performed. The module is implemented in
JavaScript for use in both the client browser and the nodejs
web server.
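A specification-driven flattening step might look like the sketch below. It is written in Python for illustration (the actual module is in JavaScript), and the specification format, column names, and the `events` key are assumptions made here; only the idea of turning nested JSON into rows comes from the text.

```python
import csv
import io

# Hypothetical specification: output column -> source key, per nesting level.
SPEC = {
    "patient": {"patient_id": "Id"},
    "event": {"event_type": "Type", "display": "Display", "value": "value"},
    "rel": {"ref_event": "RefEvent", "rel_time": "RelTime"},
}


def flatten(patients, spec=SPEC):
    """Turn nested patient -> events -> relative times into flat rows."""
    rows = []
    for patient in patients:
        p_cols = {col: patient[key] for col, key in spec["patient"].items()}
        for event in patient.get("events", []):
            e_cols = {col: event[key] for col, key in spec["event"].items()}
            for rel in event.get("RelativeTime", []):
                r_cols = {col: rel[key] for col, key in spec["rel"].items()}
                # One row per (patient, event, landmark) combination.
                rows.append({**p_cols, **e_cols, **r_cols})
    return rows


def to_csv(rows):
    """Serialize rows to CSV text for export to a statistics package."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Each row repeats the parent values, which trades some redundancy for a shape that plotting code and statistical tools can consume directly.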
[Figures 1 and 2, diagram labels. Intranet environment: corporate firewall, corporate VPN gateway, deployment node/s (Linux/shell or Windows/PowerShell), config files with restricted permissions, deployment data, keys, and passwords, LDAP, Nodejs VM, user. Azure cloud environment: Azure portal, Azure deployments, Azure firewall, Azure VPN gateway, cloud-based VMs (ES master and ES slave VMs) with encrypted data volumes (LUKS-encrypted blocks), config and encrypted certificate files. Connections: secure RESTful Azure commands and responses, SSL-encrypted traffic over the Internet, VPN tunnel, SSH with multi-factor authentication, ssh/scp, https.]
[Figure 3, diagram labels. Patient-Like-Mine schema: Patient, Event, Relative Time, Ref Event, Document, RefDef, linked by parent/child/routing, “contains”, “is-a”, and “generates” relationships. Fields shown include Id, source, value, Name, first, last, BirthDate, Gender, Race, Type, ClinDate, Display, Observation, Order, RefEvent, RelTime, RefDef, EventType, EventTime, PatientQual. Sources: FHIR resources in mongo DB (Observation, MedicationAdministration, Encounter, Order, Procedure, …) via a FHIR-based Mongo2ES mapper; RDBMS via a SQL-based mapper; RIM-based repository via a RIM-based mapper.]
Figure 4. Constraint duplication for consistent results under ES.
D. Intuitive UI
Data overload is a common complaint about today’s EMRs.
In addition to presenting the patient’s data, we also
present data from tens to thousands of similar patients. The
cohort data must be summarized so that a visual comparison
can be made readily between the patient of interest and the
cohort. We extend box-and-whisker statistical plots into
continuous contours to support continuous variables (Figure
5). This graphic allows the provider to quickly determine if
the trajectory of a patient’s clinical parameter is within
nominal bounds, based on a cohort of similar patients.
Figure 5. Construction of cohort contour graphs.
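The contour construction can be sketched as a per-time-bin percentile calculation over the aligned cohort data. The percentile levels (5, 10, 25, 75, 90, 95) are those listed in Figure 5; the one-day binning, the linear-interpolation percentile, and all function names are assumptions made for this illustration, not details given in the paper.

```python
from collections import defaultdict

PERCENTILES = (5, 10, 25, 75, 90, 95)


def percentile(sorted_values, p):
    """Linear-interpolation percentile of an ascending, non-empty list."""
    if len(sorted_values) == 1:
        return float(sorted_values[0])
    rank = p * (len(sorted_values) - 1) / 100.0
    low = int(rank)
    high = min(low + 1, len(sorted_values) - 1)
    frac = rank - low
    return sorted_values[low] + frac * (sorted_values[high] - sorted_values[low])


def cohort_contours(points, bin_days=1.0):
    """Compute percentile contours from pooled cohort data.

    points: iterable of (rel_time_days, value) pairs, already aligned to
    the landmark event. Returns {bin_start: {percentile: value}}.
    """
    bins = defaultdict(list)
    for rel_time, value in points:
        # Floor division keeps negative (pre-landmark) times in order.
        bin_start = (rel_time // bin_days) * bin_days
        bins[bin_start].append(value)
    contours = {}
    for bin_start in sorted(bins):
        values = sorted(bins[bin_start])
        contours[bin_start] = {p: percentile(values, p) for p in PERCENTILES}
    return contours
```

Plotting each percentile across consecutive bins gives the band against which the single patient's trajectory is compared.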
To build a specific real-time cohort, we allow the user to
create, move, and size constraint “boxes” over the graph of
the patient data. Each constraint box acts as a filter for
matching patients whose data intersect the “bounding box”
(Figure 6). A physician can decide when and where to place
such constraints to create a specific cohort based on his/her
clinical assessment, i.e. a customized, ad-hoc
“patient-like-mine” cohort to predict patient outcome and to
select the best treatment options.
Figure 6. Impact of constraint creation on contours.
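A constraint box can be modeled as a rectangle in (relative time, value) space. The sketch below makes two assumptions that the text does not spell out: a trajectory "intersects" a box when at least one of its recorded data points falls inside it, and multiple boxes are combined with a logical AND. Function and variable names are hypothetical.

```python
def intersects(trajectory, box):
    """True if any (rel_time, value) point of the trajectory is in the box."""
    t_min, t_max, v_min, v_max = box
    return any(t_min <= t <= t_max and v_min <= v <= v_max
               for t, v in trajectory)


def select_cohort(trajectories, boxes):
    """Return ids of patients whose data intersect every constraint box."""
    return [patient_id for patient_id, trajectory in trajectories.items()
            if all(intersects(trajectory, box) for box in boxes)]


# Hypothetical systolic blood pressure readings, days after surgery.
trajectories = {
    "A": [(1, 120), (3, 85)],
    "B": [(1, 118), (3, 110)],
    "C": [(2.5, 70), (4, 95)],
}
low_bp_box = (2, 4, 50, 90)       # value 50 to 90, between day 2 and day 4
early_bp_box = (0, 2, 100, 140)   # value 100 to 140, within the first 2 days

cohort = select_cohort(trajectories, [low_bp_box])
narrower = select_cohort(trajectories, [low_bp_box, early_bp_box])
```

Here the first box keeps patients A and C, and adding the second box narrows the cohort to A alone, which is the kind of incremental refinement the interface exposes when a box is created, moved, or resized.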
IV. CONCLUSION
The objective of the EASE program is to improve patient
care and outcome. Our big-data approach creates a
transparent, interactive environment that enables a provider
to formulate a more specific plan for a given patient using
real-world evidence found in an institutional EMR. By using a
very flexible UI, the provider or team can also explore
“what-if” scenarios that would have previously taken a
statistical/database team and considerable time to develop.
However, the number of “similar” patients in any given
repository can be limited, reducing the robustness of this
recall-based paradigm. Because the system is built on a
cloud-based architecture, this problem can be addressed by
including more patients from other institutions, thereby
increasing the chance of finding patients that match a set of
complex or rare clinical features.
Our next phase is to demonstrate clinical utility and
relevance through deployment in a controlled practice setting
that will provide systematic feedback for improvements and
define novel use cases.
ACKNOWLEDGMENT
This project is funded by the Mayo Clinic Clinical Practice
Committee and the Office of Information and Knowledge
Management. We also thank the many people on the core
EASE team and the Mayo Clinic IT support staff for their
efforts.
REFERENCES
[1] DW Larson, JK Lovely, RR Cima, EJ Dozois, H Chua, BG Wolff, JH
Pemberton, RR Devine, M Huebner, “Outcomes after
implementation of a multimodal standard care pathway for
laparoscopic colorectal surgery,” Br J Surg. 2014 Jul;101(8):1023-30.
doi: 10.1002/bjs.9534. Epub 2014 May 15.
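[Figure 4, diagram labels. Patient (clause A) contains Event (clause B), which contains Relative Time (clause C). Patient fields: Id, source, value, Name, first, last, BirthDate, Gender, Race, … Event fields: Id, source, value, Type, ClinDate, Display, Observation, value, Order, … Relative Time fields: Id, source, value, RefEvent, RelTime, …]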
ElasticSearch Pseudoquery:
{
  query: {
    <clause A>,
    child: { Event : {
      <clause B>,
      child: { RelativeTime : {
        <clause C>
      } }
    } }
  },
  inner : { Event : {
    query : {
      <clause B>,
      child : {
        <clause C>
      }
    },
    inner : { RelativeTime : {
      query : {
        <clause C>
      }
    } }
  } }
}
[Figure 5, diagram labels. Our patient in May 2015 with a “landmark” event; Patient A (2015-04), Patient B (2015-01), Patient C (2014-12), Patient D (2015-03), … Steps: 1. Find others with same event; 2. Align to our time frame; 3. Overlay; 4. Calculate 5%, 10%, 25%, 75%, 90%, 95%ile contours; 5. Compare.]
[Figure 6, diagram labels. Add Constraint; Add Constraint.]
