Better Data Leads to Better Analytics:
Three Ways to Improve Healthcare Data Quality
in an EDW
Written by
Jason B. Buskirk
Chief Operating Officer
Health Care DataWorks
Too often, organizations embark on Enterprise Data Warehouse (EDW) projects with the notion
that all their data needs will be met once the implementation is complete. It is understandable
why this thinking becomes pervasive throughout the organization. Typically, organizations have
decided to take on such projects after lengthy and time-intensive meetings, presentations
and reviews to bring together the myriad interests of their key stakeholders, followed by the due
diligence necessary to secure the funding and select the technology partner. Expectations begin
to run very high.
While an EDW undoubtedly will empower organizations to do more with their data than ever before
and the investment will pay dividends in terms of the value it brings, an EDW is only as good as the
data that is fed into it. Every organization will encounter data quality issues during or leading up to
EDW implementation, and these issues can negatively affect the timeline of the implementation. If
there are issues with data quality, the organization will find that, when it comes time to extract the
data, it will not be as useful as expected. It is important to discover and address data quality issues
as early as possible. Not doing so becomes expensive, both in terms of the developers’ time and
the loss of trust that could result within the organization. Think of it this way: If you put bad data in,
you get bad data out, and the sooner you find the bad data, the better off your project will be. This
white paper details three ways to improve data quality in an EDW.
Establish realistic expectations
Improving data quality starts with understanding the data challenges and proactively
communicating and working with stakeholders to address potential pitfalls. Taking these steps
will contribute to a successful, cost-efficient and relatively smooth implementation that can
achieve results at a quicker pace.
It is important that everyone in the organization knows that the EDW will only be as effective
as the data that goes into it. This will help manage expectations and reduce potential frustration.
Everyone wants access to data that is relevant and understandable and that ultimately yields
actionable knowledge. But the reality is that an organization will not know how bad its data is until
it begins the task of profiling the data that is to be extracted.
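As a rough illustration of what profiling can reveal, the sketch below counts how often each field is missing in a set of extracted records. The record layout, field names and sample values are hypothetical and are not drawn from any particular source system.

```python
from collections import Counter

# Hypothetical extract: field names and values are illustrative only.
records = [
    {"patient_id": "A1", "gender": "female", "discharge_date": "2012-01-05"},
    {"patient_id": "A2", "gender": "", "discharge_date": None},
    {"patient_id": "A3", "gender": "male", "discharge_date": ""},
    {"patient_id": "A4", "gender": None, "discharge_date": "2012-01-09"},
]


def missing_rate_by_field(rows):
    """Return the share of rows in which each field is absent or blank."""
    if not rows:
        return {}
    fields = sorted({name for row in rows for name in row})
    missing = Counter()
    for row in rows:
        for name in fields:
            value = row.get(name)
            # Treat None and whitespace-only strings as missing.
            if value is None or str(value).strip() == "":
                missing[name] += 1
    return {name: missing[name] / len(rows) for name in fields}


profile = missing_rate_by_field(records)
for name, rate in profile.items():
    print(f"{name}: {rate:.0%} missing")
```

A report like this, run early against each source system, gives stakeholders a concrete picture of how sparse the data is before expectations are set.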
Know the causes of data issues
Virtually none of the data issues encountered in a data warehouse implementation are
technological in nature; they are operational. These operational causes of data issues generally
fall into two broad categories:
• Data collection requirements. Organizations have multiple systems capturing and
storing their electronic medical records, financial records and human resource information.
But these systems tend to operate in silos. This often contributes to issues around when
and whether data is collected. Some systems may require data elements to be populated, while
others may not make them mandatory for data capture. This leads to sparse data sets that
could have limited usefulness in the future.
• Lack of standardization. Because myriad systems are in use and individual departments
can track data in different ways, problems with standardization often arise and take many
forms. For example, two units within a health system track the same information – patient
gender. In one system, the information is input and categorized as “male” or “female.” In
the other system, gender is input as a “1” or “2.” Even though these issues can and
should be fixed during the extract process, the time needed to identify these issues and
decide how the data should be stored in the data warehouse is something the organization
needs to take into consideration when planning the data warehouse project.
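One way to surface the standardization problem described above is to list the distinct values each source system uses for the same data element before any mapping is decided. The following is a minimal sketch; the system names and sample values are hypothetical.

```python
# Hypothetical samples of the gender field from two source systems.
samples = {
    "registration_system": ["male", "female", "female", "Male"],
    "billing_system": ["1", "2", "2", "1", ""],
}


def distinct_values_by_system(field_samples):
    """Map each system to the sorted list of normalized values it uses."""
    return {
        system: sorted({str(v).strip().lower() for v in values})
        for system, values in field_samples.items()
    }


def encodings_disagree(field_samples):
    """Return True when the systems do not share one common set of values."""
    report = distinct_values_by_system(field_samples)
    value_sets = {tuple(values) for values in report.values()}
    return len(value_sets) > 1


value_report = distinct_values_by_system(samples)
for system, values in value_report.items():
    print(system, values)
print("needs a standard:", encodings_disagree(samples))
```

The output is meant as input to a discussion rather than an automatic fix: someone who knows the source systems still has to decide what each value means and how it should be stored in the warehouse.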
Improve data quality
By taking the following steps before the implementation process begins, organizations can cleanse
and improve the quality of the data, positioning the organization for a successful enterprise data
warehouse project.
• Establish a governance body or data quality group to create consistent standards.
Most organizations do not have this in place prior to an EDW implementation. The body
or group should be composed of stakeholders who know which data is being collected,
how it is being categorized, how and where it is stored, and all the other details critical
to establishing an organization-wide standard. The goal should be to identify “bad” and
non-standardized data. Doing this sooner rather than later can ensure the most
cost-efficient implementation.
• Identify subject matter experts to play an ongoing role in the implementation
process. These should be individuals who understand the data and know how it can
be used. Make them part of the implementation team. They are valuable resources in
that they not only know the data, but also understand how existing operational systems
work. By including them on the team, you will identify data quality issues earlier in your
implementation. Their involvement will also help provide built-in credibility when it
comes time to go live.
It is also important that these subject matter experts be freed from other commitments
so they can devote the required attention to the implementation process. It is an
in-kind investment that is worthwhile because of the positive outcome
that will result.
• Standardize your data model up front. Having a
data model up front will not only accelerate the data
warehouse’s implementation timeline but also assist
the organization with the data issues mentioned earlier
by connecting multiple and disparate source systems.
Remember, data elements will be captured
inconsistently by different operational systems. When
the data model is populated, it will have a place to store
each data element regardless of the source system.
Data quality rules can be implemented to populate the
data based on data availability in each source system.
In the example of gender mentioned earlier, the same
data elements may be stored using different data values.
Possessing and populating a robust data model will
force an organization to standardize these data
elements and serve as a blueprint for how these data
elements should be handled. In this example, the data
model will have a conformed dimension to standardize
gender values.
Organizations have two options for obtaining a data model: They can build their own data
model or buy one. Health Care DataWorks, for instance, offers a mature data model that
has been proven over many years of effective use. Regardless of how the organization proceeds,
the data model needs to be in place up front in order for an organization to be ready for
the data quality issues that it should expect.
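To make the gender example concrete, the sketch below shows one way a conformed dimension and a simple data quality rule might work together. The standard codes, the meaning assigned to "1" and "2", the system names and the priority order are all assumptions made for illustration; the real mapping would come from each source system's own documentation and from the governance group.

```python
# Conformed gender dimension: one standard code per meaning (assumed codes).
GENDER_DIMENSION = {"M": "Male", "F": "Female", "U": "Unknown"}

# Per-system mapping from source values to the standard codes.
# Which number means which gender is an assumption for this sketch.
SOURCE_TO_STANDARD = {
    "registration_system": {"male": "M", "female": "F"},
    "billing_system": {"1": "M", "2": "F"},
}

# Data quality rule: prefer systems in this (hypothetical) order.
SOURCE_PRIORITY = ["registration_system", "billing_system"]


def conform_gender(system, raw_value):
    """Translate one source value to a standard code, or 'U' if unmapped."""
    key = "" if raw_value is None else str(raw_value).strip().lower()
    return SOURCE_TO_STANDARD.get(system, {}).get(key, "U")


def resolve_gender(values_by_system):
    """Return the first known code, checking systems in priority order."""
    for system in SOURCE_PRIORITY:
        code = conform_gender(system, values_by_system.get(system))
        if code != "U":
            return code
    return "U"


# Registration left the field blank, so the billing value is used.
patient = {"registration_system": "", "billing_system": "2"}
code = resolve_gender(patient)
print(code, GENDER_DIMENSION[code])
```

Unmapped or missing values fall back to an explicit "Unknown" member rather than being dropped, so the warehouse always has a place to store the element and the gaps remain visible for later cleanup.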
Conclusion
Organizations can expect data quality challenges when undertaking an EDW implementation. But
when they understand the potential pitfalls, remain committed to improving the quality of data, and
involve their internal experts and users in the process, they will be well on the way to adding value
to the entire organization in the most cost-effective and timely manner.
About the Author
Jason Buskirk is responsible for managing the day-to-day operations of Health Care DataWorks
(HCD) and leading product strategies for the company's pre-built analytics applications. He is one
of the company's founders.
Prior to HCD, Buskirk worked for Deloitte Consulting, where he implemented analytic applications
built using Oracle's Business Intelligence Enterprise Edition. Buskirk also served as Manager of
the Information Warehouse and Research Information Systems at the Wexner Medical Center at
The Ohio State University.
Buskirk holds a bachelor's degree in computer information systems from DeVry University.
About Health Care DataWorks
Health Care DataWorks, Inc., a leading provider of business intelligence solutions, empowers
healthcare organizations to improve their quality of care and reduce costs. Through its pioneering
KnowledgeEdge™ product suite, including its enterprise data model, analytic dashboards,
applications, and reports, Health Care DataWorks delivers an Enterprise Data Warehouse
necessary for hospitals and health systems to effectively and efficiently gain deeper insights
into their operations. For more information, visit www.hcdataworks.com.
Contacting Health Care DataWorks
Published: February 2012
©2012 Health Care DataWorks, Inc. ALL RIGHTS RESERVED
Phone
1-877-979-HCDW (4239)
Email
info@hcdataworks.com
Address
1801 Watermark Drive,
Suite 250
Columbus, OH 43215
Web
www.hcdataworks.com
