a software division of
Creating a Project Plan for a Data Warehouse Testing Assignment
Chris Thompson
Senior Solutions Architect
Mike Calabrese
Senior Solutions Architect
QuerySurge™
the smart Data Testing solution
Chris Thompson
SENIOR DOMAIN EXPERT, DATA TESTING PRACTICE
• Military veteran: aviation electronics technician in the U.S. Navy
• BS in Computer Science from the University of Delaware
• Successful implementations of QA projects in the data space for over 15 years
• RTTS employee for the past 21 years
• Started with RTTS as an entry-level Test Engineer
• Worked in numerous fields, including pharmaceuticals, utilities, and retail
Mike Calabrese
SENIOR DOMAIN EXPERT, DATA TESTING PRACTICE
• Joined RTTS as a Test Engineer in 2009
• Over a decade of experience implementing automated functional, data validation, and ETL testing solutions for clients across many industry verticals
• Technical expert on QuerySurge, RTTS' flagship data testing solution; supports clients around the world with their QuerySurge implementations
• BS in Computer Engineering from Hofstra University
Introduction
• Data testing is an integral part of developing any data project, including data warehouse, data migration, and data integration projects.
• Bad data caused by defects can lead companies to make decisions that could cost millions of dollars or, in a health-related field, could cost dearly.
[Company logo] handles more than 1 million customer transactions every hour
• Data is imported into databases that contain more than 2.5 petabytes of data
• Equivalent to 167 times the information contained in all the books in the U.S. Library of Congress
Facebook handles 40 billion photos from its user base
Google processes 1 terabyte per hour
Twitter processes 85 million tweets per day
eBay processes 80 terabytes per day
What is a Data Source?
• A Data Source is a pool of data available for extraction.
• The concept of the Data Source is technology-neutral: it is not associated with any specific technology.
• The most common Data Sources are databases, files, and XML documents.
What is a Data Warehouse? (In this case, the target)
• A collection of data or information intended to support business
decision making.
• Data Warehouses contain a wide variety of data that present a
coherent picture of business conditions.
• A Data Warehouse is a huge repository of electronically organized
data mainly meant for the purpose of reporting and analysis.
• Most Data Warehouses receive data from multiple sources (databases and files).
• A place where historical data is stored for archival, analysis and
security purposes.
[Diagram: example sources: Legacy DB, CRM/ERP DB, Finance DB]
What is ETL?
• In computing, the term Extract, Transform and Load (ETL) refers to a data handling process that involves:
− Extracting data from outside sources
− Transforming data to fit operational or reporting needs
− Loading data into the endpoint target (usually a database, more specifically a Data Warehouse)
• Why ETL? Businesses need to load the Data Warehouse regularly (incrementally, daily, or weekly) so that it can serve its purpose of supporting business analysis.
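The three ETL steps can be sketched in a few lines of Python. This is a minimal illustration using in-memory SQLite databases, not a description of how any particular ETL tool works. The table names (`orders`, `fact_orders`), the columns, and the cents-to-dollars transformation are all hypothetical.

```python
import sqlite3


def extract(source: sqlite3.Connection) -> list[tuple]:
    """Extract: pull raw rows from the (hypothetical) source table."""
    return source.execute("SELECT id, amount_cents, region FROM orders").fetchall()


def transform(rows: list[tuple]) -> list[tuple]:
    """Transform: convert cents to dollars and normalize the region code."""
    return [(rid, cents / 100.0, region.strip().upper()) for rid, cents, region in rows]


def load(target: sqlite3.Connection, rows: list[tuple]) -> None:
    """Load: write the transformed rows into the (hypothetical) warehouse table."""
    target.executemany(
        "INSERT INTO fact_orders (id, amount_usd, region) VALUES (?, ?, ?)", rows
    )
    target.commit()


def run_etl(source: sqlite3.Connection, target: sqlite3.Connection) -> None:
    load(target, transform(extract(source)))


# Demo data: one operational source and one empty warehouse table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER, region TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, 1250, " east "), (2, 999, "West")]
)

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE fact_orders (id INTEGER, amount_usd REAL, region TEXT)")

run_etl(source, target)
loaded = target.execute(
    "SELECT id, amount_usd, region FROM fact_orders ORDER BY id"
).fetchall()
```

In a real project each step would usually run on a schedule and handle incremental loads; the sketch only shows the shape of the process.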
[Diagram: Source Data (Legacy DB, CRM/ERP DB, Finance DB) → ETL Process (Extract, Transform, Load) → Target DWH]
[Diagram: Transactional vs. Analytical systems]
Transactional:
• Inventory: ‘We have 212 Widgets in the east warehouse’
• Customer Service: ‘The paint came off my widget’
• Advertising: ‘Running a new radio ad today’
Analytical:
• Data Warehouse
• Data Marts
• BI Tools
Test Points and “ETL Legs”
• An ‘ETL Leg’ refers to a single ETL process that moves/transforms data between two discrete points.
• A full ETL process may have multiple legs.
• Test points are usually across single ETL legs: the verification is between the source and the target for that leg.
• Example: an operational source database (source test point) is extracted, transformed and loaded into a Data Warehouse (target test point). Testing is conducted across this ETL leg.
[Diagram: Inventory (source test point) → Data Warehouse (target test point)]
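A test across a single ETL leg can be thought of as a query pair: one query against the source test point, one against the target test point, and a comparison of the two result sets. The sketch below illustrates that idea with in-memory SQLite databases. It is not QuerySurge's implementation, and the tables (`inventory`, `dw_inventory`) and the helper `compare_leg` are hypothetical.

```python
import sqlite3
from collections import Counter


def compare_leg(source, source_sql, target, target_sql):
    """Compare the result sets on both sides of one ETL leg.

    Counter is used instead of set so that duplicate rows are counted too.
    """
    expected = Counter(source.execute(source_sql).fetchall())
    actual = Counter(target.execute(target_sql).fetchall())
    return {
        "missing_in_target": sorted((expected - actual).elements()),
        "unexpected_in_target": sorted((actual - expected).elements()),
    }


# Source test point: a hypothetical operational inventory table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE inventory (sku TEXT, qty INTEGER)")
source.executemany(
    "INSERT INTO inventory VALUES (?, ?)", [("A1", 212), ("B2", 40), ("C3", 7)]
)

# Target test point: a warehouse table with two seeded defects
# (a wrong quantity for B2 and a missing row for C3).
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE dw_inventory (sku TEXT, qty INTEGER)")
target.executemany("INSERT INTO dw_inventory VALUES (?, ?)", [("A1", 212), ("B2", 4)])

result = compare_leg(
    source, "SELECT sku, qty FROM inventory",
    target, "SELECT sku, qty FROM dw_inventory",
)
```

When the leg includes transformation logic, the source query would apply the expected transformation so that both sides are directly comparable.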
[Diagram: a multi-leg flow connecting Data Sources (Legacy DB, CRM/ERP DB, Finance DB), Staging, Target DW, and Data Mart through three ETL processes; each ETL process is one ETL Leg]
Single Leg vs. Multi Leg Approaches
Single Leg                          | Multi Leg
More tests need to be created       | Fewer tests need to be created
Tests are less complex              | Tests are more complex
Defects are easier to pinpoint      | Defects are more difficult to pinpoint
Execution time tends to be longer   | Execution time tends to be shorter
Data Mapping Document
A data mapping document is frequently called a source-to-target map and is generally created in a spreadsheet.
This document acts as a central part of the functional requirements. The following information is contained within the mapping document:
• Source database information
− Source table
− Source column
• Target database information
− Target table
− Target column
• Data transformation logic
• Optional requirements
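One way to picture a mapping document is as a list of rows, each carrying the source, the target, and the transformation logic. The sketch below models a row as a small Python dataclass and derives a source/target query pair from it. The class `MappingRow`, the helper functions, and the example tables are hypothetical, and expressing the transformation as a SQL expression is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRow:
    """One line of a (hypothetical) source-to-target mapping spreadsheet."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    transformation: str = ""  # SQL expression; empty means a direct map


def source_query(row: MappingRow) -> str:
    """Build the source-side query, applying the expected transformation.

    Identifiers come from a trusted mapping document, so plain string
    formatting is acceptable for this illustration.
    """
    expression = row.transformation or row.source_column
    return f"SELECT {expression} FROM {row.source_table}"


def target_query(row: MappingRow) -> str:
    """Build the target-side query for the mapped column."""
    return f"SELECT {row.target_column} FROM {row.target_table}"


mapping = [
    MappingRow("customers", "first_name", "dim_customer", "first_name"),
    MappingRow("customers", "state", "dim_customer", "state_code", "UPPER(TRIM(state))"),
]

query_pairs = [(source_query(row), target_query(row)) for row in mapping]
```

Keeping the mapping in a structured form like this makes it easier to count mappings, group them by complexity, and trace each test back to a requirement.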
Transformation Types
• Direct Map
• Selective column and row type
• Translation
• Lookups
• Transpose
• Field Splitting
• Field Merging
• Calculated and Derived
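To make the transformation types concrete, here is a small Python sketch with one tiny function per type. The function names, the reference data, and the sample values are hypothetical and only meant to show what each kind of transformation does to a value or a set of rows.

```python
# Hypothetical reference data for the translation and lookup examples.
GENDER_CODES = {"M": "Male", "F": "Female"}
REGION_BY_STATE = {"NY": "Northeast", "CA": "West"}


def direct_map(value):
    """Direct map: the value is copied unchanged."""
    return value


def select_rows(rows, predicate):
    """Selective rows: only rows that satisfy the predicate are moved."""
    return [row for row in rows if predicate(row)]


def translate(code):
    """Translation: a source code is replaced by its target representation."""
    return GENDER_CODES[code]


def lookup(state):
    """Lookup: the target value is fetched from a reference table."""
    return REGION_BY_STATE[state]


def transpose(rows):
    """Transpose: (label, value) rows become columns of a single record."""
    return {label: value for label, value in rows}


def split_field(full_name):
    """Field splitting: one source field feeds two target fields."""
    first, last = full_name.split(" ", 1)
    return first, last


def merge_fields(first, last):
    """Field merging: two source fields feed one target field."""
    return f"{first} {last}"


def calculated(quantity, unit_price):
    """Calculated/derived: the target value is computed from source values."""
    return quantity * unit_price
```

For example, `split_field("Jane Doe")` yields the pair `("Jane", "Doe")`, and `merge_fields` reverses it.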
Testing Methods – Automation Tool
• Automation with QuerySurge offers
− Bulk data verification, testing sample sizes up to 100%
− Management of test assets
− Test Scheduling
− Persistent access to test data
− Reporting
An automated data testing approach with QuerySurge can significantly improve coverage, organization, and efficiency compared with manual testing techniques.
The Project Plan
What you will need:
− Gather project documents and assets
• Mapping documents
• Requirement documents
• Data Model documents
− Estimate the time to review documentation
− Determine the number of test engineer resources
− Determine the number of ETL or test legs
− Determine the number of cycles or releases
− Determine complexity of project mappings
• Low Complexity: No transformation logic (1-to-1 mapping) or minor transformation
logic including a change to data types from source to target, selective row filtering, and
minor translations
• Medium Complexity: Transformation logic including translations, joins across tables,
field splitting, and field merging
• High Complexity: Transformation logic including major translations, multiple joins across
tables, calculated or aggregated fields, transposing, derived fields, match and merge.
− Is QuerySurge installed and configured for the project?
− Do the lead or test engineers require training?
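The complexity tiers above can be turned into a simple classification rule. The sketch below assigns each transformation type to a tier following the definitions on this slide and rates a mapping by its most complex transformation. The "take the maximum" rule and the type names in the dictionary are assumptions for illustration, not something the slides prescribe.

```python
from enum import IntEnum


class Complexity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical type names, grouped according to the tier definitions above.
COMPLEXITY_BY_TRANSFORMATION = {
    "direct_map": Complexity.LOW,
    "data_type_change": Complexity.LOW,
    "row_filtering": Complexity.LOW,
    "minor_translation": Complexity.LOW,
    "translation": Complexity.MEDIUM,
    "join": Complexity.MEDIUM,
    "field_splitting": Complexity.MEDIUM,
    "field_merging": Complexity.MEDIUM,
    "major_translation": Complexity.HIGH,
    "multiple_joins": Complexity.HIGH,
    "calculated_field": Complexity.HIGH,
    "aggregated_field": Complexity.HIGH,
    "transpose": Complexity.HIGH,
    "derived_field": Complexity.HIGH,
    "match_and_merge": Complexity.HIGH,
}


def classify(transformations):
    """Rate a mapping by its most complex transformation (assumed rule).

    A mapping with no transformation logic counts as LOW (1-to-1 mapping).
    """
    return max(
        (COMPLEXITY_BY_TRANSFORMATION[name] for name in transformations),
        default=Complexity.LOW,
    )
```

Counting how many mappings fall into each tier gives the low, medium, and high test counts used in the estimate.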
Question                     | Answer
Review documentation         | 4
Number of Test engineers     | 1
Number of ETL Legs           | 1
Number of Releases/Cycles    | 4
Low Complexity Tests         | 7
Medium Complexity Tests      | 21
High Complexity Tests        | 8
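The answers above can feed a rough effort estimate. The sketch below is only an illustration of the arithmetic: the slides do not give effort figures, so the hours per test, the per-cycle execution time, the reading of "Review documentation 4" as 4 hours, and the treatment of the test counts as per-leg figures are all hypothetical assumptions to be replaced with your own numbers.

```python
from dataclasses import dataclass

# Hypothetical effort figures; the slides do not specify these.
CREATION_HOURS = {"low": 1.0, "medium": 2.0, "high": 4.0}
EXECUTION_HOURS_PER_TEST_PER_CYCLE = 0.25


@dataclass
class PlanInputs:
    review_hours: float   # assumed unit: hours
    engineers: int
    etl_legs: int
    cycles: int
    low_tests: int        # test counts assumed to be per leg
    medium_tests: int
    high_tests: int


def estimate(plan: PlanInputs) -> dict:
    """Estimate total effort and elapsed time from the plan inputs."""
    tests_per_leg = plan.low_tests + plan.medium_tests + plan.high_tests
    creation = plan.etl_legs * (
        plan.low_tests * CREATION_HOURS["low"]
        + plan.medium_tests * CREATION_HOURS["medium"]
        + plan.high_tests * CREATION_HOURS["high"]
    )
    execution = (
        plan.etl_legs * tests_per_leg
        * EXECUTION_HOURS_PER_TEST_PER_CYCLE * plan.cycles
    )
    total = plan.review_hours + creation + execution
    return {
        "total_tests": plan.etl_legs * tests_per_leg,
        "creation_hours": creation,
        "execution_hours": execution,
        "total_hours": total,
        "elapsed_hours": total / plan.engineers,
    }


# Values taken from the table above.
result = estimate(PlanInputs(4, 1, 1, 4, 7, 21, 8))
```

With the table's counts this gives 36 tests in total; every hour figure in the result depends entirely on the assumed rates.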
Any questions?
