#ATAGTR2017
16th-17th March
Bee-Hive approach on Big Data testing
- Usharani Subramanian
Agile Testing Alliance Global Testing Retreat 2017
Heterogeneous Sources: Structured, Unstructured, External Files (CSV), Streaming
Pipeline stages: INGESTION > TRANSFORMATION > ANALYTICS
Data Lake
Applications: BI & Visualization tools, Functional UI applications, Excel/CSV Reports
Big Data Ecosystem Testing
Big Data Extraction Testing
Data Transformation Testing
Data analytics and Visualization Testing
Non Functional Testing
• Pre-Hadoop Validation
• Meta Data Analysis and Validation
• Impala & HDFS Data Storage Validation
• Validation on Data Extraction from Source
• Referential Integrity & Constraint Validation
• Heterogeneous Data Integration Validation
• MDM Validation
• Data Quality Validation
• Data Correctness/ Completeness Validation
• Business Rule Validation
• HDFS to SSAS validation
• Dashboard Validation
• Visualization Validation
• Report generation and Validation
• Performance Validation
• Security Validation
• Regulatory Compliance Validation
Big Data Tools
• QuerySurge, Talend, Informatica – Functional
• YCSB-CDH, JMX statistics, JMeter – Performance
• OWASP ZAP, Spider – Security scanner
• Manual UI Validations over Dashboards
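The data correctness and completeness validation listed above can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: the row tuples, the `reconcile` helper, and the choice of SHA-256 row digests are all assumptions made for the example. It compares a source extract with its copy in the target store by key, and reports rows that are missing, unexpected, or changed.

```python
import hashlib

def row_digest(row):
    """Return a stable SHA-256 digest for one row (a tuple of values)."""
    joined = "\x1f".join(repr(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def reconcile(source_rows, target_rows, key_index=0):
    """Compare two row sets keyed on one column; report gaps and mismatches."""
    source = {row[key_index]: row_digest(row) for row in source_rows}
    target = {row[key_index]: row_digest(row) for row in target_rows}
    shared = source.keys() & target.keys()
    return {
        # Completeness: keys present on one side only.
        "missing_in_target": sorted(source.keys() - target.keys()),
        "unexpected_in_target": sorted(target.keys() - source.keys()),
        # Correctness: same key, different content.
        "mismatched": sorted(k for k in shared if source[k] != target[k]),
    }

# Hypothetical sample data standing in for a source extract and its data lake copy.
source_rows = [(1, "alice", 10.0), (2, "bob", 20.5), (3, "carol", 7.25)]
target_rows = [(1, "alice", 10.0), (2, "bob", 99.0), (4, "dave", 1.0)]

report = reconcile(source_rows, target_rows)
print(report)
```

On real volumes the digests would normally be computed inside the cluster rather than in a single Python process; the comparison logic stays the same.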
Functional Automation & Data validation – QuerySurge, Informatica, Talend, etc.
Pre-Hadoop validation – Amazon EMR, Sqoop, etc.
Performance Testing – YCSB-CDH, JMX statistics, JMeter
Security Testing – UI level through ZAP, Spider; other aspects through Regulatory Compliance
CI/CD Scheduler – Jenkins, Oozie, etc.
Workflow Data Testing:
1. UI > Impala: Scenarios derived from UI
2. Excel > ODBC > HDFS: Random Queries
3. ThirdParty tool > ODBC > HDFS: Random Queries
System Integration Testing:
1. Big Data Ecosystem Validation (Architecture level Validation)
1. Data validation from Heterogeneous Data Sources to HDFS
2. Data validation from Data lake to Analytical server
3. Data validation at Application layer
4. UI validation of Application layer
• Performance
• Responsive Web Designing – Desired pixel?
• Browser Compatibility – Desired browser?
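The layer-by-layer data validation described above (sources to HDFS, data lake to analytical server, then the application layer) could be approximated by computing the same summary at each layer and flagging the hop where it changes. The sketch below is an assumption-laden illustration: the layer names, the sample rows, and the `compare_layers` helper are hypothetical, and a row count plus a column sum is only one of many possible summaries.

```python
def layer_summary(rows, amount_index):
    """Summarize one layer as (row count, rounded sum of a numeric column)."""
    return len(rows), round(sum(row[amount_index] for row in rows), 2)

def compare_layers(layers, amount_index=1):
    """Compare consecutive layers; return all summaries and the hops that differ."""
    names = list(layers)  # dicts keep insertion order, so this is the pipeline order
    summaries = {name: layer_summary(layers[name], amount_index) for name in names}
    breaks = [
        (upstream, downstream, summaries[upstream], summaries[downstream])
        for upstream, downstream in zip(names, names[1:])
        if summaries[upstream] != summaries[downstream]
    ]
    return summaries, breaks

# Hypothetical rows of (id, amount) as seen at each layer of the pipeline.
layers = {
    "source": [(1, 10.0), (2, 20.5), (3, 7.25)],
    "data_lake": [(1, 10.0), (2, 20.5), (3, 7.25)],
    "analytical_server": [(1, 10.0), (2, 20.5)],
}

summaries, breaks = compare_layers(layers)
for upstream, downstream, before, after in breaks:
    print(f"{upstream} -> {downstream}: {before} became {after}")
```

A matching summary does not prove the data is identical, so a check like this is best treated as a cheap first pass before row-level comparison.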
Performance Testing : Load and Stress
1. Query performance on Impala
2. UI level Performance validation
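Query performance checks of this kind usually come down to running a query repeatedly and looking at the latency distribution. The sketch below shows that idea only. It uses Python's built-in `sqlite3` with an in-memory table as a stand-in, which is an assumption for the sake of a runnable example; against Impala the same loop would run over whatever client connection the project uses. The `time_query` helper and the `events` table are hypothetical.

```python
import math
import sqlite3
import statistics
import time

def time_query(conn, sql, runs=20):
    """Run a query repeatedly and return latency statistics in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch all rows so the full query cost is measured
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "runs": len(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }

# In-memory stand-in database with a hypothetical table of 10,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO events (value) VALUES (?)",
    ((float(i),) for i in range(10000)),
)

stats = time_query(conn, "SELECT COUNT(*), AVG(value) FROM events")
print(stats)
```

Reporting the median together with a high percentile, rather than a single average, makes occasional slow runs visible under load.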
Security Testing:
1. Security aspects covered as a part of Regulatory compliance (PCI, HIPAA, etc.)
2. Application level Vulnerability scanning
• 4 Vs of Big Data – 5th V?
• Big Data is just the Data Engineering part, the preparatory stage. The ROI lies in the Data Science part of it.
• The real ROI of Big Data lies in Machine Learning & Artificial Intelligence. How?
• In Machine Learning, Big Data helps to build a better model that facilitates better business decisions.
Sources: Social Media, Databases, Internet of Things, Streaming Data
Data Ingestion
Data Engineering: Clean, Refine, Transform, Store
Data Science: Extract Feature Vectors, Apply Modelling
Feature Vectors: 70% Training Data, 30% Evaluation Data
Deduce model based on Self learning Algorithm
Customer Personalization: Use existing data to make better business decisions
• How does Big Data help build a better Machine learning model?
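The 70% training / 30% evaluation division of the feature vectors named on this slide can be illustrated with a short sketch. It assumes a simple random split with a fixed seed; the slide does not say how the split is drawn, and the `split_70_30` helper and the integer records are hypothetical placeholders for real feature vectors.

```python
import random

def split_70_30(records, seed=42):
    """Shuffle records reproducibly and split them 70% training / 30% evaluation."""
    shuffled = list(records)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

# Hypothetical records standing in for extracted feature vectors.
records = list(range(100))
training, evaluation = split_70_30(records)
print(len(training), len(evaluation))
```

Fixing the seed keeps the split repeatable between test runs, which matters when a model's evaluation score is itself being validated.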
Thank You

ATAGTR2017 Bee-Hive approach for Big Data Testing [End to End Continuous Test Automation solution for Big Data]
