What is Big Data And Why learn Hadoop 
View Big Data and Hadoop Course at www.edureka.co/my-course/big-data-and-hadoop 
www.edureka.co/big-data-and-hadoop
How it Works? 
LIVE Online Class 
Class Recording in LMS 
24/7 Post Class Support 
Module Wise Quiz 
Project Work 
Verifiable Certificate 
Twitter @edurekaIN, Facebook /edurekaIN, Slide 2 use #askEdureka for Questions www.edureka.co/big-data-and-hadoop
Objectives
• What is TDD?
• I Can’t follow TDD because…
• Traditional Development Cycle vs. TDD
• Why Unit Test Pig?
• What is PigUnit?
• TDD Using PigUnit - Demo
What is TDD?
• TDD stands for Test-Driven Development
• Test-Driven Development aims to shorten development cycles
• It follows a “get something working now and perfect it later” approach
• The typical process is the “RED-GREEN-REFACTOR” cycle: write a failing test, make it pass, then clean up the code
• It is part of a larger software development methodology, “Extreme Programming”
• Test-Driven Development requires tests to be written before the code itself!
• It leads to better code that is just enough to pass the tests
• Because code is written only to satisfy a test, TDD-based code tends toward very high (ideally near 100%) code coverage
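The RED-GREEN-REFACTOR cycle can be sketched in a few lines. This is only an illustration in Python, not the demo from the deck: the function names `top_queries_green` and `top_queries`, the test helper, and the sample data are all hypothetical.

```python
from collections import Counter


# RED: the test is written first. At this point no implementation exists,
# so calling it would fail.
def check_top_queries(impl):
    data = ["pig", "hive", "pig", "hadoop", "pig", "hive"]
    assert impl(data, 2) == [("pig", 3), ("hive", 2)]
    assert impl([], 2) == []


# GREEN: the simplest code that makes the test pass.
def top_queries_green(queries, n):
    counts = {}
    for query in queries:
        counts[query] = counts.get(query, 0) + 1
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:n]


# REFACTOR: same behaviour, tidier code. The unchanged test guards the change.
def top_queries(queries, n):
    counts = Counter(queries)
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))[:n]


check_top_queries(top_queries_green)
check_top_queries(top_queries)
```

The same test runs against both versions, which is the point of the refactor step: the implementation may change freely as long as the test stays green.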
I Can’t Follow TDD Because…
• “It’s working! Let’s freeze it for now”
• The release date is quite aggressive!
• It slows down our development cycle
• We are already short-staffed
• What are testers supposed to do?
All of the reasons above (and possibly more) lead teams into “Technical Debt”
-Albert Einstein 
Time Taken to Fix Bugs
[Chart: time taken to fix a bug (axis from 0 to 1000) by phase: Design, Implementation, QA, Post-release]
Traditional Development
[Diagram: Design, Implement, Test]
TDD
[Diagram: Design, Test, Implement, Test]
Why Unit Test Pig?
• Pig is NOT a programming language
• Pig is a data flow language
• It just converts Pig Latin data flows into MapReduce jobs
• The best use case for Pig in Big Data projects is “Data Factory” operations
• Since we are not talking about a “programming language”, does testing make sense?
• Pig already comes with diagnostic operators, so extra testing will be overhead!
Skipping tests for any of the reasons above leads to even bigger problems, because testing in the Big Data world is data-driven by nature
What is PigUnit?
• PigUnit is a unit testing framework for Pig scripts
• It is not really a *Unit (xUnit-style) framework in its own right
• It’s a library that can be used within JUnit tests to
» Run Pig scripts from within JUnit tests
» Override variables in Pig scripts to provide data from tests rather than from external sources such as HDFS
» Inspect the values of your Pig script relations
» Turn your STORE statements into no-ops so that your Pig scripts run without side effects
TDD Using PigUnit - Demo 
Introduction to Big data tdd and pig unit
