Big Data & Hadoop
Agenda
• 1 Introduction to big data
• 2 Introduction to Hadoop
• 3 HDFS
• 4 MapReduce
• 5 YARN
• 6 Hadoop ecosystem
• 7 Hadoop installation
• 8 MapReduce example
• 9 Apache Sqoop tutorial
• 10 Apache Flume tutorial
• 11 Apache Pig
• 12 Apache Hive
• 13 Apache HBase
• 14 Hadoop project
Introduction to big data
• Evolution of Technology
• IOT
• Social Media
• Other Factors…
What is Big Data?
• Big Data is the term for a collection of data sets so large and complex
that it becomes difficult to process with on-hand database system
tools or traditional data processing applications.
5 V's of Big Data
• Volume of Data
• Variety of Data: structured, semi-structured, unstructured
• Velocity
• Value: mechanism to bring the correct meaning out of data
• Veracity: uncertainty and inconsistencies in the data
Big Data Analytics
• Big data analytics examines large and varied types of data to
uncover hidden patterns, correlations, and other insights.
• Stages:
• Identify problem
• Designing data requirement
• Pre-processing data
• Performing Analytics on data
• Visualizing data
Types of Big Data Analytics
• Descriptive Analytics
• Predictive Analytics
• Prescriptive Analytics
• Diagnostic Analytics
Introduction to Hadoop
Apache Hadoop: Framework to Process Big Data
• Hadoop is a framework that allows us to store and process large data
sets in a parallel and distributed fashion.
Hadoop Master/Slave Architecture
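The idea of processing a data set in a parallel and distributed fashion can be sketched in plain Python. This is only an illustration on a single machine, not Hadoop itself: the function names (`split_into_chunks`, `process_chunk`, `parallel_total`) and the choice of four workers are assumptions made for the example. The data is split into chunks, each chunk is processed independently, and the partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_chunks):
    """Split a list into at most num_chunks pieces of roughly equal size."""
    if not records:
        return []
    size = -(-len(records) // num_chunks)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def process_chunk(chunk):
    """Work done independently on one chunk (here: a simple sum)."""
    return sum(chunk)


def parallel_total(records, num_workers=4):
    """Process every chunk on its own worker, then combine the partial results."""
    chunks = split_into_chunks(records, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    return sum(partial_results)


print(parallel_total(list(range(1, 101))))  # 5050
```

In a real cluster the chunks would live on different machines and the work would be sent to where the data is stored; the split, process, and combine shape stays the same.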
HDFS
• HDFS core components:
• NameNode
• DataNode
• Secondary NameNode
NameNode and DataNode
Secondary NameNode & Checkpointing
HDFS Data Blocks
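A small arithmetic sketch of how a file maps onto HDFS blocks. It assumes a block size of 128 MB and a replication factor of 3, which are the defaults in Hadoop 2.x and later; both are configurable, and the helper names below are hypothetical.

```python
BLOCK_SIZE_MB = 128      # assumed default block size (configurable)
REPLICATION_FACTOR = 3   # assumed default replication factor (configurable)


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes (in MB) of the blocks a file would be split into."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block only occupies what it needs
    return sizes


def raw_storage_mb(file_size_mb, replication=REPLICATION_FACTOR):
    """Total cluster storage consumed once every block is replicated."""
    return file_size_mb * replication


print(split_into_blocks(500))  # [128, 128, 128, 116]
print(raw_storage_mb(500))     # 1500
```

Under these assumptions a 500 MB file occupies four blocks, the last one only 116 MB, and consumes 1500 MB of raw storage across the cluster.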
Fault Tolerance
HDFS Write Mechanism and Acknowledgement
Multi-Block Write Mechanism
HDFS Read Mechanism
MapReduce
What is MapReduce?
Word Count Program
Mapper Code
Reduce Code
Driver Code
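The word-count flow can be sketched as a plain-Python simulation. This is not Hadoop's Java API: the function names (`mapper`, `shuffle`, `reducer`, `run_job`) and the sample input lines are assumptions chosen for illustration, and everything runs in a single process. The mapper emits a (word, 1) pair per word, the shuffle step groups the pairs by word, the reducer sums each group, and the driver wires the phases together.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)


def run_job(lines):
    """Driver: run map, shuffle, and reduce in order over the input lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in sorted(grouped.items()))


lines = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
print(run_job(lines))
# {'bear': 2, 'car': 3, 'deer': 2, 'river': 2}
```

In an actual Hadoop job the framework performs the shuffle between the map and reduce phases, and the mapper and reducer run as separate tasks across the cluster.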
YARN
MapReduce Job Workflow
Hadoop Architecture
Cluster Modes
