Hadoop
CONTENT
• Introduction
• What is Hadoop?
• Hadoop Applications
• Hadoop Architecture
• Importance
• Advantages
• Disadvantages
• When to use Hadoop?
INTRODUCTION
• Hadoop is an open-source Apache framework, written in Java, that enables distributed processing of large datasets across clusters of computers using simple programming models.
• An application built on Hadoop runs in an environment that provides both distributed storage and distributed computation across those clusters.
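The "simple programming model" most closely associated with Hadoop is MapReduce. The sketch below imitates its three steps (map, shuffle, reduce) in plain Python on a single machine. It does not use any Hadoop API, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big clusters", "data nodes"])
```

On a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network. Only the shape of the computation is shown here.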
WHAT IS HADOOP?
• Hadoop began as a sub-project of Lucene (a collection of industrial-strength search tools) under the umbrella of the Apache Software Foundation; it has been a top-level Apache project since 2008.
• Hadoop parallelizes data processing across many nodes (computers) in a compute cluster, which speeds up large computations and hides I/O latency through increased concurrency.
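The idea of splitting work across nodes and combining partial results can be sketched on one machine. In the example below a thread pool stands in for the cluster nodes. That is an assumption made purely for illustration, as are the names `split_into_partitions`, `process_partition`, and `parallel_sum_of_squares`.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Deal records round-robin into num_partitions lists (one per 'node')."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def process_partition(partition):
    """Work done independently on one partition: a partial sum of squares."""
    return sum(x * x for x in partition)


def parallel_sum_of_squares(records, num_workers=4):
    """Process every partition concurrently, then combine the partial results."""
    partitions = split_into_partitions(records, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(process_partition, partitions))
    return sum(partial_results)


total = parallel_sum_of_squares(list(range(1, 101)))
```

Threads in one Python process mainly help when tasks wait on I/O. A cluster also adds CPUs and disks, which this sketch does not model.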
HADOOP APPLICATIONS
• Making Hadoop applications more widely accessible
• A graphical abstraction layer on top of Hadoop applications
HADOOP ARCHITECTURE
WHY IS HADOOP IMPORTANT?
• Ability to store and process huge amounts of any kind of data, quickly
• Computing power
• Fault tolerance
• Flexibility
• Low cost
• Scalability
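Fault tolerance here means that a job can still finish when a node fails, because the work is rescheduled elsewhere. The following is a minimal sketch of that idea only. The node names, the `failed_nodes` set, and the function `run_with_failover` are hypothetical and do not correspond to any Hadoop API.

```python
def run_with_failover(task, nodes, failed_nodes):
    """Run task on the first healthy node, skipping nodes marked as failed.

    Returns a (node, result) pair; raises if no healthy node is left.
    """
    for node in nodes:
        if node in failed_nodes:
            # Simulated crash: the scheduler moves on to the next node.
            continue
        return node, task()
    raise RuntimeError("no healthy node available")


node, result = run_with_failover(
    task=lambda: sum(range(10)),
    nodes=["node1", "node2", "node3"],
    failed_nodes={"node1"},
)
```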
ADVANTAGES OF HADOOP
• Scalable
• Cost effective
• Flexible
• Fast
• Resilient to failure
DISADVANTAGES
• Security concerns
• Not fit for small data
• Potential stability issues
• General limitations
CONTRIBUTIONS 2006 - 2011
“CORE” HADOOP
• Hadoop Common (formerly Hadoop Core)
• Hadoop MapReduce
• Hadoop YARN (resource management and job scheduling; the basis of MapReduce 2.0)
• Hadoop Distributed File System (HDFS)
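HDFS stores a file as fixed-size blocks and keeps several replicas of each block on different nodes. The sketch below illustrates that with tiny blocks and a simple round-robin placement. Real HDFS uses much larger blocks and a rack-aware placement policy, so the helpers `split_into_blocks` and `place_replicas` and the node names are simplifying assumptions.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + offset) % len(nodes)]
            for offset in range(replication)
        ]
    return placement


blocks = split_into_blocks(b"x" * 10, block_size=4)
placement = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"])
```

Because every block lives on three different nodes in this sketch, losing any single node still leaves two copies of each block.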
THE WIDER HADOOP ECOSYSTEM
• Ambari, ZooKeeper (cluster management, monitoring, and coordination)
• HBase, Cassandra (databases)
• Hive, Pig (data warehousing and query languages)
• Mahout (machine learning)
• Chukwa, Avro, Oozie, Giraph, and many more
WHEN TO USE HADOOP?
• Whenever “standard tools” no longer work because of sheer data size (rule of thumb: if your data fits on a regular hard drive, you're better off sticking to Python/SQL/Bash/etc.)
• Aggregation across large data sets: use the power of reducers
• Large-scale ETL (extract, transform, load) operations
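Aggregation with reducers follows a fixed pattern: each record is routed to the reducer that owns its key, and each reducer then combines the values for its own keys. A minimal single-machine sketch is shown below. The use of `zlib.crc32` for partitioning, the sample `sales` records, and the function names are assumptions for illustration, not Hadoop's own partitioner.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    """Pick a reducer for a key; crc32 keeps the choice stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def aggregate(records, num_reducers=2):
    """Sum values per key, routing each key to exactly one reducer."""
    # Shuffle: send every (key, value) record to the reducer that owns its key.
    reducer_inputs = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in records:
        reducer_inputs[partition(key, num_reducers)][key].append(value)
    # Reduce: each reducer sums the values of its own keys independently.
    totals = {}
    for groups in reducer_inputs:
        for key, values in groups.items():
            totals[key] = sum(values)
    return totals


sales = [("north", 10), ("south", 5), ("north", 7), ("east", 3), ("south", 1)]
totals = aggregate(sales)
```

Since a key always maps to the same reducer, no two reducers ever need to exchange partial totals, which is what lets this step scale out.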
Thank You
