Hadoop

CONTENT
 Introduction
 What is Hadoop?
 Hadoop Applications
 Hadoop Architecture
 Importance
 Advantages
 Disadvantages
 When to use Hadoop?
 Reference

INTRODUCTION
 Hadoop is an Apache open-source
framework, written in Java, that allows
distributed processing of large datasets
across clusters of computers using simple
programming models.
 An application built on the Hadoop framework
runs in an environment that provides distributed
storage and computation across clusters of
computers.
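To make "simple programming models" concrete, here is a minimal sketch of the map, shuffle and reduce steps behind Hadoop MapReduce, written as plain Python that runs on one machine. It does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines (hypothetical helper)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
```

In a real cluster the map and reduce calls would run on different nodes, each working on its own slice of the input; the sketch only shows the shape of the model.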

WHAT IS HADOOP?
 Hadoop began as a sub-project of Lucene (a
collection of industrial-strength search tools)
and is now a top-level project of the Apache
Software Foundation.
 Hadoop parallelizes data processing across
many nodes (computers) in a compute
cluster, speeding up large computations and
hiding I/O latency through increased
concurrency.

HADOOP APPLICATIONS
 Making Hadoop Applications More Widely
Accessible
 A Graphical Abstraction Layer on Top of
Hadoop Applications

WHY IS HADOOP IMPORTANT?
 Ability to store and process huge amounts of
any kind of data, quickly
 Computing power
 Fault tolerance
 Flexibility
 Low cost
 Scalability

ADVANTAGES OF HADOOP
 Scalable
 Cost effective
 Flexible
 Fast
 Resilient to failure

DISADVANTAGES
 Security Concerns
 Not Fit for Small Data
 Potential Stability Issues
 General Limitations

“CORE” HADOOP
 Hadoop Common (formerly Hadoop Core)
 Hadoop MapReduce
 Hadoop YARN (MapReduce 2.0)
 Hadoop Distributed File System (HDFS)
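As a rough illustration of how HDFS supports distributed storage and fault tolerance, the sketch below splits a file into fixed-size blocks and places several copies of each block on different nodes. The 128 MB block size and replication factor of 3 are assumed here as typical settings, the round-robin placement is a deliberate simplification of real HDFS placement, and the names `place_blocks` and `readable_after_failure` are hypothetical.

```python
import math

BLOCK_SIZE_MB = 128  # assumed block size for this sketch
REPLICATION = 3      # assumed number of copies kept per block


def place_blocks(file_size_mb, nodes, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return {block_index: [node, ...]} using simple round-robin placement."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block in range(num_blocks):
        # Each replica of a block goes to a different node.
        placement[block] = [nodes[(block + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_after_failure(placement, failed_node):
    """True if every block still has at least one replica on a surviving node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


if __name__ == "__main__":
    layout = place_blocks(300, ["n1", "n2", "n3", "n4"])
    for block, replicas in layout.items():
        print(block, replicas)
    print("readable without n1:", readable_after_failure(layout, "n1"))
```

Because every block lives on more than one node, losing a single machine does not make the file unreadable, which is the idea behind the "fault tolerance" and "resilient to failure" points.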

THE WIDER HADOOP ECOSYSTEM
 Ambari, ZooKeeper (managing & monitoring)
 HBase, Cassandra (database)
 Hive, Pig (data warehouse and query language)
 Mahout (machine learning)
 Chukwa, Avro, Oozie, Giraph, and many more

WHEN TO USE HADOOP?
 Generally, whenever “standard tools” no longer
work because of sheer data size
(rule of thumb: if your data fits on a regular hard
drive, you’re better off sticking to
Python/SQL/Bash/etc.!)
 Aggregation across large data sets: use the power
of Reducers!
 Large-scale ETL (extract, transform, load)
operations
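The "power of Reducers" point can be sketched as a reducer that sums values per key. The example assumes input in the form a Hadoop Streaming reducer script typically sees, that is, tab-separated `key<TAB>value` lines already sorted by key; the country-code sample data and the names `parse` and `reduce_sorted` are made up for illustration.

```python
from itertools import groupby


def parse(line):
    """Split one 'key<TAB>value' line into (key, int_value)."""
    key, value = line.rstrip("\n").split("\t", 1)
    return key, int(value)


def reduce_sorted(lines):
    """Yield (key, total) pairs; lines must already be sorted by key."""
    records = (parse(line) for line in lines)
    for key, group in groupby(records, key=lambda record: record[0]):
        yield key, sum(value for _, value in group)


if __name__ == "__main__":
    # Hypothetical mapper output: per-record counts keyed by country code.
    mapped = ["de\t5", "us\t3", "de\t7", "us\t1"]
    # Sorting here stands in for the framework's shuffle-and-sort step.
    for key, total in reduce_sorted(sorted(mapped)):
        print(f"{key}\t{total}")
```

Since each reducer only ever holds one key's running total, the same logic works whether the input is a handful of lines or far more data than fits on one machine.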

