Big Data Tools & Libraries
Apache Hadoop is an open-source platform for storing and
processing vast amounts of data ranging from gigabytes to
petabytes.
Apache Spark is an open-source distributed processing
system for big data workloads.
Hive is an open-source data warehouse framework built on
Apache Hadoop for querying and managing large datasets. It
provides a SQL-like interface (HiveQL) that allows users to
read, write, and manage petabytes of data.
HBase is a column-oriented, non-relational database
management system that runs on top of the Hadoop Distributed
File System (HDFS). It provides fault-tolerant storage for
sparse data sets, which are common in many big data applications.
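
To make "sparse, column-oriented" concrete, here is a minimal in-memory sketch. It is not the HBase API; the class name `SparseTable` and the `family:qualifier` column strings are assumptions made for illustration. The point is that only cells that actually hold a value take up space.

```python
class SparseTable:
    """Toy in-memory model of a sparse, column-oriented table (not the HBase API)."""

    def __init__(self):
        # row key -> {"family:qualifier": value}; absent cells are simply not stored
        self._rows = {}

    def put(self, row_key, column, value):
        """Store one cell under the given row key and column name."""
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column, default=None):
        """Return the cell value, or the default if the cell was never written."""
        return self._rows.get(row_key, {}).get(column, default)

    def stored_cells(self):
        """Count only the cells that were actually written."""
        return sum(len(columns) for columns in self._rows.values())


if __name__ == "__main__":
    table = SparseTable()
    table.put("user1", "info:name", "Asha")
    table.put("user2", "info:email", "ravi@example.com")
    print(table.get("user1", "info:email"))  # missing cell, nothing stored for it
    print(table.stored_cells())
```

Each row carries only the columns it uses, so rows with very different sets of columns can share one table without padding.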
Pig is a high-level scripting platform used in conjunction
with Apache Hadoop; its scripts are written in a language
called Pig Latin. Pig processes data from various sources,
both structured and unstructured, and stores the results in
the Hadoop Distributed File System (HDFS).
Apache Flume is an open-source distributed service for
collecting, aggregating, and moving large amounts of log and
event data from many sources into Hadoop storage such as
HDFS.
Hadoop MapReduce is an Apache open-source software
framework used for distributed processing of large data sets
across clusters of machines. Written in Java, the Hadoop
MapReduce framework is one of the most commonly used
technologies for processing and analyzing big data.
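
The MapReduce model can be sketched on a single machine as a word count. This is plain Python, not Hadoop code; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are assumptions chosen for illustration. In a real cluster the framework runs the map and reduce steps in parallel on many nodes and performs the grouping step itself.

```python
from collections import defaultdict


def map_phase(documents):
    """Emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run the three phases in order on an in-memory list of documents."""
    return reduce_phase(shuffle_phase(map_phase(documents)))


if __name__ == "__main__":
    print(word_count(["big data tools", "big data libraries"]))
```

Because each map call looks at one document and each reduce call looks at one key, the work splits cleanly across machines, which is what makes the model suitable for large data sets.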
YARN (Yet Another Resource Negotiator) is one of Apache
Hadoop's core components. It assigns system resources to the
applications running in a Hadoop cluster and schedules tasks
to run on the different cluster nodes.
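
The idea of assigning cluster resources to applications can be illustrated with a toy first-fit allocator. This is only a sketch under simplifying assumptions: memory is the sole resource, and the function name `assign_containers`, the node names, and the application names are all hypothetical. YARN's actual schedulers (for example the Capacity and Fair schedulers) are considerably more sophisticated.

```python
def assign_containers(node_memory_mb, requests):
    """First-fit: place each (app, memory_mb) request on the first node with room."""
    free = dict(node_memory_mb)  # copy so the caller's dict is left untouched
    placement = {}
    for app, needed in requests:
        for node, available in free.items():
            if available >= needed:
                free[node] = available - needed
                placement[app] = node
                break
        else:
            # No node has enough free memory; a real scheduler would queue the request.
            placement[app] = None
    return placement


if __name__ == "__main__":
    nodes = {"node1": 4096, "node2": 2048}
    requests = [("app-a", 3072), ("app-b", 2048), ("app-c", 2048)]
    print(assign_containers(nodes, requests))
```

Requests that do not fit anywhere are reported as unplaced rather than dropped, mirroring the way pending applications wait for capacity to free up.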
Apache ZooKeeper is an open-source server that provides
centralized coordination for distributed applications and
services, including configuration management, naming, and
synchronization.
Python's simple syntax and easy-to-manage code, together with
its data processing libraries, make it a popular choice for
Big Data work. Python scripts can typically be written in a
fraction of the time required in many other programming
languages.
Hadoop User Experience (Hue) is an open-source web interface
that simplifies the use of Apache Hadoop.
