Big Data Ecosystem
@EdPimentl
Distributed Filesystem
Apache HDFS
Red Hat GlusterFS
NoSQL Databases
Apache HBase
Apache Cassandra
Key-Value Data Model
Redis DB
LinkedIn Voldemort
Distributed Programming
Apache MapReduce
Apache Pig
Document Data Model
MongoDB
RethinkDB
Graph Data Model
ArangoDB
TitanDB
Here is a partial list of the Big Data ecosystem
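The key-value, document, and graph data models named above differ mainly in how much structure the store itself understands. The sketch below is only an illustration under stated assumptions: plain Python containers stand in for the stores, and the record, field names, and helper functions (`find_documents`, `names_followed_by`) are hypothetical, not the API of any product listed here.

```python
# One hypothetical user record expressed in three data models.
# Plain Python containers stand in for the stores; no database client is used.

# Key-value model: an opaque value looked up by a single key.
key_value_store = {"user:42": b'{"name": "Ada", "follows": [7]}'}

# Document model: the store sees the nested fields and can filter on them.
document_collection = [
    {"_id": 42, "name": "Ada", "follows": [7]},
    {"_id": 7, "name": "Alan", "follows": []},
]

# Graph model: entities are vertices, relationships are first-class edges.
vertices = {42: {"name": "Ada"}, 7: {"name": "Alan"}}
edges = [(42, "follows", 7)]


def find_documents(field, value):
    """Return every document whose top-level field equals value."""
    return [doc for doc in document_collection if doc.get(field) == value]


def names_followed_by(user_id):
    """Follow outgoing 'follows' edges from one vertex and return target names."""
    return [
        vertices[dst]["name"]
        for src, label, dst in edges
        if src == user_id and label == "follows"
    ]


print(key_value_store["user:42"])      # raw bytes; the caller must parse them
print(find_documents("name", "Ada"))   # lookup by field value
print(names_followed_by(42))           # relationship traversal
```

The same "who does Ada follow" question needs client-side parsing in the key-value case, a field lookup in the document case, and an edge traversal in the graph case.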
Data Ingestion
Apache Flume
Apache Storm
Scheduling
Apache Falcon
Apache Oozie
System Deployment
Apache Ambari
Cloudera HUE
Apache Mesos
Service Programming
Apache Zookeeper
LinkedIn Norbert
Twitter Elephant Bird
Machine Learning
WEKA
Cloudera Oryx
Apache Mahout
Others
Accumulo
SQL-on-Hadoop
Apache Hive
Apache Drill
What is a Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte, and Exabyte?
Byte (8 bits)
0.1 bytes (roughly one bit): A binary decision
Kilobyte (1 000 bytes)
2 Kilobytes: A typewritten page
Megabyte (1 000 000 bytes)
2 Megabytes: A high-resolution photograph
Gigabyte (1 000 000 000 bytes)
1 Gigabyte: A pickup truck filled with paper OR a symphony in high-fidelity sound OR a movie at TV quality
Terabyte (1 000 000 000 000 bytes)
10 Terabytes: The printed collection of the US Library of Congress
Petabyte (1 000 000 000 000 000 bytes)
2 Petabytes: All US academic research libraries
20 Petabytes: Production of hard-disk drives in 1995
Exabyte (1 000 000 000 000 000 000 bytes)
5 Exabytes: All words ever spoken by human beings
Nice description by Julian Bunn
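The unit ladder above uses the decimal convention, where each unit is 1000 times the previous one. A minimal sketch of that arithmetic follows; the names `UNITS`, `unit_size`, and `humanize` are hypothetical helpers chosen for illustration, and the binary (1024-based) convention is deliberately not covered.

```python
# Decimal (SI) units as listed on the slide: each step is a factor of 1000.
UNITS = ["Byte", "Kilobyte", "Megabyte", "Gigabyte",
         "Terabyte", "Petabyte", "Exabyte"]


def unit_size(name: str) -> int:
    """Return the number of bytes in one unit, e.g. 'Kilobyte' -> 1000."""
    return 1000 ** UNITS.index(name)


def humanize(num_bytes: int) -> str:
    """Express a byte count in the largest listed unit with a value below 1000."""
    value = float(num_bytes)
    for name in UNITS:
        # Stop at the first unit that fits, or at Exabyte, the largest listed.
        if value < 1000 or name == UNITS[-1]:
            suffix = "" if value == 1 else "s"
            return f"{value:g} {name}{suffix}"
        value /= 1000


print(humanize(2 * unit_size("Megabyte")))   # the photograph example
print(humanize(20 * unit_size("Petabyte")))  # the 1995 hard-disk example
print(unit_size("Exabyte"))                  # bytes in one exabyte
```

Counts beyond the exabyte range are reported in exabytes, since that is the largest unit the slide defines.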
Related Links
Open Data will hit every industry sector within 10 years https://lnkd.in/eBbzTY7
http://blog.knuthaugen.no/2010/03/a-brief-history-of-nosql.html
http://www.zdnet.com/article/traditional-databases-vs-the-threat-from-in-memory-nosql/?_escaped_fragment_=#!
http://arstechnica.com/information-technology/2013/07/the-hot-new-technology-in-big-data-is-decades-old-sql/