BIG DATA
simplified!
Pravin Hanchinal
pravinhanchinal.com
Before we start...
Big Data
Big Data simplified
What can you do with Big Data?
Big Data
Big Data is a
cluster of many technologies and tools
that are used in various scenarios.
(Hadoop + HDFS + HCatalog + Flume + Power View)
(Hortonworks + Power View)
What can you do with Big Data?
Fetching
Processing
Visualizing
How Big is Big Data?
Byte of data: one grain of rice
Kilobyte: cup of rice
Megabyte: 8 bags of rice
Gigabyte: 3 container lorries
Terabyte: 2 container ships
Petabyte: covers Mumbai
Exabyte: covers India
Zettabyte: fills Indian Ocean
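The rice analogy above climbs the standard ladder of storage units. As a rough sketch, assuming the decimal (SI) convention where each unit is 1000 times the previous one, the same ladder can be written in Python. The helper names `bytes_in` and `humanize` are made up for this illustration and are not part of any library.

```python
# Decimal (SI) storage units: each step is assumed to be 1000x the previous one.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte", "zettabyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the given unit (decimal convention)."""
    return 1000 ** UNITS.index(unit)


def humanize(n_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(n_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}s"
        value /= 1000


if __name__ == "__main__":
    for unit in UNITS:
        print(f"1 {unit:<9} = 10^{3 * UNITS.index(unit)} bytes")
    print(humanize(5_300_000_000_000))
```

Operating systems often report sizes in binary multiples (1024 per step) instead; swapping 1000 for 1024 in both helpers gives that variant.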
Big Data Industry Overview
MapReduce
•MapReduce is a processing technique and a programming
model for distributed computing; Hadoop's implementation is written in Java.
•A MapReduce job consists of two main tasks,
namely Map and Reduce.
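To make the two tasks concrete, here is a minimal single-process sketch of the classic word-count example in Python. It only imitates the Map, shuffle and Reduce steps in memory; it does not use Hadoop or its Java API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are chosen for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would do."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the list of values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data simplified", "big data in a nutshell"]
    print(word_count(sample))
```

On a real cluster the map calls run in parallel on different nodes and the shuffle moves data across the network; the sketch keeps everything in one process so the data flow is easy to follow.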
MapReduce
Hadoop Cluster
What can you do with Big Data?
Get started with these:
Cloudera
Hortonworks
Why Big Data?
Business Intelligence
Hortonworks
Cloudera
Why Hadoop?
-> Hadoop modeling and development: MapReduce, Pig, Mahout
-> Hadoop storage and data management: HDFS, HBase, Cassandra
-> Hadoop data warehousing, summarization and query: Hive, Sqoop
-> Hadoop data collection, aggregation and analysis: Chukwa, Flume
-> Hadoop metadata, table and schema management: HCatalog
-> Hadoop cluster management, job scheduling and workflow: ZooKeeper, Oozie and Ambari
-> Hadoop data serialization: Avro
Big Data in a Nutshell
Got questions?
Text/WhatsApp on 974-086-1099
Stay connected
pravinhanchinal.com
What Next?
Dive in and Explore
Typical Use Case
Resources
http://pravinhanchinal.com/what-is-for-what-hadoop-tools
https://blog.cloudera.com/blog/2014/01/how-to-create-a-simple-hadoop-cluster-with-virtualbox/
http://pingax.com/install-apache-hadoop-ubuntu-cluster-setup/
https://de.slideshare.net/EdurekaIN/ha-webinar-48976388
https://ayende.com/blog/4435/map-reduce-a-visual-explanation
Multi-node on Amazon: https://dzone.com/articles/how-set-multi-node-hadoop
Run sample MapReduce examples:
http://www.informit.com/articles/article.aspx?p=2190194&seqNum=3
https://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-pig/
