BIG DATA AND HADOOP 
Big Data History, Hadoop, and Industry Trends
Esther Kundin
Bloomberg LP
BIG DATA – WHAT IS IT?
WHAT IS BIG DATA?
BIG DATA ORIGINS 
Indexing the web requires lots of storage: petabytes of data!
Economic problem: reliable servers are expensive!
Solution: 
• Cram in as many cheap machines as possible
• Replace them when they fail
• Solve reliability via software!
Google publishes papers about: 
• GFS (2003)
• MapReduce (2004)
• Bigtable (2006)
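The "solve reliability via software" idea on this slide can be illustrated with a toy model. This is only a sketch, not how GFS or Hadoop is actually implemented: the `Cluster` class, the node and block names, and the replication factor of 3 are all assumptions made for illustration. Each block is copied to several cheap machines, and when one machine dies the software restores the missing copies on the survivors.

```python
import random

REPLICATION_FACTOR = 3  # assumed value; the slides do not specify one


class Cluster:
    """Toy model: many cheap machines, each block stored on several of them."""

    def __init__(self, machine_ids, seed=0):
        self.rng = random.Random(seed)
        self.machines = {m: set() for m in machine_ids}  # machine -> block ids

    def store(self, block_id):
        # Place copies of the block on distinct, randomly chosen machines.
        targets = self.rng.sample(sorted(self.machines), REPLICATION_FACTOR)
        for m in targets:
            self.machines[m].add(block_id)

    def replicas(self, block_id):
        # List the machines that currently hold a copy of the block.
        return sorted(m for m, blocks in self.machines.items() if block_id in blocks)

    def fail(self, machine_id):
        # The machine dies; software restores the lost copies elsewhere.
        lost = self.machines.pop(machine_id)
        for block_id in sorted(lost):
            holders = set(self.replicas(block_id))
            candidates = sorted(set(self.machines) - holders)
            if candidates and len(holders) < REPLICATION_FACTOR:
                self.machines[self.rng.choice(candidates)].add(block_id)

    def add_machine(self, machine_id):
        # "Replace them when they fail": a fresh, empty machine joins.
        self.machines[machine_id] = set()


cluster = Cluster([f"node{i}" for i in range(6)])
for b in range(10):
    cluster.store(f"block{b}")

cluster.fail("node2")          # a cheap machine breaks
cluster.add_machine("node6")   # and is swapped for a new one

# Every block still has a full set of replicas after the failure.
assert all(len(cluster.replicas(f"block{b}")) == REPLICATION_FACTOR for b in range(10))
```

No single machine needs to be reliable here; the data survives because the bookkeeping in `fail` restores the copy count whenever a machine is lost.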
HADOOP – OPEN SOURCE BIG DATA SOLUTION 
Hadoop, which began as part of the Apache Nutch project and was
developed largely at Yahoo, became an Apache top-level
project in 2008
CURRENT INDUSTRY TRENDS
QUESTIONS? 
Thank you!


Big Data and Hadoop lightning talk
