big data
So What?
12 October 2016
Who am I?
• Software guy
• Technology leader with experience in software
development as CTO and development manager
of mid-sized teams.
• Doing big data hands-on since 2009
• Running http://meetup.com/bigdatabe since 2011
(1700 members!)
@wimvanleuven
wim@bigboards.io
What is big data?
“Big data is data that exceeds the processing
capacity of conventional database systems.
The data is too big, moves too fast, or doesn’t fit
the strictures of your database architectures.”
–Edd Dumbill, O’Reilly
http://radar.oreilly.com/2012/01/what-is-big-data.html
…too big…
… moves too fast …
… doesn’t fit …
What is Big Data not?
• not a delivery model (on-premise vs hosted vs
cloud vs IaaS/PaaS/SaaS vs serverless)
• not a deployment model (private, public, hybrid)
• not a revenue model (license vs subscription vs
Pay-as-you-Go)
• not software architecture
“We don’t do Hadoop because we have Big
Data; we do Big Data because we have
Hadoop.”
–Unknown developer, Facebook
What is Big Data? — revisited
New tools and technologies to capture and
process data on a cluster of commodity
hardware so that the system acts as one,
is resilient to failures and scales linearly.
What is Big Data? — revisited
Big Data is no panacea
• First decide what problem you want to solve; pick
a real business problem to add immediate value
• Start small: the technology is made for linear
scalability (a 3-node cluster is a cluster!)
• Then become lean: learn through experimentation
Big Data challenges
• Beware of hype, Big Data-washing, and fads
• Tech infancy
• IT | Biz
• Data is hard
• Lack of skills!
Benefits
• Scalability of course
• Collect more and more data
• Robustness inherent to the setup
• More predictable performance
Questions?
Co-existence
[Diagram: Big Data alongside the existing landscape. Labels: View, ESB, App, ETL, DFS]
MapReduce
[Diagram: data blocks 1 to 5 spread across Node A, Node B, Node C, Node D and Node E; each block number appears three times among the nodes]
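The diagram shows every block stored on more than one node, which is what makes the setup resilient to failures. A minimal sketch of that idea follows, assuming a simple round-robin placement and three copies per block. The function names `place_blocks` and `surviving_blocks` are hypothetical, and the placement policy is an illustration only, not the policy of any particular distributed file system.

```python
def place_blocks(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical policy for illustration: copy r of block i goes to
    node (i + r) modulo the number of nodes.
    """
    if replication > len(nodes):
        raise ValueError("cannot place more copies than there are nodes")
    placement = {node: [] for node in nodes}
    for i, block in enumerate(blocks):
        for r in range(replication):
            node = nodes[(i + r) % len(nodes)]
            placement[node].append(block)
    return placement


def surviving_blocks(placement, failed_nodes):
    """Return the set of blocks still readable after some nodes fail."""
    return {
        block
        for node, stored in placement.items()
        if node not in failed_nodes
        for block in stored
    }


nodes = ["Node A", "Node B", "Node C", "Node D", "Node E"]
blocks = [1, 2, 3, 4, 5]
placement = place_blocks(blocks, nodes)

for node in nodes:
    print(node, placement[node])

# With three copies on distinct nodes, losing any two nodes loses no data.
print(surviving_blocks(placement, {"Node A", "Node B"}))
```

With three copies on distinct nodes, any two nodes can fail and every block is still readable from a remaining node.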
[Diagram: blocks 1 to 5 on Node A to Node E pass through the Map, Shuffle and Reduce phases, producing outputs x, y and z]
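The three phases named in the diagram can be sketched in a few lines of single-process Python. This is an illustration under assumptions, not a distributed implementation: the word-count task, the sample blocks, and the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `run_job` are all hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one input record into (key, value) pairs."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def run_job(blocks):
    """Run map, shuffle and reduce in sequence over all input blocks."""
    mapped = [pair for block in blocks for pair in map_phase(block)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Each string stands in for one data block held by a node.
blocks = ["big data so what", "data is hard", "big data is no panacea"]
counts = run_job(blocks)
print(counts)
```

On a cluster, the map calls would run on the nodes that hold each block and the shuffle would move grouped values between nodes; here everything runs in one process to keep the phases visible.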
𝛌
𝛋
Q&A