Slides by Gabriela Antunes Vieira
Course by TOON VANAGT
Toon Vanagt (@Toon)
Tech entrepreneur, lean startup coach & angel investor
Founder and managing director of several internet startups
Co-founder and board member of Fintech Belgium
Chairman of Open Knowledge Belgium
Datanews ICT manager SME of the year 2002
Betacowork co-owner (get your free trial!)
1 INTRODUCTION
What is Big Data? How is it classified? Where does it come from? And why is it important?
2 THE 5 Vs OF DATA
The main characteristics of Big Data
3 BIG DATA TECHNOLOGY
Big Data Technologies Landscape
4 HOW BIG DATA BECOMES SMART DATA
Turning Big Data into value.
5 SOME EXAMPLES
• What is Big Data?
• Classification of data
• Sources of Data
• The Importance of Big Data
Data in the 21st century is like oil in the 18th century: an
immensely valuable, untapped asset…
Big data refers to data sets that are too
large or complex for traditional data-
processing application software to
adequately deal with
The importance of Big Data depends on how
a company uses the data it collects.
If you can’t turn this data into value, it’s
useless.
• Better understanding of customers
• Optimization of processes
• Improved security
• Improved performance
MTurk aims to make accessing human intelligence simple,
scalable, and cost-effective. Businesses or developers
needing tasks done (called Human Intelligence Tasks or
“HITs”) can use the robust MTurk API to access thousands
of high quality, global, on-demand Workers—and then
programmatically integrate the results of that work
directly into their business processes and systems. MTurk
enables developers and businesses to achieve their goals
more quickly and at a lower cost than was previously
possible.
Structured data: data that has been organized into a formatted repository, typically a database, so that its elements can be made addressable for more effective processing and analysis.
Unstructured data: information that either does not have a pre-defined data model or is not organized in a pre-defined manner.
Semi-structured data: a form of structured data that does not conform with the formal structure of data models but nonetheless contains markers to separate semantic elements and enforce hierarchies of records and fields within the data.
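As a rough illustration of structured, semi-structured and unstructured data, the sketch below handles one tiny sample of each with the Python standard library. All sample values and the helper `total_revenue` are invented for this example and do not come from the course.

```python
import csv
import io
import json
import re

# Structured: fixed columns, so every value is addressable by field name.
structured_csv = "product_id,quantity,price\nA100,2,9.99\nB200,1,24.50\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: no fixed schema, but keys and nesting act as markers
# that separate fields and express hierarchy.
semi_structured_json = (
    '{"user": "alice", "tags": ["sport", "gps"], "device": {"type": "watch"}}'
)
record = json.loads(semi_structured_json)

# Unstructured: free text with no data model; any structure has to be
# extracted after the fact (here: a naive search for numbers).
unstructured_text = "Loved the new watch, ordered 2 more for my family!"
numbers_in_text = re.findall(r"\d+", unstructured_text)


def total_revenue(csv_rows):
    """Sum quantity * price over the structured rows."""
    return sum(int(r["quantity"]) * float(r["price"]) for r in csv_rows)


print(total_revenue(rows))
print(record["device"]["type"])
print(numbers_in_text)
```

The structured sample can be queried directly, the semi-structured one needs its hierarchy walked, and the unstructured one needs extraction logic before anything can be computed.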
MEDIA
• Images, videos, audio…
• Social media: Facebook, Twitter, YouTube, Instagram…
CLOUD
• Public, private or third-party cloud platforms
WEB
• Data publicly available on the web
MACHINE DATA
• Data generated from the interconnection of IoT devices
TRANSACTIONAL
• Product ID, distribution, payments…
Move up the information ladder by
asking users/patients for input
Combine, correlate and improve
quality of data sets
Bring new value from raw (open)
data sets
Visualise in new ways
Mine deeper to dig out “insights”
(not just basic statistics)
Any company can now run its
“own Google”
• Volume
• Velocity
• Variety
• Veracity
• Value
Open data is any content or info that people are free to use, re-use and redistribute, without any legal, technological or social restriction.
Publicly accessible data from websites, social networks, blogs, news feeds, product feeds (e-commerce) and more. Re-use is often not formalized and implicit…
Authentication/secured access is required to use proprietary corporate data, personal data or device data.
• Big Data Landscape
• Big Data Tools
• Process
Many (open source) big data tools can be
relatively cheap building blocks of your
‘refinery’
https://bit.ly/2GszxZF
• Smart Data Applications
• How to start smart ?
• Big Data Challenges
Fraud detection / Prevention
Targeted ads, product placement, brand sentiment analysis
Patient monitoring, Patient Care…
Proactive equipment repair, power and consumption matching
Bandwidth allocation, cell tower diagnostics …
Proactive maintenance, Decreasing time, supply planning …
Outbreak detection, network intrusion detection …
Route and time planning, traffic monitoring …
• Could an expert help to sense-check your results?
• Can you validate hypotheses?
• What further data do you need?
• What data do you have and how is it used?
• Do you have the expertise to manage your data?
• Are you being specific enough?
Course 1 - Introduction to Big Data by Toon Vanagt ( #BigDataBXL)
Users' weight, height, heart rate, ovulation cycles
and other data were shared
The information is collected in real time
Data is sent to FB via its Software Development Kit,
a set of open source software tools that devs can use
to create mobile apps
These apps use a Facebook analytics tool called
App Events, which lets developers track user activity on
FB
Retailers are spending lots of time analyzing your data to determine how
to sell you even more.
Target has figured out how to use shopper data to
determine whether a customer is expecting a baby, and when.
Everyone provides data through customer IDs tied to personally
identifiable information (PII) such as credit cards, emails, and loyalty
card numbers
Using shopping behavior data, Target could assign a pregnancy prediction
score to customers based on the purchase and purchase volume of about 25
different products in-store, regardless of baby registries.
The Target advertising team started to use a mix-and-match technique,
placing baby-related offers among unrelated products, which still allows
for proper targeting without freaking out customers
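A purchase-based prediction score of this kind can be sketched as a weighted sum of product purchases squashed into the range 0 to 1. This is only a minimal illustration: the product names, weights and bias below are hypothetical, and nothing here reflects Target's actual model, which the slides do not describe.

```python
import math

# Hypothetical signal products and weights, invented for illustration only.
SIGNAL_WEIGHTS = {
    "product_a": 1.2,
    "product_b": 0.9,
    "product_c": 0.6,
}
BIAS = -3.0  # hypothetical baseline so that most shoppers score low


def prediction_score(purchases):
    """Map {product: purchase count} to a score between 0 and 1.

    Products that are not in SIGNAL_WEIGHTS contribute nothing.
    """
    weighted = sum(
        SIGNAL_WEIGHTS.get(product, 0.0) * count
        for product, count in purchases.items()
    )
    # Logistic function: turns the weighted sum into a 0..1 score.
    return 1.0 / (1.0 + math.exp(-(BIAS + weighted)))


baseline = prediction_score({"bread": 3})
shopper = prediction_score({"product_a": 2, "product_b": 1, "bread": 1})
print(round(baseline, 3), round(shopper, 3))
```

Both the choice of products and the purchase volume move the score, which mirrors the slide's point that the prediction is based on what is bought and how much of it.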
The Global Heat Map is published by the GPS
tracking company Strava
It is built by aggregating the locations
and activities of people who use fitness devices
such as Fitbits
The map accidentally revealed the
locations of secret military bases in war zones in
the Middle East by tracking soldiers' movements
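The aggregation behind a heat map can be sketched as counting GPS points per grid cell. The coordinates, the `cell_size` of 0.01 degrees and the function name `heat_map` are assumptions made for this example; Strava's actual pipeline is not described in the slides.

```python
import math
from collections import Counter


def heat_map(points, cell_size=0.01):
    """Count (lat, lon) points per grid cell of cell_size degrees."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts


# Made-up tracks: three points from repeated runs in one small area,
# plus a single point elsewhere.
points = [
    (10.001, 20.001),
    (10.002, 20.002),
    (10.003, 20.003),
    (10.055, 20.055),
]
counts = heat_map(points)
hottest_cell, hits = counts.most_common(1)[0]
print(hottest_cell, hits)
```

Even without any user names attached, a cell with many hits in an otherwise empty region stands out, which is how aggregated activity data can expose a location.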