Introduction to Big Data
Agenda
• What is Big Data
• Example of Big Data
• Drivers of Big Data: HIPO vs “Geeks”
• Potential of Big Data

Gan, Jeremy

eMOT | MG 8783: Cloud Computing

What is Big Data?
• Three V’s of Big Data
– Volume
– Velocity
– Variety

VOLUME: HOW MUCH DATA?

Volume: How Much Data?
• Kilo-: 10^3 bytes
• Mega-: 10^6 bytes
• Giga-: 10^9 bytes
• Tera-: 10^12 bytes
• Peta-: 10^15 bytes
• Exa-: 10^18 bytes
• Zetta-: 10^21 bytes
• Yotta-: 10^24 bytes

As of 2013
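
The prefix scale on this slide can be sketched in code. This is a minimal Python illustration, assuming the decimal (power-of-ten) prefixes from the slide; the helper name `humanize_bytes` is hypothetical and not part of the original deck.

```python
# Decimal prefixes from the slide, largest first, as powers of ten in bytes.
SI_PREFIXES = [
    ("yotta", 10**24),
    ("zetta", 10**21),
    ("exa", 10**18),
    ("peta", 10**15),
    ("tera", 10**12),
    ("giga", 10**9),
    ("mega", 10**6),
    ("kilo", 10**3),
]


def humanize_bytes(n: int) -> str:
    """Express a byte count using the largest prefix that fits (illustrative helper)."""
    for name, factor in SI_PREFIXES:
        if n >= factor:
            return f"{n / factor:.1f} {name}bytes"
    return f"{n} bytes"


if __name__ == "__main__":
    for size in (999, 2_500_000_000, 10**12, 10**21):
        print(size, "->", humanize_bytes(size))
```

Listing the prefixes largest first lets the loop stop at the first one that fits, so the Giga/Tera ordering (10^9 before 10^12) is explicit in the table rather than implied.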

Volume: How Much Data? (cont.)
• Hella- (~10^27 bytes), aka “Helluva-” (an informal, proposed prefix; not an official SI prefix)
Volume: How Much Data? (cont.)
“If we were to take all that information and store it in books, we could cover the entire area of the US or China in 3 layers of books.”
Martin Hilbert, Researcher, USC

VELOCITY: IMMEDIATE & REACTIVE (REAL-TIME DATA ANALYSIS)

NYSE collects over 1 TB of trade info EACH session
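
To put that figure in velocity terms, the arithmetic can be sketched as below. This is a rough Python illustration, not a measurement: it assumes exactly 1 TB (10^12 bytes) spread evenly over a regular 6.5-hour trading session (09:30 to 16:00), and the function name is hypothetical. Real trade traffic is bursty, so peak rates would be higher than this average.

```python
TB = 10**12                   # 1 terabyte in bytes (decimal prefix)
SESSION_SECONDS = 6.5 * 3600  # assumption: regular 09:30-16:00 session


def average_rate_mb_per_s(total_bytes: float, seconds: float) -> float:
    """Average throughput in megabytes per second (illustrative helper)."""
    return total_bytes / seconds / 10**6


rate = average_rate_mb_per_s(TB, SESSION_SECONDS)

if __name__ == "__main__":
    print(f"Average rate: {rate:.1f} MB/s")
```

Under these assumptions the average works out to roughly 43 MB every second for the whole session.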

Modern cars have over a HUNDRED sensors

Google Wallet Debit Card

iOS7 Location Tracking Map

NBC The Voice #InstantSave

Wasabi Waiter

VARIETY: DATA IN WHAT FORM?

Tweets

Facebook: Likes

Facebook: Mouse Cursor Tracking

Apple iBeacon

Variety: Data In What Form?
• Goal
– Identify patterns
– Gain insights

• Why?
– Combine big data with traditional data to better understand pain points
– Mitigate/limit negative impact
– Increase/create revenue stream

THREE V’S + 1 = VERACITY

Role of Data Scientist
• Keep data organized and accurate
• Poor data quality costs the U.S. economy roughly $3.1 trillion per year

Role of Data Scientist (cont.)
• Data used correctly could unlock limitless potential
– Prevent disease
– Combat crime
– Revolutionize global R&D
– Disrupt conventional business models
– Challenge the HIPO’s gut instinct

DRIVERS OF BIG DATA: HIPO VS “GEEKS” (EXAMPLE)

2012 Presidential Election
President Barack Obama
Gov. Mitt Romney
HIPO vs Geek
Michael Slaby, CTO, OFA 2008
Harper Reed, CTO, OFA 2012
Breakdown
• Innovative solution by leveraging big data
– Facebook information
• Personal interest: Preferences
• Location: Hyper-local, better content distribution
• Relevant: Contact efficiency
– Push innovation into sales by using data to have a conversation
– Twitter
• DM via President and First Lady’s Twitter accounts

Result

POTENTIAL OF BIG DATA

Limitless
• Research by McKinsey in Jan 2013
– Companies using large-scale big data to shape corporate strategy
• Example:
– IBM acquiring Kenexa Corp.
» Cloud (SaaS foundation) + big data (market insights)
» Remove “guesswork”, replacing it with precision
• Hiring: utilize behavioral traits
• Research by Harvard School of Public Health
– Big data could help prevent TB and reduce health care costs

Harper’s Thoughts on Healthcare.gov
Source: NYT.com



Editor's Notes

  • #12 (Google Wallet Debit Card): Insight into consumer spending to sell you targeted ads. Merchants gain access to your email address with every swipe, so they can send you emails.
  • #15 (Wasabi Waiter): Designed by a team of neuroscientists, psychologists, and data scientists to suss out human potential. Play one of them for just 20 minutes and you’ll generate several megabytes of data, exponentially more than what’s collected by the SAT. End result? A high-resolution portrait of your psyche and intellect, and an assessment of your potential as a leader or an innovator.