Introduction to Big Data
A Glance at Data in the Modern Era
Data Classification
Structured
• Granular queryability
• Tables with rows and columns
• E.g.: relational databases (RDBMS) queried with SQL
• Contribution: 5% of data
Semi-Structured
• Falls on the spectrum between structured and unstructured
• Contains tags; the schema is carried within the data itself
• E.g.: XML, JSON, NoSQL stores
Unstructured
• Not directly queryable
• E.g.: audio, video, text, images, e-mail
• Contribution: 80% of data
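The three classes above can be illustrated with a minimal Python sketch. The table name, field names and sample values are hypothetical and chosen only for illustration: structured data is queried with SQL, semi-structured data carries its field names inside each record, and unstructured text offers nothing better than a substring search until it is processed further.

```python
import json
import sqlite3

# Structured: fixed schema, rows and columns, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 30.0), (2, "bob", 12.5), (3, "alice", 7.5)],
)
structured_total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("alice",)
).fetchone()[0]

# Semi-structured: the field names (the schema) travel inside the data,
# and records are not required to share the same fields.
semi_structured = json.loads(
    '[{"id": 1, "customer": "alice", "amount": 30.0},'
    ' {"id": 2, "customer": "bob", "tags": ["gift"]}]'
)
fields_per_record = [sorted(record) for record in semi_structured]

# Unstructured: free text has no fields to query; without further
# processing, a plain substring search is about all that is available.
unstructured = "Alice called to say her order arrived late but she liked it."
mentions_alice = "alice" in unstructured.lower()

print(structured_total, fields_per_record, mentions_alice)
```

Note that the second JSON record has no amount field at all, which a fixed table schema would not allow without a NULL column.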
Overview of Big Data
• What?
Big data is a term for the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis.
• Why?
i. Data sets are so large and complex that they are difficult to process with traditional data processing methods.
ii. This calls for innovative solutions that turn a variety of new and existing data into real business benefits.
• Where?
i. Analysing data for insights that lead to better decisions and strategic business moves.
ii. Processing large volumes or wide varieties of data remains merely a technological solution unless it is tied to business goals and objectives.
iii. Greater operational efficiency, reduced risk and lower costs.
iv. Revealing patterns, trends and associations related to human behavior and interactions.
v. Better understanding of consumer habits and better-targeted marketing campaigns.
Characteristics of Big Data
While the term “big data” is relatively new, the act of gathering and storing large amounts of information for eventual analysis is ages old. The concept gained momentum in the early 2000s when industry analyst Doug Laney articulated it in terms of volume, velocity and variety; veracity is often added as a fourth characteristic.
Big Data: Volume, Velocity, Variety, Veracity
Volume: Data is projected to grow tenfold, from 4.4 zettabytes today to around 44 zettabytes.
Velocity: By 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.
Variety: Smartphones will continue to ship packed with sensors capable of collecting all kinds of data, not to mention the data users create themselves.
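To put the Volume and Velocity figures in perspective, here is a small back-of-the-envelope calculation in Python. It uses only the numbers quoted above and assumes decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Volume: growth factor implied by going from 4.4 to 44 zettabytes.
growth_factor = 44 / 4.4

# Velocity: 1.7 MB per second per person, accumulated over one day.
mb_per_person_per_day = 1.7 * SECONDS_PER_DAY
gb_per_person_per_day = mb_per_person_per_day / 1000  # assumes 1 GB = 1000 MB

print(f"Growth factor: about {growth_factor:.0f}x")
print(f"Per person per day: about {gb_per_person_per_day:.1f} GB")
```

At the quoted rate, each person would account for roughly 147 GB of new data per day.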
4 V’s
Volume
The enormous amount of data generated by machines, networks and human interaction on systems such as social media.
Velocity
The pace at which data flows in from sources such as business processes, machines, networks and human interaction with social media sites, mobile devices, etc. The flow of data is massive and continuous.
Variety
Variety refers to the many sources and types of data, both structured and unstructured. Data now comes in the form of emails, photos, videos, monitoring-device output, PDFs, audio, etc. This variety of unstructured data creates problems for storing, mining and analyzing data.
Veracity
Veracity refers to the biases, noise and abnormality in data. The question to ask is whether the data being stored and mined is meaningful to the problem being analyzed.
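As one possible illustration of a veracity check, the sketch below flags readings that are missing or that sit far from the median of the rest. The function name, the sample readings and the 3.5 threshold are assumptions made for this example; the check uses the median absolute deviation (a common robust outlier heuristic), not a method prescribed by these slides.

```python
from statistics import median

def flag_suspect_readings(readings, threshold=3.5):
    """Return indices of readings that are missing or far from the median.

    Hypothetical helper: uses a modified z-score based on the median
    absolute deviation (MAD). Expects at least one non-missing reading.
    """
    present = [r for r in readings if r is not None]
    center = median(present)
    mad = median(abs(r - center) for r in present)
    suspect = []
    for i, r in enumerate(readings):
        if r is None:
            suspect.append(i)  # missing value
        elif mad > 0 and 0.6745 * abs(r - center) / mad > threshold:
            suspect.append(i)  # abnormal value
    return suspect

# Made-up sensor readings: one missing value and one obvious outlier.
readings = [20.1, 19.8, None, 20.3, 95.0, 20.0]
suspect = flag_suspect_readings(readings)
print(suspect)
```

Flagged entries are candidates for review rather than automatic deletion, since an unusual value may still be meaningful to the problem being analyzed.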


Big data intro.pptx
