By Abhishek Palo
Regd no: 1301308234
CONTENTS
➢ AN INTRODUCTION TO THE WORLD OF DATA
➢ WHAT IS BIG DATA?
➢ WHO CREATED BIG DATA?
➢ WHEN WAS IT CREATED?
➢ WHY BIG DATA?
➢ WHERE ARE WE USING IT?
➢ HOW TO USE IT?
THE 5 W’S AND 1 H
OF BIG DATA
WHAT IS DATA?
Facts and statistics collected together for reference or analysis.
OR
The quantities, characters, or symbols on which operations
are performed by a computer, which may be stored and
transmitted in the form of electrical signals and recorded
on magnetic, optical, or mechanical recording media.
Types of data
➢ Traditional (can be handled by an RDBMS)
Document
Finance
Stock record
Personal files
➢ Modern (difficult for an RDBMS to handle)
Photographs
Audio & Video
3D Model
Simulation
Location Data
What is BIG DATA?
Big data generally refers to data sets that traditional
software tools cannot capture, curate, manage and process
within a tolerable elapsed time.
Because “big” is a constantly moving target, currently
ranging from a few terabytes to many petabytes of data, big
data can also be described as the set of techniques required to
uncover hidden value in data sets that are
diverse, complex and of a massive scale.
Who created BIG DATA?
➢ In his 2001 paper “3D Data Management: Controlling
Data Volume, Velocity and Variety”, Doug Laney, an analyst at META Group
(later acquired by Gartner), defines three of what will come to be the
commonly accepted characteristics of Big Data.
➢ Commentators announce in 2005 that we are witnessing the
birth of “Web 2.0”: the user-generated web, where the majority of
content is provided by users of services rather than by the service
providers themselves. The same year also sees the emergence of
Hadoop, an open source platform used to store and process big data.
Contd...
➢ In 2008 the world’s servers process 9.57 zettabytes (9.57 trillion
gigabytes) of information, equivalent to 12 gigabytes of information
per person, per day, according to the How Much Information? 2010
report. In International Production and Dissemination of Information,
it is estimated that 14.7 exabytes of new information are produced this
year.
➢ In 2010 Eric Schmidt, then CEO of Google, tells a conference
that as much data is now being created every two days as was created
from the beginning of human civilization to the year 2003.
➢ In 2014 mobile machines are on the rise: for the first time, more
people are using mobile devices to access digital data than office or
home computers. 88% of business executives surveyed by GE working
with Accenture report that big data analytics is a top priority for their
business.
Contd...
What this teaches us is that Big Data is not a new or
isolated phenomenon, but one that is part of a long
evolution of capturing and using data. Like other key
developments in data storage, data processing and the
Internet, Big Data is just a further step that will bring
change to the way we run business and society. At the same
time it will lay the foundations on which many evolutions
will be built.
The 3 V’s of big data
Volume (amount of data)
Velocity (speed of processing)
Variety (range and source)
Volume
As the number of users increases day by day, the amount of
data they use increases along with it.
Organisation    Data processed (per day)
eBay            100 PB
Google          100 PB
Baidu           10-100 PB
NSA             29 PB
Spotify         600 PB
Facebook        100 PB
Twitter         64 PB
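To give a feel for daily volumes of this size, the sketch below converts petabytes per day into an average sustained rate in gigabytes per second. It assumes decimal units (1 PB = 10^15 bytes) and simply reuses some of the figures listed above; the function and variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds
BYTES_PER_PB = 10**15           # decimal petabyte (assumption)
BYTES_PER_GB = 10**9            # decimal gigabyte (assumption)


def pb_per_day_to_gb_per_second(pb_per_day: float) -> float:
    """Average rate in GB/s needed to process the given daily volume."""
    bytes_per_day = pb_per_day * BYTES_PER_PB
    return bytes_per_day / SECONDS_PER_DAY / BYTES_PER_GB


# Daily volumes taken from the table above.
daily_volume_pb = {"eBay": 100, "NSA": 29, "Twitter": 64}

for organisation, pb in daily_volume_pb.items():
    rate = pb_per_day_to_gb_per_second(pb)
    print(f"{organisation}: {pb} PB/day is about {rate:,.0f} GB/s")
```

At 100 PB per day, the average works out to a little over 1,000 GB every second, which no single machine can sustain.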
Contd...
Analysing this amount of data would make it easier for
companies to understand their customers. However,
traditional data processing systems cannot handle
data at this scale, so we need a more scalable approach to data
processing, which is what BIG DATA provides.
Velocity
➢ The amount of data uploaded and downloaded by
the users of some organisations exceeds the capacity
of their IT systems.
➢ About 90% of all the data produced in the last 20 years
was produced in the last 5 years.
➢ Data arriving at this speed can’t be processed using
traditional RDBMS concepts.
Variety
➢ Previously we dealt with only a few varieties of data, such as
documents, finance data, stock records and personal files.
➢ Nowadays we have to deal with many kinds of data,
such as videos, music, photographs, simulations and
3D models.
Big data classification
Contd...
➢ Analysis type: Whether the data is analyzed in real time or batched
for later analysis. Give careful consideration to choosing the analysis
type, since it affects several other decisions about products, tools,
hardware, data sources, and expected data frequency. A mix of both
types may be required by the use case:
▪ Fraud detection: analysis must be done in real time or near real time.
▪ Trend analysis for strategic business decisions: analysis can be in batch
mode.
➢ Processing methodology: The type of technique to be applied for
processing data (e.g., predictive, analytical, ad-hoc query, and
reporting). Business requirements determine the appropriate
processing methodology. A combination of techniques can be used.
The choice of processing methodology helps identify the appropriate
tools and techniques to be used in your big data solution.
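The difference between the two analysis types can be sketched in a few lines. This is an illustration only: the transaction amounts, the threshold rule, and the function names are all hypothetical. The batch function waits for the complete data set before answering, while the streaming function emits a decision as each record arrives, which is what a fraud check needs.

```python
from typing import Iterable, Iterator


def batch_average(amounts: list[float]) -> float:
    """Batch mode: all records are available up front, one answer at the end."""
    return sum(amounts) / len(amounts)


def flag_suspicious(
    amounts: Iterable[float], factor: float = 3.0
) -> Iterator[tuple[float, bool]]:
    """Streaming mode: decide per record, using only the records seen so far.

    A record is flagged when it exceeds `factor` times the running average
    of the earlier records (a deliberately simple, made-up rule).
    """
    count = 0
    total = 0.0
    for amount in amounts:
        suspicious = count > 0 and amount > factor * (total / count)
        yield amount, suspicious
        count += 1
        total += amount


transactions = [20.0, 25.0, 22.0, 400.0, 21.0]  # hypothetical amounts

print("Batch average:", batch_average(transactions))
for amount, suspicious in flag_suspicious(transactions):
    print(amount, "SUSPICIOUS" if suspicious else "ok")
```

The streaming version never needs the whole list in memory, so the same loop could read from a live feed instead of a Python list.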
Contd...
➢ Content format: The format of incoming data, which may be structured
(RDBMS, for example), unstructured (audio, video, and images, for example), or
semi-structured. Format determines how the incoming data needs to
be processed and is key to choosing tools and techniques and defining
a solution from a business perspective.
➢ Data type: The type of data to be processed, such as transactional, historical,
master data, and others. Knowing the data type helps segregate the
data in storage.
➢ Data frequency and size: How much data is expected and at what
frequency it arrives. Knowing frequency and size helps determine
the storage mechanism, storage format, and the necessary
preprocessing tools. Data frequency and size depend on data sources:
▪ On demand, as with social media data
▪ Continuous feed, real-time (weather data, transactional data)
▪ Time series (time-based data)
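How content format drives processing can be shown with Python’s standard library. The sample records below are invented for illustration, and the three function names are hypothetical. Structured rows map directly onto columns, semi-structured JSON carries its own nested and optional fields, and unstructured bytes expose nothing but their size until a specialised tool interprets them.

```python
import csv
import io
import json

# Hypothetical sample inputs, one per content format.
structured = "id,amount\n1,20.5\n2,13.0\n"                                # fixed columns
semi_structured = '{"id": 3, "amount": 7.25, "tags": ["online", "gift"]}'  # flexible fields
unstructured = bytes(range(16))                                            # stand-in for image or audio bytes


def parse_structured(text: str) -> list[dict[str, str]]:
    """Every row has the same columns, so it fits a relational table."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_semi_structured(text: str) -> dict:
    """Fields are self-describing and may be nested or missing."""
    return json.loads(text)


def describe_unstructured(blob: bytes) -> dict[str, int]:
    """No schema at all: without further tooling only the size is known."""
    return {"size_bytes": len(blob)}


print(parse_structured(structured))
print(parse_semi_structured(semi_structured))
print(describe_unstructured(unstructured))
```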
Contd...
➢ Data source: Sources of data (where the data is generated), such as web
and social media, machine-generated, human-generated, etc.
Identifying all the data sources helps determine the scope from a
business perspective. The figure shows the most widely used data
sources.
➢ Data consumers: A list of all of the possible consumers of the
processed data: Business processes, Business users, Enterprise
applications, Individual people in various business roles, Part of the
process flows, Other data repositories or enterprise applications.
➢ Hardware: The type of hardware on which the big data solution will
be implemented, either commodity hardware or state of the art.
Understanding the limitations of hardware helps inform the choice of
big data solution.
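One way to make these classification dimensions concrete is to record them as a small profile per data set. The following is a minimal sketch, not a standard schema: the class names, field names, and the example values are all hypothetical, and the `needs_low_latency` rule is a simplification of the fraud detection versus trend analysis distinction above.

```python
from dataclasses import dataclass, field
from enum import Enum


class AnalysisType(Enum):
    REAL_TIME = "real time"
    BATCH = "batch"


class ContentFormat(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


@dataclass
class DataSetProfile:
    """One field per classification dimension listed above (names are illustrative)."""
    name: str
    analysis_type: AnalysisType
    processing_methodology: str
    content_format: ContentFormat
    data_type: str
    frequency: str
    sources: list[str] = field(default_factory=list)
    consumers: list[str] = field(default_factory=list)
    hardware: str = "commodity"

    def needs_low_latency(self) -> bool:
        """Real-time analysis implies results are needed as the data arrives."""
        return self.analysis_type is AnalysisType.REAL_TIME


# Hypothetical profile for a fraud detection use case.
fraud_feed = DataSetProfile(
    name="card transactions",
    analysis_type=AnalysisType.REAL_TIME,
    processing_methodology="predictive",
    content_format=ContentFormat.STRUCTURED,
    data_type="transactional",
    frequency="continuous feed",
    sources=["machine-generated"],
    consumers=["business processes"],
)

print(fraud_feed.name, "needs low latency:", fraud_feed.needs_low_latency())
```

Filling in such a profile for each data set makes it easier to compare candidate tools against the same set of requirements.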
Big data technology
Where we’re using it
➢ Medicine and healthcare
➢ Social networking sites
➢ Surveys
➢ Science and research
➢ Real estate
➢ Retail
➢ Banking
➢ Internet of things
➢ Government sectors
Conclusion
The availability of Big Data, low-cost commodity hardware, and
new information management and analytic software have produced a
unique moment in the history of data analysis. The convergence of
these trends means that we have the capabilities required to analyze
astonishing data sets quickly and cost-effectively for the first time in
history. These capabilities are neither theoretical nor trivial. They
represent a genuine leap forward and a clear opportunity to realize
enormous gains in terms of efficiency, productivity, revenue, and
profitability.
The Age of Big Data is here, and these are truly revolutionary times
if both business and technology professionals continue to work
together and deliver on the promise.
THANK YOU
