Presented by
B Srujana
MTECH(CSE)
19K91D5813
Contents
1. Introduction
2. What is Big Data
3. Characteristics of Big Data
4. Storing, Selecting and Processing of Big Data
5. Big Data Examples
6. Tools Used in Big Data
7. Applications of Big Data
8. Future of Big Data
9. Conclusion
10. References
1. Introduction
• Big Data may well be the Next Big Thing in the IT world.
• Big data burst upon the scene in the first decade of the 21st
century.
• The first organizations to embrace it were online and startup
firms. Firms like Google, eBay, LinkedIn, and Facebook were
built around big data from the beginning.
• Like many new information technologies, big data can bring
about dramatic cost reductions, substantial improvements in
the time required to perform a computing task, or new
product and service offerings.
What is Data?
The quantities, characters, or symbols on which operations are
performed by a computer, which may be stored and
transmitted in the form of electrical signals and recorded on
magnetic, optical, or mechanical recording media.
2. What is Big Data
Big Data is data, but at a huge scale. The term
describes a collection of data that is already huge
and still growing exponentially with time. In short,
such data is so large and complex that traditional
data management tools cannot store or process it
efficiently.
3. Characteristics of Big Data
• Volume – The name Big Data itself refers to enormous size. The size of the data
plays a crucial role in determining the value that can be derived from it, and whether a
data set counts as Big Data at all depends on its volume. Volume is therefore the first
characteristic to consider when dealing with Big Data.
• Variety – Variety refers to the heterogeneous sources and nature of data, both
structured and unstructured. In earlier days, spreadsheets and databases were the only
sources of data considered by most applications. Nowadays, data in the form of emails,
photos, videos, monitoring-device readings, PDFs, audio, etc. is also included in analysis
applications. This variety of unstructured data poses challenges for storing, mining and
analyzing data.
• Velocity – Velocity refers to the speed at which data is generated. How fast data is
generated and processed to meet demand determines its real potential. Velocity deals
with the speed at which data flows in from sources such as business processes,
application logs, networks, social media sites, sensors and mobile devices. The flow of
data is massive and continuous.
• Variability – Variability refers to the inconsistency that data can show at times,
which hampers the ability to handle and manage it effectively.
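To make the characteristics concrete, here is a minimal sketch that profiles a small batch of records by volume, variety and velocity. The record layout, the sample values and the `profile` helper are all hypothetical, chosen only for illustration; a real pipeline would measure these on streaming data rather than on an in-memory list.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (arrival time, format, payload size in bytes).
records = [
    (datetime(2024, 1, 1, 12, 0, 0), "email", 2_000),
    (datetime(2024, 1, 1, 12, 0, 1), "photo", 3_500_000),
    (datetime(2024, 1, 1, 12, 0, 1), "log", 300),
    (datetime(2024, 1, 1, 12, 0, 4), "video", 48_000_000),
]


def profile(batch):
    """Summarize a non-empty batch by volume, variety and velocity."""
    timestamps = [ts for ts, _, _ in batch]
    span = (max(timestamps) - min(timestamps)).total_seconds()
    return {
        # Volume: total payload size of the batch.
        "volume_bytes": sum(size for _, _, size in batch),
        # Variety: how many records arrived in each format.
        "variety": Counter(fmt for _, fmt, _ in batch),
        # Velocity: records per second; undefined for a zero-length window.
        "velocity_per_sec": len(batch) / span if span > 0 else None,
    }


summary = profile(records)
print(summary)
```

Variability is not covered by this sketch; it would show up as changes in these numbers from one batch to the next.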
4. Storing, Selecting and Processing of Big Data
1. Storing
Analyzing your data characteristics
• Selecting data sources for analysis
• Eliminating redundant data
• Establishing the role of NoSQL
Overview of Big Data stores
• Data models: key-value, graph, document, column-family
• Hadoop Distributed File System
• HBase
• Hive
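The four data models listed above can be compared by writing the same small fact in each of them. The sketch below uses plain Python dictionaries and lists as stand-ins; the user names, keys and the `followed_by` helper are hypothetical, and no real NoSQL product or client API is being shown.

```python
import json

# Hypothetical fact used throughout: user "u1" (Asha) follows user "u2" (Ravi).

# Key-value: an opaque value looked up by a single key.
key_value = {"user:u1": '{"name": "Asha", "city": "Pune"}'}

# Document: a nested, self-describing record whose fields can be queried.
document = {
    "_id": "u1",
    "name": "Asha",
    "address": {"city": "Pune"},
    "follows": ["u2"],
}

# Column-family: row key -> column family -> column -> value.
column_family = {
    "u1": {
        "profile": {"name": "Asha", "city": "Pune"},
        "social": {"follows:u2": "1"},
    }
}

# Graph: nodes plus labelled edges between them.
graph = {
    "nodes": {"u1": {"name": "Asha"}, "u2": {"name": "Ravi"}},
    "edges": [("u1", "FOLLOWS", "u2")],
}


def followed_by(g, user_id):
    """Return the ids of the users that user_id follows, by walking edges."""
    return [dst for src, label, dst in g["edges"]
            if src == user_id and label == "FOLLOWS"]


# The key-value store must decode the whole value to read one field,
# while the graph model answers relationship questions directly.
print(json.loads(key_value["user:u1"])["city"])
print(followed_by(graph, "u1"))
```

Which model fits best depends on the access pattern: single-key lookups, field queries, wide sparse rows, or relationship traversal.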
2. Selecting
• Choosing the correct data stores based on your data characteristics
• Moving code to data and implementing polyglot data store
solutions
• Aligning business goals to the appropriate data store
3. Processing
Integrating disparate data stores
• Mapping data to the programming framework
• Connecting and extracting data from storage
• Transforming data for processing
• Subdividing data in preparation for Hadoop MapReduce
Employing Hadoop MapReduce
• Creating the components of Hadoop MapReduce jobs
• Distributing data processing across server farms
• Executing Hadoop MapReduce jobs
• Monitoring the progress of job flows
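The split, map, shuffle and reduce steps behind a MapReduce job can be sketched in a single Python process. This is not the Hadoop API: the function names are hypothetical, everything runs on one machine, and the classic word-count task is used only to illustrate the programming model that Hadoop distributes across a cluster.

```python
from collections import defaultdict


def split_input(lines, n_splits):
    """Subdivide the input into n_splits parts, one per map task."""
    return [lines[i::n_splits] for i in range(n_splits)]


def map_task(split):
    """Map phase: emit a (word, 1) pair for every word in the split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(mapped):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce phase: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines, n_splits=2):
    # On a cluster each map_task would run on the node holding its split.
    mapped = [map_task(s) for s in split_input(lines, n_splits)]
    return dict(reduce_task(k, v) for k, v in shuffle(mapped).items())


lines = ["big data is big", "data flows in fast"]
counts = word_count(lines)
print(counts)
```

Because each map task sees only its own split and each reduce task sees only one key, the result does not depend on how the input is subdivided, which is what allows the work to be spread across server farms.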
The Structure of Big Data
Structured
Any data that can be stored, accessed and processed in a
fixed format is termed 'structured' data.
Unstructured
Any data whose form or structure is unknown is classified
as unstructured data. In addition to its huge size,
unstructured data poses multiple challenges when it is
processed to derive value from it.
Semi-structured
Semi-structured data can contain both forms of data.
It appears structured in form, but it is not actually defined
by, e.g., a table definition in a relational DBMS.
Example: an XML file.
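As a small illustration of semi-structured data, the sketch below parses an XML fragment with Python's standard `xml.etree.ElementTree` module and flattens it into fixed-format rows. The `<employees>` document and its field names are hypothetical. Note that the second record has no `<dept>` tag: XML allows this, whereas a relational table would need the column declared up front.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured input: tags describe the data, but no
# schema forces every record to carry the same fields.
xml_text = """
<employees>
  <employee id="1"><name>Asha</name><dept>Finance</dept></employee>
  <employee id="2"><name>Ravi</name></employee>
</employees>
""".strip()

root = ET.fromstring(xml_text)

# Flatten into structured rows with a fixed set of columns.
rows = []
for emp in root.findall("employee"):
    rows.append({
        "id": emp.get("id"),
        "name": emp.findtext("name"),
        "dept": emp.findtext("dept"),  # None when the tag is absent
    })

for row in rows:
    print(row)
```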
Big data seminar
5. Examples of Big Data
New York Stock Exchange:
The New York Stock Exchange generates about one
terabyte of new trade data per day.
Social Media:
Statistics show that 500+ terabytes of new data are
ingested into the databases of the social media
site Facebook every day. This data is mainly
generated by photo and video uploads,
message exchanges, comments, etc.
Jet Engine:
A single jet engine can generate 10+ terabytes of data
in 30 minutes of flight time. With many thousands of
flights per day, data generation reaches
many petabytes.
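The jet engine figure can be turned into a rough daily estimate. Only the 10 terabytes per engine per 30 minutes comes from the slide (treated here as a lower bound); the number of flights, engines per aircraft and flight length are assumptions picked purely for illustration, and 1 petabyte is taken as 1,000 terabytes.

```python
TB_PER_ENGINE_PER_30_MIN = 10  # from the slide ("10+", so a lower bound)

# Assumed values, for illustration only.
FLIGHTS_PER_DAY = 5_000
ENGINES_PER_AIRCRAFT = 2
FLIGHT_MINUTES = 120


def daily_terabytes(flights, engines, minutes):
    """Estimate terabytes generated per day across all flights."""
    half_hours = minutes / 30
    return TB_PER_ENGINE_PER_30_MIN * half_hours * engines * flights


tb_per_day = daily_terabytes(FLIGHTS_PER_DAY, ENGINES_PER_AIRCRAFT, FLIGHT_MINUTES)
pb_per_day = tb_per_day / 1_000  # decimal petabytes

print(f"{tb_per_day:,.0f} TB per day = {pb_per_day:,.0f} PB per day")
```

Under these assumptions the estimate is 400,000 TB, or 400 PB, per day, which is consistent with the slide's "many petabytes".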
6. Top Tools Used in Big Data
• Apache Hadoop
• Apache Spark
• Apache Storm
• Cassandra
• RapidMiner
• MongoDB
• R Programming Tool
• Neo4j
Maximilien Brice, © CERN
7. Applications of Big Data Analytics
• Homeland Security
• Smarter Healthcare
• Multi-channel Sales
• Telecom
• Manufacturing
• Traffic Control
• Trading Analytics
• Search Quality
8. Future of Big Data
• $15 billion has been spent on software firms specializing only in data
management and analytics.
• This industry on its own is worth more than $100 billion and is
growing at almost 10% a year, roughly twice as fast as
the software business as a whole.
• In February 2012, the open source analyst firm Wikibon
released the first market forecast for Big Data, listing $5.1B
in revenue in 2012 with growth to $53.4B in 2017.
• The McKinsey Global Institute estimates that data volume is
growing 40% per year and will grow 44x between 2009 and
2020.
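The growth figures above can be cross-checked with compound-growth arithmetic. The sketch below uses only the numbers quoted on the slide; the helper names `cagr` and `growth_multiple` are hypothetical.

```python
def cagr(start, end, years):
    """Compound annual growth rate that takes start to end in the given years."""
    return (end / start) ** (1 / years) - 1


def growth_multiple(rate, years):
    """Total growth factor after compounding at rate for the given years."""
    return (1 + rate) ** years


# Wikibon forecast: $5.1B (2012) to $53.4B (2017).
wikibon_rate = cagr(5.1, 53.4, 2017 - 2012)

# McKinsey: 40% per year compounded from 2009 to 2020.
volume_multiple = growth_multiple(0.40, 2020 - 2009)

# Annual rate implied by 44x growth over the same 11 years.
implied_rate = cagr(1, 44, 2020 - 2009)

print(f"Wikibon forecast implies {wikibon_rate:.1%} growth per year")
print(f"40% per year for 11 years gives {volume_multiple:.1f}x")
print(f"44x over 11 years implies {implied_rate:.1%} per year")
```

The Wikibon forecast works out to about 60% growth per year. Compounding 40% per year over 11 years gives roughly 40x, and 44x corresponds to about 41% per year, so the two McKinsey figures agree to within rounding.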