Bigdata
Big Data
• Big Data is a term for data volumes so large that they are difficult to store, process, and analyze
with traditional database technology.
Other definitions
• Big Data refers to extremely large datasets that are too complex and voluminous
for traditional data processing software to handle efficiently.
• Big data is a combination of structured, semi-structured and unstructured data
that organizations collect, analyze and mine for information and insights.
• Big data refers to the large, diverse sets of information that grow at ever-increasing rates.
• Big data is a voluminous collection of structured, unstructured, and semi-structured
data that is challenging to manage using traditional data processing tools.
Characteristics of Big Data
The "V"s of Big Data: key characteristics of Big Data
• Volume: The amount of data generated is vast and continuously growing. Big Data deals with large
quantities of data from various sources.
• Velocity: The speed at which data is generated, collected, and processed. Big Data requires the ability to
handle this rapid arrival of data in real-time or near real-time.
• Variety: The different types of data, structured (like databases), semi-structured (like XML or JSON),
and unstructured (like text, images, and videos), that need to be analyzed.
• Veracity: The quality and accuracy of the data. It addresses the trustworthiness and reliability of the data
being used.
• Value: The potential insights and benefits that can be derived from analyzing the data. This V emphasizes
the importance of turning data into actionable insights.
• Variability: The inconsistency of the data, which can vary greatly in terms of quality and format. This
characteristic highlights the challenges in processing and managing data that may change over time.
• Visualization: The ability to represent data in a way that is understandable and actionable, often through
charts, graphs, and other visual tools.
Examples of each V
1. Volume:
Example: Social media platforms like Facebook generate massive amounts of data
every day. Each user's posts, comments, likes, and shares contribute to an enormous
volume of data that needs to be stored and analyzed.
2. Velocity:
Example: Stock trading systems must handle a continuous, very high-rate stream of orders
and price updates. The speed at which this data is generated, and at which it must be
processed to make real-time trading decisions, exemplifies the velocity characteristic.
3. Variety:
Example: An e-commerce company collects data from various sources such as
transaction records, customer reviews (text), social media interactions (images,
videos), and customer service calls (audio). This mix of data types demonstrates the
variety in Big Data.
4. Veracity:
Example: In healthcare, patient data can come from various sources like medical
records, wearable devices, and patient surveys. Ensuring this data is accurate and
reliable (veracity) is crucial for making correct diagnoses and treatment plans.
5. Value:
Example: A retailer analyzes customer purchasing patterns and discovers that sales
of certain products increase during specific times of the year. This insight (value)
allows the retailer to optimize inventory and marketing strategies to boost sales.
6. Variability:
Example: Weather data is collected from sensors, satellites, and weather stations.
The data can vary greatly due to changes in weather patterns, sensor malfunctions,
and varying data formats, showcasing variability.
7. Visualization:
Example: A city government uses data from traffic sensors and public transportation
systems to create interactive maps and dashboards. These visualizations help city
planners and residents understand traffic flow and make informed decisions about
commuting.
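The veracity and variability examples above (records arriving in varying formats, with occasional sensor malfunctions) can be illustrated with a minimal Python sketch. The record layout (`station`, `temp`, `unit`), the `normalise` helper, and the plausibility bounds are all assumptions made for illustration; they do not come from any real weather feed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    station: str
    celsius: float


def normalise(raw: dict) -> Optional[Reading]:
    """Convert one raw record to Celsius, or return None if it is unusable."""
    try:
        value = float(raw["temp"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or non-numeric temperature

    # Variability: the same quantity may arrive in different units.
    unit = str(raw.get("unit", "C")).upper()
    if unit == "F":
        value = (value - 32) * 5 / 9
    elif unit != "C":
        return None  # unknown unit, cannot be trusted

    # Veracity: reject physically implausible values.
    # These bounds are hypothetical, chosen only for this sketch.
    if not -90.0 <= value <= 60.0:
        return None

    return Reading(station=str(raw.get("station", "unknown")),
                   celsius=round(value, 1))


raw_records = [
    {"station": "A", "temp": "21.5", "unit": "C"},
    {"station": "B", "temp": 70.7, "unit": "F"},
    {"station": "C", "temp": 9999, "unit": "C"},  # sensor malfunction
    {"station": "D", "temp": "n/a"},              # unreadable value
]

clean = [r for r in map(normalise, raw_records) if r is not None]
for reading in clean:
    print(reading)
```

Only the records from stations A and B survive; the malfunctioning and unreadable readings are dropped before any analysis is attempted.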
Types of Big Data
• Structured data
• Unstructured data
• Semi-structured data
• Transactional data
• Machine data
• Spatial data
• Time-series data
• Open data
Types of Big Data examples
Structured Data:
Example: Data stored in databases, spreadsheets, and tables. This type of data is
organized and easily searchable by simple algorithms.
Sources: Relational databases, Excel files, transaction records.
Unstructured Data:
Example: Data that does not have a predefined data model or structure. This includes text,
images, videos, and social media posts.
Sources: Social media platforms, multimedia files, email messages.
Semi-Structured Data:
Example: Data that does not conform to a rigid structure but contains tags or markers to separate semantic
elements.
Sources: XML files, JSON documents, NoSQL databases.
Transactional Data:
Example: Data generated from daily operations and transactions, such as sales, purchases, and financial
operations.
Sources: Point-of-sale systems, online transaction processing systems.
Machine Data:
Example: Data generated by machines, sensors, and other devices, often used in IoT applications.
Sources: Log files, industrial equipment sensors, network devices.
Spatial Data:
Example: Data that represents the physical location and shape of objects, often used in
geographic information systems (GIS).
Sources: GPS data, satellite imagery, maps.
Time-Series Data:
Example: Data points indexed or organized by time, capturing data changes over time.
Sources: Financial market data, weather monitoring systems, IoT sensors.
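As a minimal sketch of what "indexed by time" means in practice, the Python snippet below orders a few out-of-order sensor readings by timestamp and then smooths them with a simple moving average. The timestamps, values, and the `moving_average` helper are made up for illustration.

```python
from datetime import datetime

# Hypothetical sensor readings, deliberately out of order.
readings = [
    ("2024-01-01T00:10:00", 12.0),
    ("2024-01-01T00:00:00", 10.0),
    ("2024-01-01T00:20:00", 14.0),
    ("2024-01-01T00:30:00", 16.0),
]

# Time-series data is only meaningful in time order, so sort by timestamp.
series = sorted((datetime.fromisoformat(ts), value) for ts, value in readings)


def moving_average(points, window):
    """Average each value with the (window - 1) values that precede it."""
    values = [value for _, value in points]
    return [
        (points[i][0], sum(values[i - window + 1 : i + 1]) / window)
        for i in range(window - 1, len(points))
    ]


smoothed = moving_average(series, window=2)
for timestamp, value in smoothed:
    print(timestamp.isoformat(), value)
```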
Open Data:
Example: Data that is freely available for anyone to use and share, often provided by
governments or public organizations.
Sources: Government databases, public research datasets, open-source projects.
Classification of Big Data
Big Data can be classified by several criteria:
• Structure
• Source (human-generated or machine-generated)
• Processing method
• Data source (internal or external)
• Usage
• Content
Based on Structure
1. Structured Data:
• Organized in fixed formats or schemas.
• Examples: Relational databases, spreadsheets.
2. Unstructured Data:
• Lacks a predefined format or organization.
• Examples: Text documents, social media posts, videos, images.
3. Semi-Structured Data:
• Does not conform to a strict structure but contains tags or markers to separate data elements.
• Examples: XML files, JSON documents, emails (structured headers with a free-text body).
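The three structural classes can be compared by looking at how a program reads each one. The Python sketch below uses only the standard library; the order data, field names, and the review text are invented for illustration.

```python
import csv
import io
import json
import re

# Structured: a fixed schema, every row has the same columns.
structured = "order_id,customer,amount\n1001,Asha,49.90\n"
rows = list(csv.DictReader(io.StringIO(structured)))
print(rows[0]["amount"])

# Semi-structured: keys act as markers, but records may differ in shape.
semi_structured = (
    '[{"order_id": 1001, "customer": "Asha", "amount": 49.90},'
    ' {"order_id": 1002, "customer": "Ben", "coupon": "SPRING"}]'
)
docs = json.loads(semi_structured)
amounts = [doc.get("amount") for doc in docs]  # a missing field becomes None
print(amounts)

# Unstructured: no schema at all, so meaning must be extracted from raw text.
unstructured = "Hi, my order 1001 arrived late and I paid 49.90 for it."
match = re.search(r"order (\d+)", unstructured)
order_id = int(match.group(1)) if match else None
print(order_id)
```

The further the data moves from a fixed schema, the more of the interpretation work shifts from the storage format to the processing code.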
Based on Source
1. Human-Generated Data:
• Created by people through their interactions with various systems.
• Examples: Social media content, emails, online reviews.
2. Machine-Generated Data:
• Produced by machines or systems without human intervention.
• Examples: Sensor data, log files, transactional data from automated
systems.
Based on Processing Method
1. Batch Processing Data:
• Accumulated over a period and processed together in large volumes at scheduled times.
• Examples: Payroll systems, large-scale data analysis tasks.
2. Real-Time Processing Data:
• Processed as it is generated, with minimal delay.
• Examples: Financial transactions, real-time monitoring systems.
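The difference between the two processing methods can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the transaction amounts are made up, and `batch_total` and `running_totals` are hypothetical helper names, not part of any framework.

```python
from typing import Iterable, Iterator, Sequence


def batch_total(transactions: Sequence[float]) -> float:
    """Batch style: wait until all records are collected, then process them together."""
    return sum(transactions)


def running_totals(stream: Iterable[float]) -> Iterator[float]:
    """Real-time style: update the result as each record arrives."""
    total = 0.0
    for amount in stream:
        total += amount
        yield total  # a result is available after every single record


day_of_transactions = [20.0, 5.0, 12.5]

# Batch: one answer, available only once the whole batch has been collected.
print(batch_total(day_of_transactions))

# Real-time: an up-to-date answer after each incoming transaction.
for total in running_totals(iter(day_of_transactions)):
    print(total)
```

Both approaches reach the same final total; what changes is when intermediate results become available.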
Based on Data Sources
1. Internal Data:
• Collected from within an organization.
• Examples: Sales records, employee information, internal
surveys.
2. External Data:
• Sourced from outside the organization.
• Examples: Market research data, social media feeds, public
datasets.
Based on Usage
1. Operational Data:
• Used for daily operations and transactions.
• Examples: Online transaction records, CRM data.
2. Analytical Data:
• Used for analysis and decision-making.
• Examples: Data warehouses, business intelligence reports.
Based on Content
Text Data:
• Comprises textual information.
• Examples: Documents, emails, web pages.
Multimedia Data:
• Includes various forms of media.
• Examples: Images, audio files, videos.
Time-Series Data:
• Sequential data points indexed in time order.
• Examples: Stock prices, sensor readings.
Geospatial Data:
• Related to geographic or spatial aspects.
• Examples: GPS data, maps.
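As one illustration of working with geospatial data such as GPS coordinates, the Python sketch below computes the great-circle distance between two latitude/longitude points with the haversine formula. It assumes a spherical Earth with a mean radius of about 6371 km, which is an approximation, and `haversine_km` is a name chosen for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate mean radius; assumes a spherical Earth


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Two hypothetical GPS fixes one degree of longitude apart on the equator.
print(haversine_km(0.0, 0.0, 0.0, 1.0))
```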
Sources of big data
1. Social Media Platforms:
• Examples: Facebook, Twitter, Instagram, LinkedIn.
• Data Types: Posts, comments, likes, shares, user profiles, multimedia content.
2. Sensor Data:
• Examples: IoT devices, weather stations, industrial equipment, smart meters.
• Data Types: Temperature readings, humidity levels, machinery status, energy
consumption.
3. Transactional Data:
• Examples: Online shopping platforms, banking transactions, point-of-sale (POS) systems.
• Data Types: Purchase records, payment histories, order details.
4. Machine-Generated Data:
• Examples: Server logs, application logs, network traffic data.
• Data Types: Error logs, access logs, event logs.
5. Public Data:
• Examples: Government databases, open data initiatives, public research datasets.
• Data Types: Census data, economic indicators, public health records.
6. Enterprise Data:
• Examples: Customer relationship management (CRM) systems, enterprise resource
planning (ERP) systems, internal surveys.
• Data Types: Customer profiles, sales data, inventory levels.
7. Multimedia Data:
• Examples: Video streaming services, online photo galleries, audio
libraries.
• Data Types: Videos, images, audio files.
8. Web Data:
• Examples: Websites, blogs, forums, e-commerce sites.
• Data Types: Web pages, blog posts, product reviews, user ratings.
9. Geospatial Data:
• Examples: GPS devices, satellite imagery, geographic information
systems (GIS).
• Data Types: Location coordinates, maps, satellite images.
10. Health Data:
• Examples: Electronic health records (EHRs), medical imaging, wearable
health devices.
• Data Types: Patient records, MRI scans, fitness tracker data.
11. Communication Data:
• Examples: Emails, text messages, call records.
• Data Types: Email content, SMS logs, call duration.
12. Scientific Data:
• Examples: Research experiments, laboratory results, astronomical
observations.
• Data Types: Experimental data, research findings, telescope images.
