07/26/2025 1
Department of Computer Science & Engineering (SB-ET)
III B.Tech - II Semester
DATA ANALYTICS
SUBJECT CODE: 22PCOAM21
Academic Year: 2024-2025
By
Dr. M. Gokilavani
GNITC
Department of CSE (SB-ET)
22PCOAM21 DATA ANALYTICS
UNIT – I
Syllabus
Data Management: Design data architecture and manage the data for
analysis; understand various sources of data such as sensors, signals, and GPS.
Data Management, Data Quality (noise, outliers, missing values, duplicate
data), and Data Preprocessing & Processing.
Course Prerequisites
1. Database Management Systems.
2. Knowledge of probability and statistics.
TEXTBOOKS:
• Student’s Handbook for Associate Analytics - II, III.
• Data Mining: Concepts and Techniques, Han and Kamber, 3rd Edition, Morgan Kaufmann
Publishers.
REFERENCES:
• Introduction to Data Mining, Tan, Steinbach, and Kumar, Addison-Wesley, 2006.
• Data Mining and Analysis: Fundamental Concepts and Algorithms, M. Zaki and W. Meira.
• Mining of Massive Datasets, Jure Leskovec (Stanford Univ.), Anand Rajaraman (Milliway Labs),
Jeffrey D. Ullman (Stanford Univ.).
No. of Hours Required: 13
UNIT - I LECTURE - 02
Data & Data Collection
• Data is a collection of measurements and facts.
• It is a tool that helps an individual or a group of individuals reach a sound
conclusion by providing them with some information.
• Data collection serves as the critical first step in this process, laying the
foundation for extracting meaningful insights from raw information.
• Whether the data is structured (such as numerical records) or unstructured
(such as text, audio, or video), organizations can transform raw data
into actionable knowledge.
• In big data analysis, “data collection” is the initial step, carried out before
analyzing the data for patterns or useful information.
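The difference between structured and unstructured data can be illustrated with a minimal Python sketch. The sensor readings, field names, and sample sentence below are hypothetical and are not taken from the course material; they only show that structured records can be queried by field, while unstructured text first needs some processing.

```python
import csv
import io

# Hypothetical structured data: every row has the same named fields.
structured_csv = "sensor_id,temperature_c\nS1,21.5\nS2,22.0\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Fields can be addressed by name, so a numeric summary is direct.
temperatures = [float(row["temperature_c"]) for row in rows]
mean_temperature = sum(temperatures) / len(temperatures)

# Hypothetical unstructured data: free text with no fixed fields.
unstructured_text = "The room felt warm in the morning and cooler by evening."

# Without fields, even a simple measure needs a processing step (splitting on whitespace).
word_count = len(unstructured_text.split())

print(mean_temperature)
print(word_count)
```

In practice, unstructured sources such as audio or video need far heavier processing than the whitespace split shown here.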
Data Collection
• The data that is collected is known as raw data. It is of little use in that form, but
once the impurities are cleaned out and the data is analyzed further, it forms information,
and the information obtained is known as “knowledge”.
• There are two types of data:
• Qualitative data
• Quantitative data
• Qualitative data is non-numerical data, such as words and sentences, and
mostly focuses on the behavior and actions of a group.
• Quantitative data is in numerical form and can be calculated using different
scientific tools and sampling.
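As a rough illustration of the points above, the following Python sketch cleans a small set of raw records and then labels each field as qualitative or quantitative. The survey records, the cleaning rules (drop rows with missing values and exact duplicates), and the helper names `clean` and `is_quantitative` are assumptions made for this example, not a method prescribed by the course.

```python
from statistics import mean

# Hypothetical raw survey records; None marks a missing value.
raw_records = [
    {"age": 21, "feedback": "Helpful course"},
    {"age": None, "feedback": "Too fast"},
    {"age": 23, "feedback": "Helpful course"},
    {"age": 21, "feedback": "Helpful course"},  # exact duplicate of the first record
]


def clean(records):
    """Drop records with missing values and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for record in records:
        if any(value is None for value in record.values()):
            continue
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


def is_quantitative(values):
    """Treat a field as quantitative when every value is a number."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


cleaned = clean(raw_records)

# Label each field by inspecting its values in the cleaned records.
field_types = {
    field: "quantitative" if is_quantitative([r[field] for r in cleaned]) else "qualitative"
    for field in cleaned[0]
}

# Only the quantitative field supports a numeric summary.
average_age = mean(r["age"] for r in cleaned)

print(field_types)
print(average_age)
```

Dropping incomplete rows is only one possible cleaning choice; other treatments of missing values and duplicates belong to the data quality topic.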
Data Collection
The collected data is further divided into two main types:
• Primary data
• Secondary data
Primary Data
• Data that is raw, original, and extracted directly from official sources is
known as primary data.
• This type of data is collected directly through techniques such as questionnaires,
interviews, and surveys.
• The data collected must match the demands and requirements of the target
audience on which the analysis is performed; otherwise it becomes a burden during data
processing.
• A few methods of collecting primary data:
• Interview method, e.g., telephone, face to face, email, etc.
• Survey method, e.g., text, audio, or video
• Observation method, e.g., results
• Experimental method, e.g., research and investigation
Secondary Data
• Secondary data is data that has already been collected and
is reused for some valid purpose.
• This type of data is derived from previously recorded primary data, and it has two
types of sources:
• Internal source
• External source
• Other sources include sensor data, satellite data, web traffic, etc.
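The primary/secondary split and the internal/external sources can be recorded as simple metadata on each dataset. The sketch below is one possible way to model this in Python; the class names, the validation rule, and the example dataset names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Collection(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class Source(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


@dataclass(frozen=True)
class Dataset:
    name: str
    collection: Collection
    source: Optional[Source] = None  # assumption: only recorded for secondary data

    def __post_init__(self):
        # Secondary data is reused data, so this sketch requires its source to be recorded.
        if self.collection is Collection.SECONDARY and self.source is None:
            raise ValueError("secondary data needs an internal or external source")


# Hypothetical datasets, one for each case.
datasets = [
    Dataset("student_survey", Collection.PRIMARY),
    Dataset("past_sales_records", Collection.SECONDARY, Source.INTERNAL),
    Dataset("published_census_tables", Collection.SECONDARY, Source.EXTERNAL),
]

secondary = [d.name for d in datasets if d.collection is Collection.SECONDARY]
external = [d.name for d in datasets if d.source is Source.EXTERNAL]

print(secondary)
print(external)
```

Keeping this provenance alongside the data makes it easier to judge later how far a dataset can be trusted and whether it fits the purpose of the analysis.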
Topics to be covered in the next session (Session 3)
• Data Quality (noise, outliers, missing values, duplicate data)
Thank you!!!