27/07/2025 20AD201 - DATA ANALYTICS - Mrs. C. Kavitha, AP/AI&DS
Presentation by
Mrs. C. Kavitha
Assistant Professor (OG) / AI & DS
SRI RAMAKRISHNA ENGINEERING COLLEGE
[Educational Service : SNR Sons Charitable Trust]
[Autonomous Institution, Reaccredited by NAAC with ‘A+’ Grade]
[Approved by AICTE and Permanently Affiliated to Anna University, Chennai]
[ISO 9001:2015 Certified and all Eligible Programmes Accredited by NBA]
VATTAMALAIPALAYAM, N.G.G.O. COLONY POST, COIMBATORE – 641 022.
20AD201 – DATA ANALYTICS
Department of Artificial Intelligence and Data Science
COURSE OUTCOMES
CO1: Understand the key concepts of data science and address the different sectors that use data science.
CO2: Examine the basic concepts of exploratory data analysis and feature selection algorithms.
CO3: Apply the data analytics process for feature generation and selection.
CO4: Analyze a complex dataset using a data visualization tool to inspire real-time projects.
CO5: Explore tools and practices for working with text and recommender systems.
20AD201 DATA ANALYTICS
MODULE I : INTRODUCTION TO DATA SCIENCE 10
Introduction to Data Science, Different Sectors using Data Science, Purpose and Components of Python in Data Science, Applications of Data Science, Data Science and Ethical Issues - Discussions on privacy, security, ethics - A look back at Data Science - Next-generation data scientists.
Textbooks and References
TEXTBOOKS
1. Joel Grus, “Data Science from Scratch”, Shroff Publishers / O’Reilly Media, 2019.
2. Cathy O’Neil and Rachel Schutt, “Doing Data Science: Straight Talk from the Frontline”, O’Reilly Media, 2013.
REFERENCES
1. Jure Leskovec, Anand Rajaraman and Jeffrey Ullman, “Mining of Massive Datasets”, Cambridge University Press, 2020.
2. João Moreira, Andre Carvalho, “A General Introduction to Data Analytics”, Wiley, 2018.
3. Jake VanderPlas, “Python Data Science Handbook”, O’Reilly Media, 2016.
4. Philipp Janert, “Data Analysis with Open Source Tools”, O’Reilly Media, 2010.
WEB REFERENCES
5. https://nptel.ac.in/courses/110/106/110106072/
6. https://www.coursera.org/learn/julia-programming
DATA
Data is a collection of facts, such as numbers, words,
measurements, observations or just descriptions of things.
TYPES OF DATA
QUANTITATIVE DATA
• Numerical information (numbers).
• Deals with things you can measure objectively: dimensions such as height, width, and length; temperature and humidity; prices; area and volume.
QUALITATIVE DATA
• Descriptive information: it is used to describe or categorize something.
• It deals with characteristics and descriptors that cannot be measured but can be observed subjectively.
• Example: smell, taste, texture, colour, etc.
QUANTITATIVE DATA: CONTINUOUS vs DISCRETE
CONTINUOUS
• It represents measurements.
• For example, height can be measured on a more precise scale: metres, centimetres, etc.
• Eg: Height = 165.5 cm, 155.8 cm, etc.
DISCRETE
• It represents items that can be counted.
• For example, the total number of students in a class.
• Eg: Total number of students = 25. It cannot be 22.5 or 20.3.
QUALITATIVE DATA: NOMINAL vs ORDINAL
NOMINAL
• The nominal scale is a naming scale, where variables are simply named or labelled.
• It is unordered.
• Eg: Name of your school, type of car, etc.
ORDINAL
• The ordinal scale goes beyond naming the variables: the values follow a specific order.
• It is ordered.
• Eg: Rating a restaurant on a scale from 0 (lowest) to 5 (highest).
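The four categories above can be made concrete in a few lines of Python. This is a minimal sketch that uses only the standard library; the sample values and variable names are made up for illustration and are not part of the original slides.

```python
from collections import Counter
from statistics import mean, median

# Quantitative, continuous: measurements that can take fractional values.
heights_cm = [165.5, 155.8, 172.0]

# Quantitative, discrete: counts that are always whole numbers.
students_per_class = [25, 30, 28]

# Qualitative, nominal: labels with no natural order.
car_types = ["sedan", "hatchback", "sedan", "suv"]

# Qualitative, ordinal: labels with a meaningful order (0 = lowest, 5 = highest).
restaurant_ratings = [4, 5, 3, 4, 2]

# Arithmetic such as the mean or a total is meaningful for quantitative data.
average_height = mean(heights_cm)
total_students = sum(students_per_class)

# Nominal data can only be counted; the most frequent label is the mode.
most_common_car, car_count = Counter(car_types).most_common(1)[0]

# Ordinal data can be sorted, so a median rating is meaningful.
median_rating = median(restaurant_ratings)

print(f"Average height: {average_height:.1f} cm")
print(f"Total students: {total_students}")
print(f"Most common car type: {most_common_car} ({car_count})")
print(f"Median rating: {median_rating}")
```

The choice of summary follows the data type: averaging car types would be meaningless, while counting them is fine.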
STRUCTURE OF BIG DATA
WHAT IS HAPPENING ON THE INTERNET?
DATA INFLATION
MAKING DATA WORK FOR YOU
Use data to better describe the present or better predict the
future
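As a rough illustration of the two uses of data named above, the sketch below first describes hypothetical monthly sales with an average and then predicts the next month with a straight-line fit. The data, the helper name `fit_line`, and the choice of ordinary least squares are assumptions made for this example.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data only).
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

# Describe the present: summarise what has already been observed.
average_sales = mean(sales)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


# Predict the future: extrapolate the fitted trend to month 6.
slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept

print(f"Average sales so far: {average_sales}")
print(f"Forecast for month 6: {forecast_month_6}")
```

A real forecast would also need to check how well a straight line fits the data before trusting the extrapolation.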
UNDERSTANDING DATA SCIENCE
DATA SCIENCE
• Data Science is the process of using data to find solutions to, or predict outcomes for, a problem statement.
DATA SCIENCE PROCESS
THE DATA SCIENCE STEPS
DATA SOURCES: where data comes from
Company data
• Collected by companies
• Helps them make data-driven decisions
Open data
• Free, open data sources
• Can be used, shared, and built on by anyone
COMPANY DATA
• Web data
• Survey data
• Customer data
• Logistics data
• Financial transactions
OPEN DATA
Open data is data that can be freely used, re-used and redistributed by anyone.
• Public APIs
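Public APIs usually return data as JSON over HTTP. The sketch below shows one way such data might be fetched and summarised with the standard library. The URL, the field name `pm25`, and the sample payload are hypothetical, and `fetch_json` is defined but not called here because it needs network access.

```python
import json
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration; substitute a real open-data API.
EXAMPLE_URL = "https://example.org/api/air-quality?city=Coimbatore"


def fetch_json(url, timeout=10):
    """Download a JSON document from a public API (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def average_reading(records, field):
    """Average a numeric field over a list of JSON records, skipping missing values."""
    values = [r[field] for r in records if r.get(field) is not None]
    return sum(values) / len(values) if values else None


# Stand-in for what such an API might return (made-up values).
sample_payload = json.loads(
    '[{"station": "A", "pm25": 40.0}, {"station": "B", "pm25": 60.0},'
    ' {"station": "C", "pm25": null}]'
)

print(average_reading(sample_payload, "pm25"))
```

With a real endpoint, `average_reading(fetch_json(url), field)` would replace the sample payload.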
OTHER DATA TYPES
• Image data
• Text data
• Network data
DATA STORAGE
• Cloud
• Big Data
DATA GENERATION
• Internet search
• Recommendation systems
Purpose And Components Of Python In Data Science
WHY PYTHON?
• Beginner-friendly syntax
• Rich ecosystem of libraries
• Strong community support
• Integration with databases, web, and cloud
• Rapid development and prototyping
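A small example may help show what "beginner-friendly syntax" and "rapid prototyping" look like in practice. The sketch below reads a made-up CSV table and summarises it using only the standard library; the column names and values are assumptions for illustration.

```python
import csv
import io
from statistics import mean

# Made-up CSV content standing in for a file on disk.
raw_csv = """name,height_cm
Asha,165.5
Ravi,155.8
Meena,172.0
"""

# csv.DictReader turns each row into a dictionary keyed by the header line.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# A list comprehension converts the text column to numbers in one line.
heights = [float(row["height_cm"]) for row in rows]

# max() with a key function finds the row with the largest height.
tallest = max(rows, key=lambda row: float(row["height_cm"]))["name"]

print(f"{len(rows)} rows, mean height {mean(heights):.1f} cm, tallest: {tallest}")
```

Libraries such as pandas offer the same steps with less code on larger datasets; the standard library is used here only to keep the example self-contained.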
Feature | Python | R | Java
Ease of Learning | ✅ Easy | Moderate | Hard
Visualization | ✅ Good | ✅ Excellent | Basic
ML/AI Support | ✅ Excellent | ✅ Good | ✅ Moderate
Community Support | ✅ Strong | ✅ Strong | ✅ Strong
Scalability | ✅ High | Moderate | ✅ High
APPLICATIONS
Recommendation
• Tracking customer spending habits and shopping behavior
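One simple way to turn tracked shopping behavior into recommendations is to count which items are bought together. The sketch below is a toy co-purchase recommender under that assumption; the baskets and the function name `recommend` are invented for illustration, and production systems use far richer models.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories (one basket per customer visit).
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]

# Count how often each ordered pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1


def recommend(item, top_n=2):
    """Return the items most often bought together with `item`."""
    related = Counter({b: n for (a, b), n in pair_counts.items() if a == item})
    return [name for name, _ in related.most_common(top_n)]


print(recommend("bread"))
```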
Smart Traffic System
• Data about traffic conditions on different roads is collected through cameras placed beside the road and at the entry and exit points of the city, and through GPS devices placed in vehicles.
Auto Driving Car
• Big data analysis helps drive a car without human intervention.
• Cameras and sensors placed at various spots on the car gather data such as the size of surrounding cars, obstacles, and the distance to them.
• This data is analyzed, and calculations are carried out, such as the angle to turn, what the speed should be, and when to stop.
• These calculations help the car take action automatically.
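The "when to stop" calculation can be illustrated with the standard stopping-distance formula: reaction distance plus braking distance, v·t + v²/(2a). The sketch below is a toy decision rule only; the reaction time, deceleration, safety margin, and sensor readings are assumed values, and a real vehicle relies on much more elaborate perception and control.

```python
def stopping_distance(speed_mps, reaction_time_s=1.0, deceleration_mps2=6.0):
    """Distance covered while reacting plus distance covered while braking."""
    reaction_distance = speed_mps * reaction_time_s
    braking_distance = speed_mps ** 2 / (2 * deceleration_mps2)
    return reaction_distance + braking_distance


def should_brake(speed_mps, obstacle_distance_m, safety_margin_m=5.0):
    """Brake when the obstacle is closer than the stopping distance plus a margin."""
    return obstacle_distance_m <= stopping_distance(speed_mps) + safety_margin_m


# Illustrative sensor readings: (speed in m/s, distance to obstacle in m).
readings = [(12.0, 100.0), (12.0, 25.0), (20.0, 50.0)]
for speed, distance in readings:
    action = "brake" if should_brake(speed, distance) else "continue"
    print(f"speed={speed} m/s, obstacle at {distance} m -> {action}")
```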
Image & Speech Recognition
Media and Entertainment Sector
• What types of video and music users watch and listen to most
• How long users spend on the site
Health Care
Forecasts
What is Data Ethics?
Data Ethics is the study and practice of responsible data collection,
storage, use, and sharing.
It addresses questions such as:
• Are we collecting data fairly?
• Who owns the data?
• Are our models causing harm?
Why Ethics in Data Science?
Data influences real-life decisions:
The systems we build using data can affect important parts of our lives
— like who gets a job, a loan, medical treatment, or even how long
someone stays in jail.
Bias in algorithms can be harmful:
If the data used to train these systems is biased or unfair, the outcomes
will also be unfair. For example, a hiring tool may favor one gender or
race over another without anyone realizing.
Wrong use of data breaks trust:
When companies or governments misuse personal data — like selling it
or using it without permission — people lose trust, and it can lead to
lawsuits, scandals, or even harm to individuals.
Ethical Questions in Data Science
• Did people agree to share their data?
Are users clearly told what data is being collected and why, and did they give permission knowingly?
• Are we open about how we use the data?
Do we clearly explain how the data will be used, who will have access, and what decisions will be made based on it?
• Is the system treating everyone fairly?
Are all groups, regardless of race, gender, or background, getting equal and unbiased outcomes from the AI or data system?
• Do people have control over their data?
Can users say "no" to being tracked or remove their data from the system if they want to?
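The fairness question above can be checked numerically. One common starting point, assumed here, is to compare the rate of positive decisions across groups (sometimes called demographic parity). The decisions below are made up, and a real audit would use several fairness measures, not just this one.

```python
from collections import defaultdict

# Made-up decisions from a hypothetical hiring model: (group, was_selected).
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]


def selection_rates(records):
    """Fraction of positive decisions for each group."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in records:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}


rates = selection_rates(decisions)

# Gap between the highest and lowest selection rate; 0 means equal rates.
parity_gap = max(rates.values()) - min(rates.values())

print(rates)
print(f"Selection-rate gap: {parity_gap:.2f}")
```

A large gap does not prove discrimination on its own, but it is a signal that the data and the model deserve a closer look.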
What is Data Privacy?
Privacy refers to the right of individuals to control their personal
data.
In data science, privacy involves:
• Limiting data access
• Using data only for intended purposes
• Protecting identity
Common Privacy Violations
• Collecting data without user consent
• Storing personal data without encryption
• Tracking users without informing them (e.g., cookies)
• Selling data to, or sharing it with, third parties without permission
Protecting Privacy
• Anonymization: Remove identifiers
• Data minimization: Collect only what's needed
• Encryption: Secure data at rest and in transit
• Privacy policies: Be transparent with users
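Two of the techniques listed above, removing identifiers and data minimization, can be sketched in a few lines. The example below is illustrative only: the record, the field names, and the set `NEEDED_FIELDS` are assumptions. Note that replacing an identifier with a salted hash is pseudonymization, which is weaker than full anonymization because records can still be linked to each other.

```python
import hashlib
import secrets

# Secret salt; in practice it would be stored separately from the data.
SALT = secrets.token_bytes(16)

# Fields the analysis actually needs (data minimization); an assumption for this example.
NEEDED_FIELDS = {"age", "city"}


def pseudonymize(value, salt=SALT):
    """Replace an identifier with a salted SHA-256 digest."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def protect(record):
    """Keep only the needed fields and swap the direct identifier for a pseudonym."""
    cleaned = {key: record[key] for key in NEEDED_FIELDS if key in record}
    cleaned["user_id"] = pseudonymize(record["email"])
    return cleaned


# Made-up record for illustration.
raw = {"name": "A. Student", "email": "student@example.org", "age": 20, "city": "Coimbatore"}
safe = protect(raw)
print(safe)
```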
What is Data Security?
• Protecting data from unauthorized access, theft, or damage
• Ensures confidentiality, integrity, and availability (CIA Triad)
Common Threats
• Hacking and phishing
• Weak authentication
• Insider threats (e.g., employees misusing access)
• Cloud misconfigurations
Best Practices for Data Security
• Use strong passwords and two-factor authentication
• Encrypt data and regularly back it up
• Assign role-based access to sensitive data
• Audit data usage and access logs
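Role-based access and audit logging, two of the practices above, can be sketched together. This is a minimal in-memory illustration; the roles, permissions, user name, and resource name are hypothetical, and a real system would rely on its platform's access-control and logging facilities.

```python
from datetime import datetime, timezone

# Hypothetical roles and the actions each may perform.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "admin": {"read", "write", "delete"},
}

# In-memory audit trail; a real system would write to tamper-resistant storage.
audit_log = []


def access(user, role, action, resource):
    """Allow the action only if the role permits it, and record every attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    })
    return allowed


print(access("priya", "analyst", "read", "patient_records"))
print(access("priya", "analyst", "delete", "patient_records"))
print(len(audit_log), "attempts logged")
```

Denied attempts are logged as well as allowed ones, since repeated denials are often the first sign of misuse.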
Data Science THEN vs NOW
Past | Present
Manual analysis | Automated ML pipelines
Small samples | Big data sets
Excel sheets | Python, Jupyter Notebooks
Local processing | Cloud-based systems
Who Are Next-Generation Data Scientists?
• They are the future data experts who won’t just write code but will also think, question, and act responsibly.
• They will solve real-life problems with data, not just build models for profit.
• They care about accuracy, fairness, and impact.
“A good data scientist doesn’t just predict the future — they help shape it wisely.”
Skills and Qualities They Need
• Strong in math, coding, and statistics
• Can prepare and clean messy data
• Follow every step carefully; don’t rush!
• Ask questions and test ideas like a scientist
• Use simple and honest models, not just complex ones
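Preparing and cleaning messy data, one of the skills listed above, often comes down to a few routine steps. The sketch below normalises whitespace and case, handles a missing value, and drops duplicates. The rows and the helper names `clean_row` and `clean` are made up for illustration.

```python
# Made-up survey rows with typical problems: stray spaces, mixed case,
# a missing value, and a duplicate.
raw_rows = [
    {"name": "  Asha ", "city": "coimbatore", "age": "21"},
    {"name": "Ravi", "city": "Chennai ", "age": ""},
    {"name": "asha", "city": "Coimbatore", "age": "21"},
]


def clean_row(row):
    """Normalise whitespace and case, and convert age to an integer or None."""
    age_text = row["age"].strip()
    return {
        "name": row["name"].strip().title(),
        "city": row["city"].strip().title(),
        "age": int(age_text) if age_text else None,
    }


def clean(rows):
    """Clean every row and drop exact duplicates while keeping the original order."""
    seen, result = set(), []
    for row in map(clean_row, rows):
        key = (row["name"], row["city"], row["age"])
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


cleaned_rows = clean(raw_rows)
print(cleaned_rows)
```

The missing age is kept as `None` so that later steps can decide how to handle it, in place of a guessed value.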
Current Gen vs. Next-Gen Data Scientists
Aspect | Current Generation | Next-Generation
Goal | Deliver fast results for business needs | Solve problems responsibly for long-term impact
Approach | Often rush to apply trendy algorithms | Take time to understand data and the problem
Data Handling | Focus mostly on model building | Spend more time on data cleaning & understanding
Skills | Technical coding and ML skills | Technical + soft skills + ethical thinking
Mindset | Task-oriented (complete the job) | Curious, careful, and scientific thinker
Model Use | Use black-box models without full understanding | Prefer interpretable and fair models
Ethics | Often overlooked in the rush to build | Core part of decision-making
Thank You

More Related Content

PPTX
FDS_dept_ppt.pptx
PPT
PDS Unit - 1 Introdiction to DS.ppt
PPTX
Unit 1-FDS. .pptx
PPTX
Chapter 2.pptx emerging technology data science
PPTX
Short term internship project report on power Bi
PPTX
CSE3038_Module1 - updated v1.1bvjchcghvkhvjkvjvkjvh.pptx
PDF
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
PDF
Introduction to Data Analytics and data analytics life cycle
FDS_dept_ppt.pptx
PDS Unit - 1 Introdiction to DS.ppt
Unit 1-FDS. .pptx
Chapter 2.pptx emerging technology data science
Short term internship project report on power Bi
CSE3038_Module1 - updated v1.1bvjchcghvkhvjkvjvkjvh.pptx
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf
Introduction to Data Analytics and data analytics life cycle

Similar to Introduction to Data Analytics and Data Science (20)

PDF
Data science
PPTX
Data science fullOCS353 UNIT 1 UPDATED.pptx
PDF
Data Science & Big Data - Theory.pdf
PDF
Introduction to Data Science - Fundamentals
PPTX
Data Science_Unit-1.2 part - 2 of intro.pptx
PDF
Data Science & AI Road Map by Python & Computer science tutor in Malaysia
PPTX
Chapter 1 Introduction to Datascience (1).pptx
PDF
00-01 DSnDA.pdf
PPTX
Introduction to Data Science.pptx
DOCX
Course outline
PPTX
Best Data Science Course in Rohini, BY DICS
PPTX
Data science Nagarajan and madhav.pptx
PPTX
Chapter 2 - EMTE.pptx
PPTX
Introduction to Data Science and Analytics
PPTX
DILEEP DATA SCIERNCES PROJECT POWERPOINT PPT
PPTX
DA DS traning.pptx. Data Science is marking its graph on a high note by expan...
PPTX
Data Science and Analytics Lesson 1.pptx
PPTX
Chapter 2- Data Science and big data.pptx
PPTX
Introduction to Data Science.pptx
PPTX
Data Science presentation for explanation of numpy and pandas
Data science
Data science fullOCS353 UNIT 1 UPDATED.pptx
Data Science & Big Data - Theory.pdf
Introduction to Data Science - Fundamentals
Data Science_Unit-1.2 part - 2 of intro.pptx
Data Science & AI Road Map by Python & Computer science tutor in Malaysia
Chapter 1 Introduction to Datascience (1).pptx
00-01 DSnDA.pdf
Introduction to Data Science.pptx
Course outline
Best Data Science Course in Rohini, BY DICS
Data science Nagarajan and madhav.pptx
Chapter 2 - EMTE.pptx
Introduction to Data Science and Analytics
DILEEP DATA SCIERNCES PROJECT POWERPOINT PPT
DA DS traning.pptx. Data Science is marking its graph on a high note by expan...
Data Science and Analytics Lesson 1.pptx
Chapter 2- Data Science and big data.pptx
Introduction to Data Science.pptx
Data Science presentation for explanation of numpy and pandas
Ad

Recently uploaded (20)

PPTX
Business Ppt On Nestle.pptx huunnnhhgfvu
PDF
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
PDF
Clinical guidelines as a resource for EBP(1).pdf
PDF
Foundation of Data Science unit number two notes
PPTX
Introduction to machine learning and Linear Models
PPTX
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
PDF
Business Analytics and business intelligence.pdf
PDF
.pdf is not working space design for the following data for the following dat...
PPTX
Qualitative Qantitative and Mixed Methods.pptx
PDF
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
PPTX
Database Infoormation System (DBIS).pptx
PDF
Mega Projects Data Mega Projects Data
PPTX
mbdjdhjjodule 5-1 rhfhhfjtjjhafbrhfnfbbfnb
PPTX
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
PPTX
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
PPTX
IBA_Chapter_11_Slides_Final_Accessible.pptx
PPTX
climate analysis of Dhaka ,Banglades.pptx
PPTX
Introduction to Knowledge Engineering Part 1
PDF
Galatica Smart Energy Infrastructure Startup Pitch Deck
PPTX
Business Acumen Training GuidePresentation.pptx
Business Ppt On Nestle.pptx huunnnhhgfvu
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
Clinical guidelines as a resource for EBP(1).pdf
Foundation of Data Science unit number two notes
Introduction to machine learning and Linear Models
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
Business Analytics and business intelligence.pdf
.pdf is not working space design for the following data for the following dat...
Qualitative Qantitative and Mixed Methods.pptx
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
Database Infoormation System (DBIS).pptx
Mega Projects Data Mega Projects Data
mbdjdhjjodule 5-1 rhfhhfjtjjhafbrhfnfbbfnb
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
IBA_Chapter_11_Slides_Final_Accessible.pptx
climate analysis of Dhaka ,Banglades.pptx
Introduction to Knowledge Engineering Part 1
Galatica Smart Energy Infrastructure Startup Pitch Deck
Business Acumen Training GuidePresentation.pptx
Ad

Introduction to Data Analytics and Data Science

  • 1. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 1 Presentation by Mrs. C. Kavitha Assistant Professor(OG)/AI & DS SRI RAMAKRISHNA ENGINEERING COLLEGE [Educational Service : SNR Sons Charitable Trust] [Autonomous Institution, Reaccredited by NAAC with ‘A+’ Grade] [Approved by AICTE and Permanently Affiliated to Anna University, Chennai] [ISO 9001:2015 Certified and all Eligible Programmes Accredited by NBA] VATTAMALAIPALAYAM, N.G.G.O. COLONY POST, COIMBATORE – 641 022. 20AD201 – DATA ANALYTICS Department of Artificial Intelligence and Data Science
  • 2. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 2 COURSE OUTCOME CO1: Understand the key concepts of data science and address the different sectors used in data science. CO2: Examine the basic concepts of exploratory data analysis and feature selection algorithms. CO3: Apply data analytics process for feature generation and selection. CO4: Analyze a complex dataset using data visualization tool to inspire real-time projects. CO5: Explore tools and practices for working with text and recommender system.
  • 3. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 3 20AD201 DATA ANALYTICS Introduction to Data Science, Different Sectors using Data science, Purpose and Components of Python in Data Science. Applications of Data Science, Data Science and Ethical Issues- Discussions on privacy, security, ethics- A look back at Data Science- Next-generation data scientists. MODULE I : INTRODUCTION TO DATA SCIENCE 10
  • 4. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 4 Textbooks and References TEXTBOOKS 1. Joel Grus, “Data Science from Scratch”, Shroff Publisher /O’Reilly Publisher Media,2019. 2. Cathy O’Neil and Rachel Schutt, “Doing Data Science, Straight Talk from The Frontline”, O’Reilly Publisher Media, 2013. REFERENCES 1. Jure Leskovek, Anand Rajaraman and Jeffrey Ullman, “Mining of Massive Datasets”, Cambridge University Press, 2020. 2. João Moreira, Andre Carvalho,”A General Introduction to Data Analytics”, Wiley Publisher,2018. 3. Jake VanderPlas, “Python Data Science Handbook”, O’Reilly Publisher Media, 2016. 4. Philipp Janert, “Data Analysis with Open Source Tools”, O’Reilly Publisher Media, 2010. WEB REFERENCE 5. https://nptel.ac.in/courses/110/106/110106072/ 6. https://www.coursera.org/learn/julia-programming
  • 5. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 DATA Data is a collection of facts, such as numbers, words, measurements, observations or just descriptions of things. 5
  • 6. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 TYPES OF DATA 6
  • 7. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 7 QUANTITATIVE DATA  Numerical information(numbers)  Deals with numbers and things you can measure objectively: dimensions such as height, width, and length, temperature and humidity, prices, area and volume.
  • 8. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 8 QUALITATIVEDATA  Descriptive information: It is used to describe or categorize something  It deals with characteristics and descriptors which cannot be measured but can be observed subjectively.  Example: Smells, taste, texture, colour etc
  • 9. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 9 QUANTITATIVEDATA CONTINUOUS DISCRETE • It represents measurement • It represents items that can be counted • For example, height can be measured in more precise scale: meters, centimetres etc • Forexample, total number of students in a class • Eg: Height= 165.5 cm, 155.8 cm etc • Eg: Total number of students = 25. It cannot be 22.5 or 20.3 etc
  • 10. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 10 QUALITATIVEDATA NOMINAL ORDINAL • Nominal scale is a naming scale where variables are simply named or labelled. • Ordinal scale, just beyond naming the variables and it follows a specific order. • It is unordered. • It is ordered. • Eg:Name of your school, type of car etc. • Eg:Rating a restaurant on a scale from 0(lowest) to 5(highest).
  • 11. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 11 Structureofbigdata
  • 12. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 WHAT IS HAPPENING IN INTERNET? 12
  • 13. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 DATA INFLATION 13
  • 14. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 MAKING DATA WORK FOR YOU Use data to better describe the present or better predict the future 14
  • 15. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 15
  • 16. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 UNDERSTANDIN G DATA SCIENCE 16
  • 17. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 UNDERSTANDIN G DATA SCIENCE 17
  • 18. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 UNDERSTANDIN G DATA SCIENCE 18
  • 19. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 UNDERSTANDIN G DATA SCIENCE 19
  • 20. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 UNDERSTANDIN G DATA SCIENCE 20
  • 21. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 UNDERSTANDING DATA SCIENCE 21
  • 22. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 22 DATA SCIENCE  Data Science is the process of using data to find solutions to predict outcomes for a problem statement.
  • 23. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 DATA SCIENCE PROCESS 23
  • 24. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 DATA SCIENCE PROCESS 24
  • 25. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 THE DATA SCIENCE STEPS 25
  • 26. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 DATA SOURCES From Where data comes 26
  • 27. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 DATA SOURCES Company data • Collected by companies • Helps them make data-driven decisions Open data • Free, open data sources • Can be used, shared, and built-on by anyone 27
  • 28. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 28 COMPANY DATA WEB DATA SURVEY DATA CUSTOMER DATA LOGISTICS DATA FINANCIAL TRANSACTIONS
  • 29. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 29 OPEN DATA Open data is data that can be freely used, re-used and redistributed by anyone Public APIs Public APIs
  • 30. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 OTHER DATA TYPES Image data 30
  • 31. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 31 OTHER DATATYPES Text data
  • 32. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 OTHER DATA TYPES Network data 32
  • 33. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 33 DATA STORAGE Cloud Big Data
  • 34. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 34 DATA GENERATION Internet Search
  • 35. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 DATA GENERATION Recommendation Systems 35
  • 36. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 36 20AD201 DATA ANALYTICS Introduction to Data Science, Different Sectors using Data science, Purpose and Components of Python in Data Science. Applications of Data Science, Data Science and Ethical Issues- Discussions on privacy, security, ethics- A look back at Data Science- Next-generation data scientists. MODULE I : INTRODUCTION TO DATA SCIENCE 10
  • 37. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 37 Purpose And Components Of Python In Data Science  Beginner-friendly syntax  Rich ecosystem of libraries  Strong community support  Integration with databases, web, and cloud  Rapid development and prototyping WHY PYTHON?
  • 38. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 38 Purpose And Components Of Python In Data Science Feature Python R Java Ease of Learning ✅ Easy Moderate Hard Visualization ✅ Good ✅ Excellent Basic ML/AI Support ✅ Excellent ✅ Good ✅ Moderate Community Support ✅ Strong ✅ Strong ✅ Strong Scalability ✅ High Moderate ✅ High
  • 39. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 39 APPLICATIONS Recommendation Tracking Customer Spending Habit, Shopping Behavior
  • 40. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 40 APPLICATIONS Data about the condition of the traffic of different road, collected through camera kept beside the road, at entry and exit point of the city, GPS device placed in the vehicle. Smart Traffic system
  • 41. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 41 APPLICATIONS  Big data analysis helps drive a car without human interpretation.  In the various spot of car camera, a sensor placed, that gather data like the size of the surrounding car, obstacle, distance from those, etc.  These data are being analyzed, then various calculation like how many angles to rotate, what should be speed, when to stop, etc carried out.  These calculations help to take action automatically. Auto Driving Car
  • 42. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 APPLICATIONS Image & Speech Recognition 42
  • 43. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 43 APPLICATIONS  what type of video, music users are watching, listening most  how long users are spending on site Media and entertainment sector
  • 44. 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 27/07/2025 APPLICATIONS Health Care 44
  • 45. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 45 APPLICATIONS Forecasts
  • 46. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 46 20AD201 DATA ANALYTICS Introduction to Data Science, Different Sectors using Data science, Purpose and Components of Python in Data Science. Applications of Data Science, Data Science and Ethical Issues- Discussions on privacy, security, ethics- A look back at Data Science- Next-generation data scientists. MODULE I : INTRODUCTION TO DATA SCIENCE 10
  • 47. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 47 What is Data Ethics? Data Ethics is the study and practice of responsible data collection, storage, use, and sharing. It addresses questions such as: • Are we collecting data fairly? • Who owns the data? • Are our models causing harm?
  • 48. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 48 Why Ethics in Data Science? Data influences real-life decisions: The systems we build using data can affect important parts of our lives — like who gets a job, a loan, medical treatment, or even how long someone stays in jail. Bias in algorithms can be harmful: If the data used to train these systems is biased or unfair, the outcomes will also be unfair. For example, a hiring tool may favor one gender or race over another without anyone realizing. Wrong use of data breaks trust: When companies or governments misuse personal data — like selling it or using it without permission — people lose trust, and it can lead to lawsuits, scandals, or even harm to individuals.
  • 49. 27/07/2025 20AD201 - DATAANALYTICS - Mrs. C. Kavitha, AP/AI&DS 49 Ethical Questions in Data Science  Did people agree to share their data? Are users clearly told what data is being collected and why — and did they give permission knowingly?  Are we open about how we use the data? Do we clearly explain how the data will be used, who will have access, and what decisions will be made based on it?  Is the system treating everyone fairly? Are all groups — regardless of race, gender, or background — getting equal and unbiased outcomes from the AI or data system?  Do people have control over their data? Can users say "no" to being tracked or remove their data from the system if they want to?
  • 50. What is Data Privacy? Privacy refers to the right of individuals to control their personal data. In data science, privacy involves:
      • Limiting data access
      • Using data only for intended purposes
      • Protecting identity
  • 51. Common Privacy Violations
      • Collecting data without user consent
      • Storing personal data without encryption
      • Tracking users without informing them (e.g., cookies)
      • Selling or sharing data with third parties without permission
  • 52. Protecting Privacy
      • Anonymization: Remove identifiers
      • Data minimization: Collect only what's needed
      • Encryption: Secure data at rest and in transit
      • Privacy policies: Be transparent with users
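Two of the techniques above, removing identifiers and data minimization, can be sketched with the Python standard library. This is an illustrative sketch under stated assumptions: the field names, the `NEEDED_FIELDS` set, and the secret key are hypothetical. Note that replacing an identifier with a keyed hash is pseudonymization, which is weaker than full anonymization because anyone holding the key can link records back to a person.

```python
import hashlib
import hmac

# Hypothetical: the only fields the analysis actually needs.
NEEDED_FIELDS = {"age", "city"}

# Hypothetical key; in practice it would be stored securely, not in code.
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def minimize(record):
    """Keep only the needed fields and swap the email for a pseudonym."""
    cleaned = {k: v for k, v in record.items() if k in NEEDED_FIELDS}
    cleaned["user_ref"] = pseudonymize(record["email"])
    return cleaned

# Hypothetical raw record containing direct identifiers.
raw = {
    "name": "Asha",
    "email": "asha@example.com",
    "phone": "0000000000",
    "age": 29,
    "city": "Coimbatore",
}

safe = minimize(raw)
print(sorted(safe))
```

The same input always maps to the same `user_ref`, so records can still be joined for analysis without storing the name, email, or phone number.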
  • 53. What is Data Security?
      • Protecting data from unauthorized access, theft, or damage
      • Ensures confidentiality, integrity, and availability (CIA Triad)
    Common Threats
      • Hacking and phishing
      • Weak authentication
      • Insider threats (e.g., employees misusing access)
      • Cloud misconfigurations
  • 54. Best Practices for Data Security
      • Use strong passwords and two-factor authentication
      • Encrypt data and regularly back it up
      • Assign role-based access to sensitive data
      • Audit data usage and access logs
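The last two practices, role-based access and auditing, can be illustrated together. The sketch below is a simplified in-memory example, not a production access-control system: the roles, the permission names, and the `request_access` function are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical mapping from role to the actions that role may perform.
ROLE_PERMISSIONS = {
    "analyst": {"read_aggregates"},
    "admin": {"read_aggregates", "read_raw"},
}

# In-memory audit trail; a real system would write to durable storage.
audit_log = []

def request_access(user, role, action):
    """Check a role-based permission and record the attempt in the audit log."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "allowed": allowed,
    })
    return allowed

print(request_access("meena", "admin", "read_raw"))
print(request_access("ravi", "analyst", "read_raw"))
print(len(audit_log))
```

Denied attempts are logged as well as granted ones, since repeated denials are often the first visible sign of misuse.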
  • 57. Data Science THEN vs NOW
      | Past | Present |
      | --- | --- |
      | Manual analysis | Automated ML pipelines |
      | Small samples | Big data sets |
      | Excel sheets | Python, Jupyter Notebooks |
      | Local processing | Cloud-based systems |
  • 59. Who Are Next-Generation Data Scientists?
      • They are the future data experts who won’t just write code but will also think, question, and act responsibly.
      • They will solve real-life problems with data, not just build models for profit.
      • They care about accuracy, fairness, and impact.
    “A good data scientist doesn’t just predict the future; they help shape it wisely.”
  • 60. Skills and Qualities They Need
      • Strong in math, coding, and statistics
      • Can prepare and clean messy data
      • Follow every step carefully and don’t rush
      • Ask questions and test ideas like a scientist
      • Use simple and honest models, not just complex ones
  • 61. Current Gen vs. Next-Gen Data Scientists
      | Aspect | Current Generation | Next-Generation |
      | --- | --- | --- |
      | Goal | Deliver fast results for business needs | Solve problems responsibly for long-term impact |
      | Approach | Often rush to apply trendy algorithms | Take time to understand data and the problem |
      | Data Handling | Focus mostly on model building | Spend more time on data cleaning & understanding |
      | Skills | Technical coding and ML skills | Technical + soft skills + ethical thinking |
      | Mindset | Task-oriented (complete the job) | Curious, careful, and scientific thinker |
      | Model Use | Use black-box models without full understanding | Prefer interpretable and fair models |
      | Ethics | Often overlooked in the rush to build | Core part of decision-making |
  • 62. Thank You