Data Science
Muhammad Suleman Memon
Assistant Professor
Department of Information Technology,
Dadu Campus,
University of Sindh
What is Data Science?
Data science is the domain of study that deals with vast volumes of data.
Its goal is to find unseen patterns, derive meaningful information, and make business decisions.
Data science uses complex machine learning algorithms to build predictive models.
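To make "predictive model" concrete, here is a minimal sketch of one of the simplest machine learning algorithms, k-nearest neighbours, written in plain Python. It is an illustration rather than the method the slides have in mind: the training data, the feature meanings, and the function name knn_predict are all invented for this example.

```python
from collections import Counter
from math import dist

# Hypothetical training data: (hours_studied, classes_attended) -> outcome.
# The numbers are invented purely for illustration.
TRAINING = [
    ((1.0, 2.0), "fail"),
    ((2.0, 1.0), "fail"),
    ((2.5, 3.0), "fail"),
    ((6.0, 8.0), "pass"),
    ((7.0, 9.0), "pass"),
    ((8.0, 7.5), "pass"),
]


def knn_predict(point, training=TRAINING, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(training, key=lambda item: dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((7.5, 8.0)))
print(knn_predict((1.5, 2.5)))
```

The model "learns" nothing more than the stored examples, yet it already predicts an outcome for a point it has never seen, which is the core idea behind predictive modeling.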
Data Science Applications
Sources of the Data
Data Science Lifecycle
Prerequisites for Data Science
1. Machine Learning
2. Modeling
3. Statistics
4. Programming
5. Databases
Who Oversees the Data Science Process?
• Business Managers
• Collaborate with the data science team to characterize the problem and establish an analytical method.
• IT Managers
• Develop the infrastructure and architecture that enable data science activities.
• Data Science Managers
• Supervise the working procedures of all data science team members.
• Manage and keep track of the day-to-day activities of the three data science teams.
What is a Data Scientist?
Data scientists are professionals who have the technical ability to handle complicated issues as well as the desire to investigate what questions need to be answered.
They're a mix of mathematicians, computer scientists, and trend forecasters.
They're also in high demand and well-paid because they work in both the business and IT sectors.
On a daily basis, a data scientist may do the following tasks:
• Discover patterns and trends in datasets to get insights.
• Create forecasting algorithms and data models.
• Improve the quality of data or product offerings by utilising machine learning techniques.
• Distribute suggestions to other teams and top management.
• Use data tools such as R, SAS, Python, or SQL in data analysis.
• Stay on top of innovations in the data science field.
What Does a Data Scientist Do?
1. Determine the problem.
2. Determine the correct set of variables and datasets.
3. Gather structured and unstructured data from many sources.
4. Convert raw data into a suitable format.
5. Apply ML algorithms.
6. Interpret the data to find opportunities and solutions.
7. Prepare the results and insights to share with stakeholders.
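The steps on this slide can be sketched end to end in a few lines of plain Python. Everything below is hypothetical: the two data sources, the field names, and the values are invented, and a simple mean-based threshold stands in for a real ML algorithm.

```python
import csv
import io
from statistics import mean

# Gather: two hypothetical sources (values are invented for illustration).
CSV_SOURCE = """customer,monthly_spend,visits
alice,120.5,8
bob,,3
carol,40.0,2
"""
API_SOURCE = [
    {"customer": "dave", "monthly_spend": "95.0", "visits": "7"},
    {"customer": "erin", "monthly_spend": "15.5", "visits": "n/a"},
]


def to_record(row):
    """Convert one raw row to typed values, or return None if it is unusable."""
    try:
        return {
            "customer": row["customer"],
            "monthly_spend": float(row["monthly_spend"]),
            "visits": int(row["visits"]),
        }
    except (TypeError, ValueError):
        return None


# Convert raw data into a suitable format, dropping rows with missing values.
raw_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE))) + API_SOURCE
records = [r for r in map(to_record, raw_rows) if r is not None]

# Apply a model (stand-in for an ML algorithm): a threshold at the mean spend.
threshold = mean(r["monthly_spend"] for r in records)
segments = {
    r["customer"]: "high value" if r["monthly_spend"] > threshold else "standard"
    for r in records
}

# Interpret and report the result for stakeholders.
for customer, segment in segments.items():
    print(f"{customer}: {segment}")
```

In practice most of the effort tends to go into the gathering and conversion steps; the sketch reflects that by spending more lines on cleaning than on the model itself.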
Why Become a Data Scientist?
• According to Glassdoor and Forbes, demand for data scientists will increase by 28 percent by 2026, which speaks to the profession’s durability and longevity. If you want a secure career, data science offers you that chance.
Use of Data Science
1. Data science can detect patterns in seemingly unstructured or unconnected data, allowing conclusions and predictions to be made.
2. Tech businesses that acquire user data can use data science to transform that data into valuable or profitable information.
3. Data science has also made inroads into the transportation industry, such as with driverless cars.
4. Data science applications provide a better level of therapeutic customization through genetics and genomics research.
Data Scientist
Job role: Data scientists determine what the problem is, what questions need answers, and where to find the data. They also mine, clean, and present the relevant data.
Skills needed: Programming skills (SAS, R, Python), storytelling and data visualization, statistical and mathematical skills, and knowledge of Hadoop, SQL, and machine learning.
Data Analyst
Job role: Analysts bridge the gap between the data scientists and the business analysts, organizing and analyzing data to answer the questions the organization poses. They take the technical analyses and turn them into qualitative action items.
Skills needed: Statistical and mathematical skills, programming skills (SAS, R, Python), plus experience in data wrangling and data visualization.
Data Engineer
Job role: Data engineers focus on developing, deploying, managing, and optimizing the organization’s data infrastructure and data pipelines. Engineers support data scientists by helping to transfer and transform data for queries.
Skills needed: NoSQL databases (e.g., MongoDB, Cassandra), programming languages such as Java and Scala, and frameworks such as Apache Hadoop.
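The phrase "transfer and transform data for queries" describes what is often called an extract-transform-load (ETL) pipeline. The sketch below shows the idea at toy scale, using Python's built-in sqlite3 module as a stand-in for the NoSQL and Hadoop systems named on the slide. The table, column names, and rows are invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the column names and values are invented.
RAW_CSV = """order_id,city,amount
1, karachi ,250.00
2,Hyderabad,100.50
3,KARACHI,75.25
"""


def extract(text):
    """Read raw rows from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Normalize city names and convert types so queries behave consistently."""
    return [
        (int(r["order_id"]), r["city"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Write the cleaned rows into a table that analysts can query."""
    conn.execute("CREATE TABLE orders (order_id INTEGER, city TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# A query a data scientist might run once the data is in a clean table.
totals = conn.execute(
    "SELECT city, SUM(amount) FROM orders GROUP BY city ORDER BY city"
).fetchall()
print(totals)
```

Without the transform step, " karachi " and "KARACHI" would be counted as two different cities, which is the kind of problem the data engineer's pipeline is meant to remove before analysis begins.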
Data Science Tools
Data Analysis: SAS, Jupyter, RStudio, MATLAB, Excel, RapidMiner
Data Warehousing: Informatica/Talend, AWS Redshift
Data Visualization: Jupyter, Tableau, Cognos, RAW
Machine Learning: Spark MLlib, Mahout, Azure ML Studio
Difference Between Business Intelligence and Data Science
BUSINESS INTELLIGENCE | DATA SCIENCE
Uses structured data | Uses both structured and unstructured data
Analytical in nature: provides a historical report of the data | Scientific in nature: performs an in-depth statistical analysis of the data
Uses basic statistics with emphasis on visualization (dashboards, reports) | Leverages more sophisticated statistical and predictive analysis and machine learning (ML)
Compares historical data to current data to identify trends | Combines historical and current data to predict future performance and outcomes
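The last row of the comparison can be illustrated with a small sketch: the BI-style view summarizes what has already happened, while the data-science-style view fits a trend and extrapolates it. The sales figures are invented, and the helper fit_line is a hypothetical name for an ordinary least-squares line fit.

```python
from statistics import mean

# Hypothetical yearly sales figures, invented for illustration.
years = [2019, 2020, 2021, 2022, 2023]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

# BI-style view: summarize what has already happened.
average_sales = mean(sales)
growth_last_year = sales[-1] - sales[-2]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


# Data-science-style view: use past and current data to predict the future.
slope, intercept = fit_line(years, sales)
forecast_2024 = slope * 2024 + intercept

print(f"Historical average: {average_sales}")
print(f"Change in the last year: {growth_last_year}")
print(f"Forecast for 2024: {forecast_2024:.1f}")
```

Both views use the same five numbers; the difference lies in whether the result describes the past or makes a claim about a year that has not been observed yet.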
Applications of Data Science
1. Healthcare
2. Gaming
3. Image Recognition
4. Recommendation Systems
5. Logistics
6. Fraud Detection
7. Internet Search
8. Speech Recognition
9. Targeted Advertising
10. Airline Route Planning
11. Augmented Reality
Programming Language for Data Science
Python
Fundamental Python Libraries for Data Scientists
NumPy
SciPy
pandas
scikit-learn
IDE
PyCharm
Getting Started
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Getting Started
# Assumes pandas has been imported as pd (see the previous slide).
data = {
    'year': [2010, 2011, 2012, 2010, 2011, 2012, 2010, 2011, 2012],
    'team': ['FCBarcelona', 'FCBarcelona', 'FCBarcelona',
             'RMadrid', 'RMadrid', 'RMadrid',
             'ValenciaCF', 'ValenciaCF', 'ValenciaCF'],
    'wins': [30, 28, 32, 29, 32, 26, 21, 17, 19],
    'draws': [6, 7, 4, 5, 4, 7, 8, 10, 8],
    'losses': [2, 3, 2, 4, 2, 5, 9, 11, 11],
}
# Build a table with one row per team and season, in the given column order.
football = pd.DataFrame(data, columns=['year', 'team', 'wins', 'draws', 'losses'])
Output
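The output slide is an image, so here is a self-contained sketch that rebuilds the same DataFrame and prints it. The two follow-up steps go beyond the slides: the points column assumes 3 points per win and 1 per draw, and the per-team total is one example of a question the table can answer.

```python
import pandas as pd

data = {
    "year": [2010, 2011, 2012, 2010, 2011, 2012, 2010, 2011, 2012],
    "team": ["FCBarcelona", "FCBarcelona", "FCBarcelona",
             "RMadrid", "RMadrid", "RMadrid",
             "ValenciaCF", "ValenciaCF", "ValenciaCF"],
    "wins": [30, 28, 32, 29, 32, 26, 21, 17, 19],
    "draws": [6, 7, 4, 5, 4, 7, 8, 10, 8],
    "losses": [2, 3, 2, 4, 2, 5, 9, 11, 11],
}
football = pd.DataFrame(data, columns=["year", "team", "wins", "draws", "losses"])

# Printing the DataFrame shows nine rows, one per team and season.
print(football)

# Assumption (not in the slides): 3 points per win, 1 point per draw.
football["points"] = 3 * football["wins"] + football["draws"]

# Total wins per team across the three seasons.
wins_by_team = football.groupby("team")["wins"].sum()
print(wins_by_team)
```

Operations such as the points calculation apply to the whole column at once, which is why pandas code rarely needs explicit loops over rows.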
Read CSV
• import pandas as pd
• mydata = pd.read_csv('data.csv')
First Five Rows
• mydata.head()
Last Five Rows
• mydata.tail()
Show Statistical Information
• mydata.describe()
Selecting Data
• mydata['column']
Subset of Rows
• mydata[5:10]
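The calls on the last few slides can be tried together in one short script. Since data.csv is not provided with the slides, this sketch reads from an in-memory string instead; the columns id and score and their values are invented for illustration.

```python
import io

import pandas as pd

# Stand-in for data.csv so the example runs without a file on disk.
csv_text = "\n".join(["id,score"] + [f"{i},{i * 10}" for i in range(12)])
mydata = pd.read_csv(io.StringIO(csv_text))

print(mydata.head())      # first five rows
print(mydata.tail())      # last five rows
print(mydata.describe())  # count, mean, std, min, quartiles, max per numeric column

scores = mydata["score"]  # select a single column by name (returns a Series)
subset = mydata[5:10]     # rows at positions 5 through 9; the end is exclusive
print(subset)
```

Note that mydata[5:10] slices by row position, like a Python list, so it returns five rows and does not include position 10.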
Thank You
