INTRODUCTION TO
DATA SCIENCE
❖ Data Science is about data gathering, analysis and decision making.
❖ Data Science is about finding patterns in data through
analysis and using them to make predictions about the future.
❖ By using Data Science, companies are able to make:
⮚ Better decisions (should we choose A or B)
⮚ Predictive analysis (what will happen next?)
⮚ Pattern discoveries (finding patterns, or hidden information, in the
data)
DATA SCIENCE
⮚ Statistics
⮚ Domain Expertise
⮚ Data Engineering
⮚ Visualization
⮚ Machine Learning
⮚ Advanced Computing
DATA SCIENCE COMPONENTS
❖ Statistics is an essential component of Data Science.
❖ It provides methods for collecting and analyzing large amounts of
numerical data to obtain useful and meaningful insights.
❖ There are two main categories of Statistics:
⮚ Descriptive Statistics
⮚ Inferential Statistics
STATISTICS
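The difference between the two categories can be shown with a minimal Python sketch. The sample values are made up for illustration and do not come from the slides, and the interval uses a normal approximation, which is a simplifying assumption.

```python
import math
import statistics

# Hypothetical sample: daily order counts observed over ten days.
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

# Descriptive statistics: summarize the data we actually have.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1)

# Inferential statistics: estimate something about the wider population.
# Rough 95% confidence interval for the population mean using the normal
# approximation (z = 1.96); a t-interval would be wider for n = 10.
std_error = stdev / math.sqrt(len(sample))
ci_low = mean - 1.96 * std_error
ci_high = mean + 1.96 * std_error

print(f"mean={mean}, median={median}, stdev={stdev:.2f}")
print(f"approx. 95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first three values describe only the ten observed days; the interval is a statement about days that were not observed, which is what makes it inferential.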
❖ Visualization means representing data in visual forms such as maps
and graphs so that people can understand it easily.
❖ It makes a vast amount of data accessible at a glance.
❖ The main goal of data visualization is to make it easier to identify
patterns, trends, and outliers in large data sets.
VISUALIZATION
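As a rough illustration of how a visual form makes an outlier stand out, here is a small text bar chart in Python. The sales figures are hypothetical, and a plotting library would normally be used instead; plain text keeps the sketch self-contained.

```python
# Hypothetical monthly sales figures; the June value is an obvious outlier.
sales = {"Jan": 20, "Feb": 22, "Mar": 21, "Apr": 23, "May": 22, "Jun": 48}


def bar_chart(data, scale=2):
    """Return one text bar per entry; each '#' stands for `scale` units."""
    lines = []
    for label, value in data.items():
        bar = "#" * (value // scale)
        lines.append(f"{label} | {bar} {value}")
    return lines


chart = bar_chart(sales)
print("\n".join(chart))
```

In the printed chart the June bar is roughly twice as long as the others, which is much harder to miss than the number 48 in a column of figures.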
❖ Machine Learning acts as a backbone for data science. It means
training a machine on data so that it can learn patterns and make
decisions, loosely in the way a human learns from experience.
❖ Various algorithms are used to solve problems. With the help
of Machine Learning, it becomes easier to make predictions about
future data.
For example:
❖ Social media platforms, such as Facebook
MACHINE LEARNING
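One of the simplest ways to "learn" from past data and predict a future value is a least-squares line fit. The sketch below is only an illustration of that idea; the data and the names `fit_line`, `spend`, and `units` are assumptions, not something taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: advertising spend -> units sold.
spend = [1, 2, 3, 4, 5]
units = [3, 5, 7, 9, 11]  # follows units = 2 * spend + 1 exactly

# "Training": estimate the parameters from past observations.
slope, intercept = fit_line(spend, units)

# "Prediction": apply the learned parameters to an unseen input.
predicted = slope * 6 + intercept
print(f"slope={slope}, intercept={intercept}, prediction for 6: {predicted}")
```

Real data is noisy, so the fitted line would only approximate the observations, and more flexible algorithms are often needed.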
❖ Domain expertise means the specialized knowledge or skills of a
particular area.
❖ There are various areas in data science for which we need domain
experts.
❖ The less we know about the problem, the more difficult it will be to
solve.
DOMAIN EXPERTISE
❖ Data Engineering involves acquiring, storing, retrieving, and
transforming the data.
❖ The key to understanding data engineering lies in the engineering
part.
❖ Data engineers design and build pipelines that transform and
transport data so that it reaches data scientists or other end users
in a highly usable format.
DATA ENGINEERING
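The four activities named on this slide (acquiring, transforming, storing, retrieving) can be sketched as a tiny pipeline using only the Python standard library. The CSV content, the `orders` table, and the function names are hypothetical; a production pipeline would read from real sources and write to a persistent store.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """customer,amount
alice, 10.50
bob,7
alice,4.5
"""


def acquire(text):
    """Acquire: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names and convert amounts to floats."""
    return [(r["customer"].strip().title(), float(r["amount"])) for r in rows]


def store(records):
    """Store: load the cleaned records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", records)
    conn.commit()
    return conn


def retrieve(conn):
    """Retrieve: total spend per customer, ready for an analyst."""
    query = (
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


totals = retrieve(store(transform(acquire(RAW_CSV))))
print(totals)
```

Keeping each stage as its own function means a stage can be tested or replaced without touching the others.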
❖ Advanced computing involves designing, writing, debugging, and
maintaining the source code of computer programs.
❖ Advanced computing capabilities are used to handle a growing
range of challenging science and engineering problems, many of
which are compute- and data-intensive.
ADVANCED COMPUTING
FACETS OF DATA
❖ Data science is focused on making sense of complex datasets and
on building predictive models from those data.
❖ As such, it encompasses a wide array of different activities, from the
upstream processes of acquiring, cleaning and integrating data to
downstream processes of analysis, modelling and prediction.
FACETS OF DATA
⮚ Structured
⮚ Unstructured
⮚ Natural Language
⮚ Machine-generated
⮚ Graph-based
⮚ Audio, video and images
⮚ Streaming
THE MAIN CATEGORIES OF
DATA
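To make some of these categories concrete, the sketch below shows how a few of them might look as Python values. Every value is invented for illustration, and audio, video, and images are left out because they are binary formats.

```python
# Structured: fixed fields, fits naturally in a table.
structured = [
    {"id": 1, "name": "Ana", "age": 34},
    {"id": 2, "name": "Raj", "age": 41},
]

# Unstructured / natural language: free text with no fixed schema.
natural_language = "The delivery was late, but support was helpful."

# Machine-generated: produced by a system without human intervention
# (a hypothetical web-server log line).
machine_generated = "2024-01-01T10:00:00Z GET /index.html 200"

# Graph-based: entities (nodes) and the relationships (edges) between them.
graph = {"Ana": ["Raj"], "Raj": ["Ana", "Li"], "Li": ["Raj"]}


def stream():
    """Streaming: values arrive one at a time rather than as a whole batch."""
    for reading in (21.5, 21.7, 22.0):
        yield reading


# Each category calls for a different kind of processing.
average_age = sum(row["age"] for row in structured) / len(structured)
word_count = len(natural_language.split())
status_code = int(machine_generated.split()[-1])
friends_of_raj = graph["Raj"]
readings = list(stream())
```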
There are many facets of data science, including:
⮚ Identifying the structure of data.
⮚ Cleaning, filtering, reorganizing, augmenting, and aggregating data.
⮚ Visualizing data.
⮚ Data analysis, statistics, and modelling.
⮚ Machine Learning.
⮚ Assembling data processing pipelines to link these steps.
⮚ Leveraging high-end computational resources for large-scale
problems.
FACETS OF DATA
DATA SCIENCE
PROCESS
⮚ Step 1: Frame the problem.
⮚ Step 2: Collect the raw data needed for your problem.
⮚ Step 3: Process the data for analysis.
⮚ Step 4: Explore the data.
⮚ Step 5: Perform in-depth analysis.
⮚ Step 6: Communicate results of the analysis.
DATA SCIENCE PROCESS
❖ The first thing you have to do before you solve a problem is to
define exactly what it is.
❖ You need to be able to translate data questions into something
actionable.
You should ask questions like the following:
❖ Who are the customers?
❖ Why are they buying our product?
❖ How do we predict if a customer is going to buy our product?
Frame the problem
❖ Once you have defined the problem, you need data to give you the
insights needed to turn the problem around with a solution.
❖ This part of the process involves thinking through what data you
need and finding ways to get that data, whether by querying internal
databases or by purchasing external datasets.
❖ You can export the CRM data to a CSV file for further analysis.
Collect the raw data needed
for your problem
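Querying an internal database and exporting the result to CSV might look like the sketch below. An in-memory SQLite database stands in for the real system, and the `customers` table and its columns are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for an internal database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", 34), (2, "Raj", 61)],
)

# Query only the data needed for the problem.
cursor = conn.execute("SELECT id, name, age FROM customers ORDER BY id")
header = [column[0] for column in cursor.description]

# Export to CSV. An in-memory buffer is used here; opening a file with
# open("export.csv", "w", newline="") would write to disk instead.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(cursor)
csv_text = buffer.getvalue()
print(csv_text)
```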
❖ Now that you have all of the raw data, you’ll need to process it
before you can do any analysis.
❖ Oftentimes, data can be quite messy, especially if it hasn’t been
well-maintained.
❖ You may see errors that would corrupt your analysis: values set to
null when they are really zero, duplicate values, and missing values.
❖ It’s up to you to go through and check your data to make sure you’ll
get accurate insights.
Process the data for analysis
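The three kinds of error mentioned on this slide can be handled with a short cleaning pass. This is a sketch under stated assumptions: the rows are hypothetical, and treating a null purchase count as zero is a decision that has to be confirmed with someone who knows the data.

```python
# Hypothetical raw rows: (customer_id, purchases). None marks a null value.
raw = [
    (1, 3),
    (2, None),  # null that really means "no purchases", i.e. zero
    (2, None),  # exact duplicate of the previous row
    (None, 5),  # missing customer id: cannot be attributed to anyone
    (3, 7),
]


def clean(rows):
    """Drop rows without an id, treat null purchases as 0, remove duplicates."""
    seen = set()
    cleaned = []
    for customer_id, purchases in rows:
        if customer_id is None:
            continue  # missing value we cannot repair
        record = (customer_id, 0 if purchases is None else purchases)
        if record not in seen:  # keep only the first copy of each record
            seen.add(record)
            cleaned.append(record)
    return cleaned


cleaned = clean(raw)
print(cleaned)
```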
❖ When your data is clean, you should start playing with it.
❖ The difficulty here isn’t coming up with ideas to test; it’s coming up
with ideas that are likely to turn into insights.
❖ You have to look at some of the most interesting patterns that can
help explain why sales are reduced for this group.
Explore the data
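A common first exploration step is to compare an average across groups and see which group stands out. The sketch below assumes hypothetical records of the form (age group, sales); the figures are invented.

```python
from collections import defaultdict

# Hypothetical cleaned records: (age_group, sales).
records = [
    ("18-35", 120), ("18-35", 100),
    ("36-55", 90), ("36-55", 110),
    ("56+", 40), ("56+", 60),
]


def mean_by_group(rows):
    """Average the sales value within each group."""
    grouped = defaultdict(list)
    for group, value in rows:
        grouped[group].append(value)
    return {group: sum(values) / len(values) for group, values in grouped.items()}


averages = mean_by_group(records)
lowest = min(averages, key=averages.get)  # group with the lowest average
print(averages, "lowest:", lowest)
```

A result like this only shows where sales are low, not why; it points to the group worth examining in more depth.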
❖ This step of the process is where you are going to have to apply your
statistical, mathematical and technological knowledge and leverage
all of the data science tools at your disposal to crunch the data and
find every insight.
❖ You can now combine all of those qualitative insights with data from
your quantitative analysis to craft a story that moves people to
action.
Perform in-depth analysis
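As one small example of applying statistical knowledge at this step, the sketch below computes a Pearson correlation coefficient between two variables. The data is hypothetical and deliberately perfectly linear, and a strong correlation on its own does not establish a cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical data: customer age and monthly spend.
ages = [25, 35, 45, 55, 65]
monthly_spend = [100, 90, 80, 70, 60]  # falls by 10 for every 10 years of age

r = pearson(ages, monthly_spend)
print(f"correlation between age and spend: {r:.2f}")
```

A value close to -1 indicates that spend falls steadily as age rises in this sample; values near 0 would indicate no linear relationship.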
❖ It’s important that the VP of Sales understands why the insights you
uncovered are important.
❖ Ultimately, you’ve been called upon to create a solution throughout
the data science process.
❖ Proper communication will mean the difference
between action and inaction on your proposals.
❖ You start by explaining the reasons behind the underperformance of
the older demographic.
Communicate results of the
analysis
DATASCIENCE.pptx
