R programming for Data Science
Part-1
www.dataspoof.info
Steps to do any Data Science project
• Identify the problem (question)
• Collect & prepare the data
• Explore the data
• Communicate the results
Data Collection
Data collection is the process of gathering information from a
specific source, which can be used to answer relevant questions
and evaluate outcomes.
Data can help us in:
• learning more about customers, items, products, etc.
• discovering trends in the current system, organization, etc.
• segmenting elements into different groups based on their individual needs.
• supporting the decision-making process to improve the quality of the system.
• improving the quality of the product or service based on the feedback obtained.
Data Sources
Data Format
Define Data
Data is a set of facts such as numbers, words, measurements, observations or descriptions
of things.
There are two types of data:
• Qualitative data: descriptive information (describes something).
• Quantitative data: numerical information (numbers).
Qualitative vs Quantitative
Types of Data Values
►Numeric:
•Discrete - integer values. Example: number of cars in the car park.
•Continuous - any value in a pre-defined range (float, double). Example: average mark (e.g., 63.4)
►Categorical: values are selected from a predefined number of categories.
•Ordinal - categories can be meaningfully ordered. Example: grades (A, B, C, D, E, F).
•Nominal - categories have no inherent order. Example: eye colours (blue, black, honey, etc.)
•Binary - the special case of nominal, with only 2 possible
categories. Example: binary value (1, 0)
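As a minimal sketch of how these value types might be represented in base R: the variable names and sample values below are made up for illustration, and treating a grade of A as the highest level is an assumption about the grading scale.

```r
# Discrete numeric: whole-number counts stored as integers (the L suffix marks an integer)
cars_in_park <- c(12L, 7L, 30L)

# Continuous numeric: any value in a range, stored as double-precision numbers
average_mark <- c(63.4, 71.0, 58.25)

# Ordinal categorical: an ordered factor, levels listed from lowest to highest
# (assumption: A is the best grade and F the worst)
grades <- factor(c("B", "A", "F", "C"),
                 levels = c("F", "E", "D", "C", "B", "A"),
                 ordered = TRUE)

# Nominal categorical: an unordered factor
eye_colour <- factor(c("blue", "black", "honey", "blue"))

# Binary: a logical vector (TRUE/FALSE behave like 1/0 in arithmetic)
passed <- c(TRUE, FALSE, TRUE)

class(cars_in_park)    # "integer"
class(average_mark)    # "numeric"
grades[1] > grades[4]  # TRUE, because B ranks above C
nlevels(eye_colour)    # 3
sum(passed)            # 2
```

Comparisons such as `>` are only defined for the ordered factor; applying them to `eye_colour` yields `NA` with a warning, which mirrors the ordinal/nominal distinction above.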
Types of Data Values (continued)
►Date: datetime, timestamp. Example: 11.10.2018.
►Text: Multidimensional data
►Time series: Data points indexed in time order
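These three kinds of values could be handled in base R roughly as follows. This is a sketch only: it assumes the example date is written as day.month.year, and the monthly `sales` figures are invented for illustration.

```r
# Date: parse the slide's example, assuming day.month.year order
d <- as.Date("11.10.2018", format = "%d.%m.%Y")
format(d, "%Y-%m-%d")   # "2018-10-11"
d + 30                  # date arithmetic works in days: 2018-11-10

# Text: a character string, split into words on spaces
note  <- "Data points indexed in time order"
words <- strsplit(note, " ")[[1]]
length(words)           # 6

# Time series: monthly observations starting January 2018 (made-up values)
sales <- ts(c(10, 12, 9, 14, 15, 13), start = c(2018, 1), frequency = 12)

# Select March to April 2018 by time index rather than by position
spring <- window(sales, start = c(2018, 3), end = c(2018, 4))
as.numeric(spring)      # 9 14
```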
Types of Data Category
There are two main categories:
• Experimental data: Data collected from strictly controlled/designed experiments, with efforts made to ensure statistical validity.
Examples: medical clinical trials, election polls.
• Observational data: Data collected from 'real-world' settings without control over the underlying phenomena being captured. It is easier to collect and obtain, but results and conclusions drawn from such data may be biased or inconclusive.
Examples: almost all data used in data mining, business analytics and data science is observational data.
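One way to picture the "controlled" part of experimental data is random assignment: the analyst, not the participant, decides who lands in which group. The sketch below is illustrative only; the participant IDs and the even treatment/control split are assumptions, not something the slides specify.

```r
set.seed(42)  # fixed seed so the random assignment is reproducible

# Hypothetical participant IDs
participants <- paste0("P", 1:8)

# Experimental design: shuffle four "treatment" and four "control" labels
group <- sample(rep(c("treatment", "control"), each = 4))

design <- data.frame(participant = participants, group = group)

# Each group has exactly four members, whatever the shuffle order
table(design$group)
```

With observational data there is no such assignment step: the group labels arrive with the data, which is why conclusions drawn from it may be biased.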
Various Data Types
• Numbers
• String
• Relational data
• Factors or categorical variables
• Dates and times
• Description
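The data types listed above can appear together in a pair of related tables. The following sketch uses base R data frames; the table names, columns, and values are all invented, and "relational data" is read here as tables linked by a shared key column.

```r
# One table per entity; customer_id is the key that relates them
customers <- data.frame(
  customer_id = c(1L, 2L, 3L),                                    # numbers
  name        = c("Asha", "Ben", "Chen"),                         # strings
  segment     = factor(c("retail", "wholesale", "retail")),       # factor
  joined      = as.Date(c("2018-10-11", "2019-01-05", "2020-06-30")),  # dates
  stringsAsFactors = FALSE
)

orders <- data.frame(
  order_id    = c(101L, 102L, 103L),
  customer_id = c(1L, 1L, 3L),
  amount      = c(25.5, 40.0, 12.75)
)

# Inspect the type of each column
col_types <- sapply(customers, function(col) class(col)[1])
col_types

# Relational step: join orders to customers on the shared key
orders_with_names <- merge(orders, customers, by = "customer_id")
orders_with_names
```

`merge()` performs an inner join by default, so customer 2, who has no orders, does not appear in the result.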
