What is error?
• Error (statistical error) describes the
difference between a value obtained from a
data collection process and the 'true' value
for the population.
• The greater the error, the less representative
the data are of the population.
Why does error matter?
• The greater the error, the less reliable are the results
of the study.
• A credible data source will have measures in place
throughout the data collection process to minimise the
amount of error, and will also be transparent about
the size of the expected error so that users can decide
whether the data are 'fit for purpose'.
Data can be affected by two types
of error:
• Sampling Error
• Non-sampling Error
SAMPLING ERROR
• Sampling error occurs solely as a result of using
a sample from a population, rather than conducting a
census (complete enumeration) of the population.
• It refers to the difference between an estimate for a
population based on data from a sample and the 'true'
value for that population which would result if
a census were taken.
• Sampling errors do not occur in a census, as the
census values are based on the entire population.
• Sampling error can be measured and controlled
in random samples where each unit has a chance of
selection, and that chance can be calculated.
• In general, increasing the sample size will reduce the
sampling error.
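The effect of sample size can be sketched with a small simulation. This is only an illustration: the population, the sample sizes (25 and 400), the seeds and the function name spread_of_sample_means are all made up for the example and do not come from the slides.

```python
import random
import statistics

def spread_of_sample_means(population, n, trials=2000, seed=0):
    """Standard deviation of the sample mean over repeated simple random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(trials)]
    return statistics.pstdev(means)

# Hypothetical population: 10,000 values with mean about 50 and standard deviation about 10.
_population_rng = random.Random(42)
population = [_population_rng.gauss(50, 10) for _ in range(10_000)]

small = spread_of_sample_means(population, n=25)
large = spread_of_sample_means(population, n=400)
print(f"spread of sample means, n=25:  {small:.2f}")
print(f"spread of sample means, n=400: {large:.2f}")
```

The estimates from the larger samples cluster more tightly around the population mean, which is what "reduced sampling error" means in practice.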
Sampling error can occur when:
• The proportions of different characteristics within
the sample are not similar to the proportions of
the characteristics for the whole population (e.g. if
we are taking a sample of men and women and we
know that 51% of the total population are women
and 49% are men, then we should aim to have
similar proportions in our sample);
• The sample is too small to accurately represent the
population; and
• The sampling method is not random.
NON-SAMPLING ERROR
• Non-sampling error is caused by factors
other than those related to sample selection.
• It refers to the presence of any factor,
whether systematic or random, that results in
the data values not accurately reflecting the
'true' value for the population.
• Non-sampling error can occur at any stage of
a census or sample study, and is not easily
identified or quantified.
Non-sampling Error Can Include :
• Coverage error: this occurs when a unit in the sample is incorrectly excluded or
included, or is duplicated in the sample (e.g. a field interviewer fails to interview
a selected household or some people in a household).
• Non-response error: this refers to the failure to obtain a response from
some unit because of absence, non-contact, refusal, or some other reason. Non-
response can be complete non-response (i.e. no data has been obtained at all
from a selected unit) or partial non-response (i.e. the answers to some
questions have not been provided by a selected unit).
• Response error: this refers to a type of error caused
by respondents intentionally or accidentally providing inaccurate responses.
This occurs when concepts, questions or instructions are not clearly understood
by the respondent; when there are high levels of respondent burden and
memory recall required; and because some questions can result in a tendency
to answer in a socially desirable way (giving a response which they feel is more
acceptable rather than being an accurate response).
• Interviewer error: this occurs when interviewers incorrectly record information;
are not neutral or objective; influence the respondent to answer in a particular
way; or assume responses based on appearance or other characteristics.
• Processing error: this refers to errors that occur in the process of data
collection, data entry, coding, editing and output.
Why do we measure error?
• Error is expected in a data collection process,
particularly if the data is obtained from
a sample survey. Although non-sampling error
is difficult to measure, sampling error can be
measured to give an indication of the accuracy
of any estimate value for the population. This
assists users to make informed decisions
about whether the statistics are suited to their
needs.
How do we measure error?
• Two common measures of error are: standard error and the
relative standard error.
• Standard Error (SE) is a measure of the variation between
an estimated population value that is based on a sample
and the true value for the population.
• The SE of an estimate is a measure of the average magnitude of
the difference between the sample estimate and the population
parameter, taken over all possible samples from the
population.
• It is important to consider the Standard Error as it affects
the accuracy of the estimates and, therefore, the
importance that can be placed on the interpretations drawn
from the data.
• SE is the standard deviation of the sampling
distribution of an estimate.
• The standard error of the mean (SEM) can be
expressed as:
SEM = s / √n
where
s is the sample standard deviation (an estimate of the
standard deviation of the population).
n is the size (number of observations) of the sample.
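A minimal sketch of the SEM calculation, assuming the usual formula SEM = s / √n with s taken as the sample standard deviation (n - 1 denominator). The function name and the observations are made up for illustration.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """SEM = s / sqrt(n), where s is the sample standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations
print(round(standard_error_of_mean(sample), 4))
```

If the population standard deviation were known, it could be used in place of s; statistics.stdev is used here because in most surveys only the sample is available.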
• Relative Standard Error (RSE) is the standard
error expressed as a proportion of an
estimated value. It is usually displayed as a
percentage. RSEs are a useful measure as they
provide an indication of the relative size of the
error likely to have occurred due to sampling.
A high RSE indicates less confidence that an
estimated value is close to the true population
value.
Standard Error v/s Relative Standard
Error
• The Standard Error measure indicates the extent
to which a survey estimate is likely to deviate
from the true population value and is expressed as a
number.
• The Relative Standard Error (RSE) is the standard
error expressed as a fraction of the estimate and
is usually expressed as a percentage.
• Estimates with an RSE of 25% or greater are
subject to high sampling error and should be
used with caution.
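The RSE and the 25% rule of thumb above can be sketched as follows. The function names and the example figures are hypothetical; only the definition (SE as a percentage of the estimate) and the 25% threshold come from the slides.

```python
def relative_standard_error(estimate, standard_error):
    """RSE as a percentage: 100 * SE / |estimate|."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)

def needs_caution(rse_percent, threshold=25.0):
    """True when the RSE is at or above the threshold (25% by default)."""
    return rse_percent >= threshold

for estimate, se in [(200, 10), (100, 30)]:  # made-up estimates and standard errors
    rse = relative_standard_error(estimate, se)
    print(f"estimate={estimate}, SE={se}, RSE={rse:.1f}%, caution={needs_caution(rse)}")
```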
PROBABLE ERROR
• In statistics, probable error defines the half-
range of an interval about a central point for the
distribution, such that half of the values from the
distribution will lie within the interval and half
outside.
• As a measure of the error of an estimate for a sample
from a normal distribution, it is computed by
multiplying the standard error by 0.67449 (often rounded to 0.6745).
• Thus for a symmetric distribution, it is equivalent
to half the interquartile range, or the median
absolute deviation.
PE = 0.67449 (SE)
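Where the factor 0.67449 comes from can be checked numerically: it is the 75th percentile of the standard normal distribution, so that half of the values fall within ±PE of the centre. The sketch below assumes a normal distribution; the names PE_FACTOR and probable_error are made up for the example.

```python
from statistics import NormalDist

# 75th percentile of the standard normal distribution, approximately 0.67449.
PE_FACTOR = NormalDist().inv_cdf(0.75)

def probable_error(standard_error):
    """PE = 0.67449 * SE for a normally distributed estimate."""
    return PE_FACTOR * standard_error

se = 2.0  # made-up standard error
pe = probable_error(se)

# Probability that a normal variable with standard deviation `se` lies within +/- PE of its mean.
dist = NormalDist(mu=0.0, sigma=se)
inside = dist.cdf(pe) - dist.cdf(-pe)
print(f"factor={PE_FACTOR:.5f}, PE={pe:.4f}, probability inside={inside:.3f}")
```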
PROBABLE ERROR OF COEFFICIENT OF
CORRELATION
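• It is a measure for testing the reliability of an
observed value of the coefficient of correlation. It
depends on the condition of random sampling.
• The coefficient of correlation is represented by "r"; its
probable error is PE(r) = 0.6745 (1 − r²) / √n, where n is
the number of pairs of observations.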
What can measures of error tell us?
• The standard error can be used to construct a
confidence interval.
A confidence interval is a range in which it is
estimated the true population value lies.
• Confidence intervals of different sizes can be created
to represent different levels of confidence that the
true population value will lie within a particular range.
• A common confidence interval used in statistics is the
95% confidence interval. In a 'normal distribution', the
95% confidence interval extends approximately two standard
errors (more precisely, 1.96) either side of the estimate.
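A confidence interval built from a standard error can be sketched as below, assuming the estimate is approximately normally distributed. The multiplier is taken from the normal distribution (about 1.96 for 95%, which is where the "two standard errors" rule of thumb comes from). The function name and figures are illustrative only.

```python
from statistics import NormalDist

def confidence_interval(estimate, standard_error, level=0.95):
    """Return (lower, upper) for a normal-theory confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    margin = z * standard_error
    return estimate - margin, estimate + margin

lower, upper = confidence_interval(100.0, 5.0)  # made-up estimate and SE
print(f"95% CI: ({lower:.2f}, {upper:.2f})")

lower99, upper99 = confidence_interval(100.0, 5.0, level=0.99)
print(f"99% CI: ({lower99:.2f}, {upper99:.2f})")
```

A higher confidence level gives a wider interval for the same standard error.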
SIGNIFICANCE OF PROBABLE ERROR
• Can be used for determining the limits within which
the coefficient of correlation of the population is
expected to be located (r ± PE)
• It is used to test whether an observed value of the sample
correlation coefficient indicates a significant
correlation in the population
• If r < PE, then the correlation is insignificant
• If r > 6PE, then r is significant
• If r lies between PE and 6PE, no firm conclusion can be
drawn (the sample may be too small for any estimation)
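The decision rule above can be sketched in code. The slides do not give the formula for the probable error of r; the sketch assumes the standard textbook form PE(r) = 0.6745 (1 − r²) / √n, with n the number of pairs of observations, and treats values between PE and 6PE as inconclusive. Function names and example values are hypothetical.

```python
import math

def probable_error_of_r(r, n):
    """Assumed textbook formula: PE(r) = 0.6745 * (1 - r^2) / sqrt(n)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if n < 1:
        raise ValueError("n must be positive")
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def classify_correlation(r, n):
    """Apply the PE rule of thumb to the magnitude of r."""
    pe = probable_error_of_r(r, n)
    if abs(r) < pe:
        return "insignificant"
    if abs(r) > 6 * pe:
        return "significant"
    return "inconclusive"

for r, n in [(0.8, 25), (0.03, 25), (0.5, 16)]:  # made-up values
    print(r, n, round(probable_error_of_r(r, n), 4), classify_correlation(r, n))
```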
Type I And Type II Errors
• In statistical hypothesis testing, a type I
error is the incorrect rejection of a true null
hypothesis (H0) (also known as a "false
positive" finding), while a type II error is
incorrectly retaining a false null hypothesis
(also known as a "false negative" finding).
• More simply stated, a type I error is to falsely
infer the existence of something that is not
there, while a type II error is to falsely infer
the absence of something that is.
• A type I error (or error of the first kind) is the incorrect
rejection of a true null hypothesis.
• Usually a type I error leads one to conclude that a supposed
effect or relationship exists when in fact it doesn't.
• H0 is true but is rejected
• Let the probability of making a type I error (rejecting a true H0) = α
• Then the probability of correctly retaining a true H0 = 1 − α
• Examples of type I errors:
• a test that shows a patient to have a disease when in fact the
patient does not have the disease,
• a fire alarm going off, indicating a fire, when in fact there is no
fire, or
• an experiment indicating that a medical treatment should
cure a disease when in fact it does not.
• A type II error (or error of the second kind) is the
failure to reject a false null hypothesis.
• Similarly, the probability of making a type II error = β
• Examples of type II errors –
a. a blood test failing to detect the disease it was
designed to detect, in a patient who really has the
disease;
b. a fire breaking out and the fire alarm failing to
ring; or
c. a clinical trial of a medical treatment failing to
show that the treatment works when really it does
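The two error probabilities (written a and b on the slides, conventionally α and β) can be estimated by simulation. The sketch below is an illustration under stated assumptions: a two-sided z-test of H0: mean = 0 with known standard deviation, samples of size 25, and a hypothetical true mean of 0.5 for the case where H0 is false. None of these settings come from the slides.

```python
import math
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated samples in which H0: mean = 0 is rejected (two-sided z-test)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = fmean(sample) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a type I error, so the rate should be close to alpha.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false (true mean 0.5): every failure to reject is a type II error.
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated type I error rate:  {type_i_rate:.3f}")
print(f"estimated type II error rate: {type_ii_rate:.3f}")
```

The type II rate depends on how far the true mean is from the hypothesised value and on the sample size; with a larger sample or a larger true effect it shrinks.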
LEVEL OF SIGNIFICANCE
• Statistical tests fix the probability of committing
a type I error at a certain level, called the level of
significance (LOS).
• If the calculated probability (p-value) is less than the LOS, the
null hypothesis is rejected; otherwise it is not rejected.
• Two commonly used LOS are:
• 1% LOS and 5% LOS
• Simply, LOS is the chance of making a type I error
• If we choose a 5% LOS, it implies that in about 5 out of 100
cases we are likely to reject a correct H0
• Example: if α = 0.05 the probability of making a type I error
is 5%, and when α = 0.01 the probability of making
a type I error is 1%
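How the level of significance is used in a decision can be sketched as follows. The example assumes a two-sided z-test and a made-up test statistic of z = 2.5; the function names are hypothetical. "Fail to reject" is used rather than "accept" because a non-significant result does not prove H0.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, los=0.05):
    """Reject H0 when the p-value is below the level of significance."""
    return "reject H0" if p_value < los else "fail to reject H0"

p = two_sided_p_value(2.5)  # made-up test statistic
print(f"p-value = {p:.4f}")
print("at 5% LOS:", decide(p, los=0.05))
print("at 1% LOS:", decide(p, los=0.01))
```

The same result is significant at the 5% level but not at the 1% level, which is why the LOS should be chosen before looking at the data.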

2_Errors in Experimental Observations_ML.ppt
