UNIT-2
Regression, Probability & Parametric Test
Regression: Curve fitting by the method of least squares, fitting the lines y= a +
bx and x = a + by, Multiple regression, standard error of regression
Probability: Definition of probability, Binomial distribution, Normal
distribution, Poisson’s distribution, properties - problems Sample, Population,
large sample, small sample, Null hypothesis, alternative hypothesis, sampling,
essence of sampling, types of sampling, Error-I type, Error-II type, Standard
error of mean (SEM) - Pharmaceutical examples
Parametric test: t-test (Sample, Pooled or Unpaired and Paired), ANOVA
(One way and Two way), Least Significance difference
BINOMIAL DISTRIBUTION
The binomial distribution gives the probability of exactly x successes in n independent
trials. It is calculated by multiplying the number of ways of choosing x successes out of
n trials by the probability of success raised to the power of the number of successes and
the probability of failure raised to the power of the number of failures (the number of
trials minus the number of successes).
$P(X = x) = {}^{n}C_{x}\, p^{x} q^{n-x}$
OR
$P(X = x) = \frac{n!}{(n-x)!\, x!}\, p^{x} (1-p)^{n-x}$
where
n = the number of trials
x = the number of successes desired
p = probability of success
q = probability of failure (q = 1 - p)
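The binomial formula can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name `binomial_pmf` and the tablet example (10 tablets, each defective with probability 0.1) are hypothetical and chosen only for illustration.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x): probability of exactly x successes in n independent trials."""
    q = 1.0 - p  # probability of failure
    return comb(n, x) * p**x * q ** (n - x)


# Hypothetical example: 10 tablets are tested, each defective with probability 0.1.
n, p = 10, 0.1
prob_two_defective = binomial_pmf(2, n, p)

# The probabilities over all possible values of x (0..n) must add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))

print(f"P(X = 2) = {prob_two_defective:.4f}")
print(f"Sum over all x = {total:.4f}")
```

Here `comb(n, x)` plays the role of nCx, so the function follows the formula term by term.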
Unit-2 Biostatistics Probability Definition
Properties of Binomial Distribution
1. Binomial distribution has a fixed number of independent trials; i.e., n.
2. In each trial, there are only two outcomes, success or failure.
3. The probability of success (p) remains constant across all trials.
4. Each trial is independent, with no impact on others.
5. It is a discrete probability distribution with specific, countable values.
6. The probability mass function (PMF) gives the probability of exactly 'x' successes in 'n' trials.
7. Mean (µ) equals np, and Variance (σ²) equals npq.
8. The shape of the binomial curve varies based on 'n' and 'p,' tending towards symmetry with larger 'n.'
9. For large 'n,' it approximates a normal distribution (Central Limit Theorem).
10. Cumulative Distribution Function (CDF) finds cumulative probabilities for ≤ 'x' successes.
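Properties 7 and 10 can be verified by direct summation. The sketch below is illustrative only: the helper names and the values n = 20, p = 0.3 are assumptions, not taken from the slides.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): cumulative probability, the sum of the PMF from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


# Hypothetical parameters chosen for illustration.
n, p = 20, 0.3

# Mean and variance computed from the definition of expectation.
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

print(f"mean = {mean:.4f}, np = {n * p:.4f}")
print(f"variance = {variance:.4f}, npq = {n * p * (1 - p):.4f}")
print(f"P(X <= 5) = {binomial_cdf(5, n, p):.4f}")
```

The summed mean and variance agree with np and npq up to floating-point rounding.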
POISSON DISTRIBUTION
The Poisson distribution is a discrete probability distribution that
calculates the likelihood of a certain number of events happening in a fixed
time or space, assuming the events occur independently and at a constant rate.
Poisson distribution is characterized by a single parameter, lambda (λ), which
represents the average rate of occurrence of the events. The probability mass
function of the Poisson distribution is given by:
$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
where
• P(X = x) is the probability of observing x events
• e is the base of the natural logarithm (approximately 2.71828)
• λ is the average rate of occurrence of events
• x is the number of events that occur
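As a quick numerical illustration of the Poisson probability mass function, here is a short Python sketch. The function name `poisson_pmf` and the rate λ = 2 are hypothetical choices for the example.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = e^(-lam) * lam^x / x! for a Poisson distribution with rate lam."""
    return exp(-lam) * lam**x / factorial(x)


# Hypothetical example: on average 2 events occur per interval (lam = 2).
lam = 2.0
p0 = poisson_pmf(0, lam)  # probability of no events
p_at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))  # P(X <= 2)

print(f"P(X = 0) = {p0:.4f}")
print(f"P(X <= 2) = {p_at_most_2:.4f}")
```

Because λ is the only parameter, changing `lam` is all that is needed to model a different average rate.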
[Figure: bell-shaped graph]
CONTINUOUS DISTRIBUTION
UNIFORM
In probability theory and statistics, the continuous uniform distributions, or
rectangular distributions, are a family of symmetric probability distributions.
EXPONENTIAL
The exponential distribution is a probability distribution that models the time
between events that happen continuously and independently at a constant rate.
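The slide gives no formula for the exponential distribution. As a hedged illustration, the sketch below assumes the standard cumulative form P(T ≤ t) = 1 − e^(−rate·t) for a waiting time T; the function name and the rate of 2 events per hour are hypothetical.

```python
from math import exp


def exponential_cdf(t: float, rate: float) -> float:
    """P(T <= t): probability that the waiting time is at most t."""
    if t < 0:
        return 0.0  # waiting times cannot be negative
    return 1.0 - exp(-rate * t)


# Hypothetical example: events occur at a constant rate of 2 per hour.
rate = 2.0
mean_wait = 1.0 / rate  # the mean waiting time is 1 / rate
p_within_mean = exponential_cdf(mean_wait, rate)

print(f"mean waiting time = {mean_wait} h")
print(f"P(T <= mean) = {p_within_mean:.4f}")
```

The probability of waiting no longer than the mean is 1 − e^(−1) whatever the rate, which reflects the right-skewed shape of the distribution.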
NORMAL
The normal distribution is the most common or “normal” form of distribution of
random variables, hence the name “normal distribution.” It is also called the
Gaussian distribution in statistics and probability. We use this distribution to
represent a large number of random variables. It serves as a foundation for
statistics and probability theory.
Properties of Normal Distribution
• Symmetry: The normal distribution is symmetric around its mean. This means the
left side of the distribution mirrors the right side.
• Mean, Median, and Mode: In a normal distribution, the mean, median, and
mode are all equal and located at the center of the distribution.
• Bell-shaped Curve: The curve is bell-shaped, indicating that most of the
observations cluster around the central peak, and the probabilities for values further
away from the mean taper off equally in both directions.
• Standard Deviation: The spread of the distribution is determined by the standard
deviation. About 68% of the data falls within one standard deviation of the mean,
95% within two standard deviations, and 99.7% within three standard deviations.
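The 68%, 95% and 99.7% figures can be reproduced from the error function. This is a minimal sketch that assumes the standard identity P(|X − µ| < kσ) = erf(k/√2) for a normal variable; the function name is hypothetical.

```python
from math import erf, sqrt


def prob_within_k_sd(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {prob_within_k_sd(k) * 100:.1f}%")
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to every normal distribution.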
Studying the graph, it is clear that the Empirical Rule divides the data broadly
into three parts. Thus, the empirical rule is also called the “68 – 95 – 99.7” rule.
SAMPLING
Sampling is the selection of a subset, or statistical sample (termed sample for
short), of individuals from within a statistical population to estimate
characteristics of the whole population.
Sampling has lower costs and faster data collection compared to recording data
from the entire population (in many cases, collecting the whole population is
impossible, like getting sizes of all stars in the universe), and thus, it can
provide insights in cases where it is infeasible to measure an entire population.
Sampling Methods
Within any type of sampling frame, a variety of sampling methods can be
employed individually or in combination. Factors commonly influencing the choice
between these designs include:
• Nature and quality of the frame
• Availability of auxiliary information about units on the frame
• Accuracy requirements, and the need to measure accuracy
• Whether detailed analysis of the sample is expected
• Cost/operational concerns
SAMPLING DESIGN
Components of a sampling design:
• Universal sampling
• Sampling unit
• Sampling frame
• Sample size
• Sampling method
• Budget
SAMPLING METHODS/TECHNIQUES
PROBABILITY
• Simple sampling
• Stratified sampling
• Systematic sampling
• Multi-stage sampling
• Multi-phase sampling
• Cluster sampling
NON-PROBABILITY
• Purposive or judgement
• Convenience
• Quota
• Snowball
• Consecutive
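Three of the probability sampling methods can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the batch of 100 numbered tablets and the two production-line strata are hypothetical, and a fixed seed is used so the output is repeatable.

```python
import random


def simple_random_sample(population, k, seed=0):
    """Every unit has the same chance of selection."""
    return random.Random(seed).sample(population, k)


def systematic_sample(population, k, seed=0):
    """Pick a random starting unit, then take every step-th unit after it."""
    step = len(population) // k
    start = random.Random(seed).randrange(step)
    return [population[start + i * step] for i in range(k)]


def stratified_sample(strata, fraction, seed=0):
    """Draw a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, k))
    return sample


# Hypothetical population: tablets numbered 1..100 from two production lines.
batch = list(range(1, 101))
strata = {"line_A": batch[:60], "line_B": batch[60:]}

srs = simple_random_sample(batch, 10)
sys_s = systematic_sample(batch, 10)
strat = stratified_sample(strata, 0.1)

print("simple random:", sorted(srs))
print("systematic:   ", sys_s)
print("stratified:   ", sorted(strat))
```

The stratified sample keeps the 60:40 split of the two lines, which simple random sampling does not guarantee.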
HYPOTHESIS
The null and alternative hypotheses are two competing claims that
researchers weigh evidence for and against using a statistical test.
• NULL HYPOTHESIS (H0): There is no effect on the population.
• ALTERNATE HYPOTHESIS (H1 or Ha): There is an effect on the population.
The effect is usually the effect of the independent variable on the
dependent variable.
The null and alternative are always claims about the population.
That’s because the goal of hypothesis testing is to make inferences about a population based on
a sample.
Often, we infer whether there’s an effect in the population by looking at differences between groups
or relationships between variables in the sample. It’s critical for your research to write strong
hypotheses.
You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis.
Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis.
However, the hypotheses can also be phrased in a general way that applies to any test.
NULL HYPOTHESIS
Claim that there is no effect in the population
If the sample provides enough evidence against the claim that there’s no effect in the
population (p ≤ α), then we can reject the null hypothesis. Otherwise, we fail to reject the null
hypothesis.
Null hypotheses often include phrases such as “no effect,” “no difference,” or “no
relationship.” When written in mathematical terms, they always include an equality (usually
=, but sometimes ≥ or ≤).
You can never know with complete certainty whether there is an effect in the population.
Some percentage of the time, your inference about the population will be incorrect. When
you incorrectly reject the null hypothesis, it’s called a type I error. When you
incorrectly fail to reject it, it’s a type II error.
NULL HYPOTHESIS
Ex. Does the amount of text highlighted in the textbook affect exam scores?
The amount of text highlighted in the textbook has no effect on exam scores.
Ex. Does daily meditation decrease the incidence of depression?
Daily meditation does not decrease the incidence of depression.
ALTERNATE HYPOTHESIS
The alternative hypothesis (Ha) is the other answer to your research question. It
claims that there’s an effect on the population.
Often, your alternative hypothesis is the same as your research hypothesis. In
other words, it’s the claim that you expect or hope will be true.
The alternative hypothesis is the complement to the null hypothesis. Null and
alternative hypotheses are exhaustive, meaning that together they cover every
possible outcome. They are also mutually exclusive, meaning that only one can
be true at a time.
If you reject the null hypothesis, you can say that the alternative
hypothesis is supported. On the other hand, if you fail to reject the null
hypothesis, then you can say that the alternative hypothesis is not
supported. Never say that you’ve proven or disproven a hypothesis.
Alternative hypotheses often include phrases such as “an effect,” “a
difference,” or “a relationship.” When alternative hypotheses are written in
mathematical terms, they always include an inequality (usually ≠, but
sometimes < or >).
Ex. Does daily meditation decrease the incidence of depression?
Daily meditation decreases the incidence of depression.
SIMILARITIES AND DIFFERENCES BETWEEN
NULL AND ALTERNATIVE HYPOTHESES
• They’re both answers to the research question.
• They both make claims about the population.
• They’re both evaluated by statistical tests.
Null hypotheses
• Definition: A claim that there is no effect in the population.
• Symbol: H0
• Typical phrases: No effect, No difference, No relationship, No change, Does not
increase, Does not decrease
• Mathematical symbol: Equality symbol (=, ≥, or ≤)
Alternative hypotheses
• Definition: A claim that there is an effect in the population.
• Symbol: H1 or Ha
• Typical phrases: An effect, A difference, A relationship, A change, Increases,
Decreases
• Mathematical symbol: Inequality symbol (≠, <, or >)
Does the independent variable affect the dependent variable?
• Null hypothesis (H0): Independent variable does not affect dependent variable.
• Alternative hypothesis (Ha): Independent variable affects dependent variable.
Statistical test: One-way ANOVA with two groups
• Null hypothesis: The mean dependent variable does not differ between group 1 (µ1)
and group 2 (µ2) in the population; µ1 = µ2.
• Alternative hypothesis: The mean dependent variable differs between group 1 (µ1)
and group 2 (µ2) in the population; µ1 ≠ µ2.
Statistical test: One-way ANOVA with three groups
• Null hypothesis: The mean dependent variable does not differ between group 1 (µ1),
group 2 (µ2), and group 3 (µ3) in the population; µ1 = µ2 = µ3.
• Alternative hypothesis: The mean dependent variables of group 1 (µ1), group 2 (µ2),
and group 3 (µ3) are not all equal in the population.
ERROR
In statistics, a Type I error is a false positive conclusion, while a Type II error is a
false negative conclusion.
Making a statistical decision always involves uncertainties, so the risks of making these
errors are unavoidable in hypothesis testing.
The probability of making a Type I error is the significance level, or alpha (α), while the
probability of making a Type II error is beta (β). These risks can be minimized through
careful planning in your study design.
Example: Type I vs Type II error
You decide to get tested for COVID-19 based on mild symptoms. There are
two errors that could potentially occur:
• Type I error (false positive): the test result says you have coronavirus,
but you actually don’t.
• Type II error (false negative): the test result says you don’t have
coronavirus, but you actually do.
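The claim that the Type I error probability equals α can be illustrated by simulation. The sketch below is a rough illustration under stated assumptions: it uses a two-sided z test with known standard deviation (not a t test) to stay within the standard library, and the sample size, number of trials and seed are arbitrary choices.

```python
import random
from math import erf, sqrt


def two_sided_z_p_value(sample, mu0, sigma):
    """p value of a two-sided z test of H0: population mean = mu0 (sigma known)."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))  # standard normal CDF at |z|
    return 2 * (1 - phi)


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Fraction of experiments that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data are generated with mean 0, so the null hypothesis is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_z_p_value(sample, 0.0, 1.0) <= alpha:
            rejections += 1  # a false positive (Type I error)
    return rejections / trials


rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f} (alpha = 0.05)")
```

The observed rate should land close to 0.05; lowering `alpha` reduces false positives but, for a fixed sample size, raises the chance of a Type II error.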
Parametric test: t-test (Sample, Pooled or Unpaired and Paired), ANOVA (One way and
Two way), Least Significance difference
T-TEST
A t test is a statistical test that is used to compare the means of two groups.
It is often used in hypothesis testing to determine whether a process or
treatment actually has an effect on the population of interest, or whether
two groups are different from one another.
A t test can only be used when comparing the means of two groups (a.k.a.
pairwise comparison). If you want to compare more than two groups, or if
you want to do multiple pairwise comparisons, use an ANOVA test or a
post-hoc test.
The t test is a parametric test of difference, meaning that it makes the same
assumptions about your data as other parametric tests. The t test assumes your data:
1.are independent
2.are (approximately) normally distributed
3.have a similar amount of variance within each group being compared (a.k.a.
homogeneity of variance)
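If your data do not fit these assumptions, you can try a nonparametric alternative to
the t test, such as the Wilcoxon Signed-Rank test (for paired data) or the Mann-Whitney
U test (for two independent groups). If only the equal-variance assumption fails,
Welch's t test is a common alternative.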
Which t test to use? When choosing a t test, you need to consider two things: whether
the groups being compared come from a single population or two different populations,
and whether you want to test the difference in a specific direction.
One-sample, two-sample, or paired t test?
• If the groups come from a single population (e.g., measuring before and after an
experimental treatment), perform a paired t test. This is a within-subjects design.
• If the groups come from two different populations (e.g., two different species, or
people from two separate cities), perform a two-sample t test (a.k.a. independent
t test). This is a between-subjects design.
• If there is one group being compared against a standard value (e.g., comparing the
acidity of a liquid to a neutral pH of 7), perform a one-sample t test.
One-tailed or two-tailed t test?
• If you only care whether the two populations are different from one
another, perform a two-tailed t test.
• If you want to know whether one population mean is greater than or
less than the other, perform a one-tailed t test.
Performing a t-test
The t test estimates the true difference between two group means using the ratio of the difference in group means over the
pooled standard error of both groups. You can calculate it manually using a formula, or use statistical analysis software.
The formula for the two-sample t test (a.k.a. the Student’s t-test) is:
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
In this formula, t is the t value, $\bar{x}_1$ and $\bar{x}_2$ are the means of the two groups being compared, $s^2$ is
the pooled variance of the two groups (from which the pooled standard error in the denominator is computed), and $n_1$ and $n_2$ are the number of observations in each of the groups.
A larger absolute t value shows that the difference between group means is large relative to the pooled standard error, indicating a more
significant difference between the groups.
You can compare your calculated t value against the values in a critical value chart (e.g., Student’s t table) to determine
whether your t value is greater than what would be expected by chance. If so, you can reject the null hypothesis and
conclude that the two groups are in fact different.
Most statistical software (R, SPSS, etc.) includes a t test function.
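For readers who want to see the manual calculation, here is a minimal Python sketch of the pooled two-sample t statistic. It assumes the usual pooled form, t = (x̄1 − x̄2) / √(s²(1/n1 + 1/n2)) with s² the pooled variance and n1 + n2 − 2 degrees of freedom; the function name and the two small data sets are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group1, group2):
    """Return (t value, degrees of freedom) for a pooled two-sample t test."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    # Standard error of the difference between the two group means.
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_value = (mean(group1) - mean(group2)) / std_err
    return t_value, n1 + n2 - 2


# Hypothetical measurements from two independent groups.
group1 = [1, 2, 3, 4, 5]
group2 = [3, 4, 5, 6, 7]

t_value, df = pooled_t_statistic(group1, group2)
print(f"t = {t_value:.3f}, degrees of freedom = {df}")
```

For these data the absolute t value is 2 with 8 degrees of freedom, which is below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected.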
THANK YOU

