Hypothesis Testing - I
Definition
• A hypothesis test is a statistical test used to determine
whether the sample data provide enough evidence
to draw a conclusion about the entire population.
• Two types of hypotheses:
1. Null Hypothesis (Ho): the hypothesis that any
observed variation in a sample is simply due to
random chance. In other words, “the
hypothesis that there is no significant difference
between the sample and the population, and any
observed difference is due to randomness or
experimental error.”
Rupak Roy
2. Alternative Hypothesis (Ha):
the hypothesis that is contrary to the
null hypothesis.
Example:
If I replace the battery in my car, will my car give
better mileage?
Null Hypothesis (Ho): there is no difference in mileage even if we
replace the battery of the car.
Alternative Hypothesis (Ha): there is a difference in mileage if we
replace the battery of the car.
Significance level, i.e. alpha (a)
The significance level is the threshold used for
rejecting the null hypothesis. If the p-value is less
than the significance level, e.g. 5% (0.05),
then we conclude that there is a difference
between the sample and the population. In other
words, we reject the null hypothesis.
The most common significance level is 0.05;
however, we can change it depending on our need.
Example
• If
P-value > Significance level (a), then we
fail to reject the null hypothesis
• Else, if
P-value < Significance level (a), then we
reject the null hypothesis
Rejecting the null hypothesis is also called a
statistically significant result.
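The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the example p-values are assumptions made here, not part of the slides.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: compare the p-value with the significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        # Small p-value: the observed data would be unlikely under the null hypothesis.
        return "reject H0 (statistically significant)"
    return "fail to reject H0"


# Hypothetical p-values, chosen only to show both outcomes.
print(decide(0.09))
print(decide(0.01))
```

Note that the function never returns "accept H0": a large p-value only means the sample did not provide enough evidence against the null hypothesis.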
Stages of Hypothesis
1. Select
Null Hypothesis (Ho): there is no difference in mileage if we
replace the battery of the car.
Alternative Hypothesis (Ha): there is a difference in mileage if we
replace the battery of the car.
2. Test Distribution: select an appropriate distribution like
norm.dist, binom.dist or t.dist, with
significance level: alpha (a) 5% i.e. 0.05
3. P-value (example: p = 1 - norm.dist(………) = 0.09)
4. Result: since 0.09 > 0.05, we fail to reject the null
hypothesis. We conclude that there is not enough evidence
of a difference in mileage when
we replace the battery of the car.
Example
A food production unit produces a particular product with an average
weight of 10 lbs. and a standard deviation of 0.35 lbs. A random
sample of 30 units showed an average weight of 12 lbs., i.e. an increase
of 2 lbs. So are there any issues in the production process?
Significance level (a) = 0.05
Null Hypothesis (H0): There are no issues in the production process;
what we found in the sample is due to random chance variation /
randomness.
Alternative Hypothesis (H1): There are some issues in the production
process that are leading to the increase in weight per unit.
Test Distribution: normal distribution
Example: continued
In Excel,
normal distribution = norm.dist(X, mean, standard deviation, cumulative)
where
X = 12, mean = 10, standard deviation = 0.35 and cumulative =
TRUE/FALSE
Therefore,
P-value = 1 - norm.dist(12, 10, 0.35, TRUE)
(we subtract from 1 because we need the probability of observing 12 lbs. or more)
= 5.5089E-09, i.e. less than 0.05
Note: this calculation uses the population standard deviation directly. Since
12 lbs. is the average of 30 units, the standard error 0.35/SQRT(30) would be the
stricter choice; it gives an even smaller P-value, so the conclusion is the same.
Since the P-value is smaller than the significance level (a), we reject the
null hypothesis (H0) in favor of the alternative hypothesis (H1).
In other words, we conclude that there are some issues in the
production process that lead to the increase in weight per unit of
production.
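For readers working outside Excel, the same calculation can be sketched in Python with the standard library's `statistics.NormalDist`. The variable names are assumptions made here; the numbers are the ones from the example, and the code mirrors the slide's formula rather than the standard-error variant.

```python
from statistics import NormalDist

population_mean = 10.0  # lbs, from the example
population_sd = 0.35    # lbs, from the example
observed = 12.0         # lbs, sample average from the example
alpha = 0.05            # significance level

# Mirrors Excel's 1 - NORM.DIST(12, 10, 0.35, TRUE):
# the probability of a value of 12 lbs. or more under the null hypothesis.
p_value = 1.0 - NormalDist(mu=population_mean, sigma=population_sd).cdf(observed)

reject_null = p_value < alpha
print(f"p-value = {p_value:.4e}, reject H0: {reject_null}")
```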
Terminology
Confidence level: (1 - significance level).
It refers to how confident you are about your
conclusion.
So, if the null hypothesis is rejected at a 5% level of
significance, the confidence level is 95% (1 - 0.05):
a test at this level wrongly rejects a true null
hypothesis at most 5% of the time.
Likewise, if the null hypothesis is rejected at a 1% level of
significance, the confidence level is 99% (1 - 0.01).
Central Limit Theorem (CLT)
• The central limit theorem says that, irrespective of
the underlying population distribution, when
you pick multiple random samples from a
population with a sample size of
30 or more, the distribution of the sample
averages will be approximately normal, even if the
underlying population is not normal.
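A small simulation can make the theorem concrete. The sketch below is an illustration under assumptions chosen here (an exponential population with mean 1, 2000 samples of size 30, a fixed random seed); none of these values come from the slides.

```python
import math
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30    # the "at least 30" rule of thumb
NUM_SAMPLES = 2000  # how many random samples we draw


def sample_average(n: int) -> float:
    # Exponential population with mean 1 and sd 1: strongly skewed, not normal.
    return mean(random.expovariate(1.0) for _ in range(n))


sample_means = [sample_average(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = mean(sample_means)
spread = stdev(sample_means)
# For a normal distribution, roughly 68% of values lie within one sd of the mean.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / NUM_SAMPLES

print(f"mean of sample averages: {center:.3f} (population mean is 1)")
print(f"sd of sample averages:   {spread:.3f} (theory: {1 / math.sqrt(SAMPLE_SIZE):.3f})")
print(f"share within one sd:     {within_one_sd:.3f}")
```

Even though the individual values are heavily skewed, the sample averages cluster symmetrically around the population mean with a spread close to sd / sqrt(n).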
Hypothesis testing when sample size is low
• Remember: the central limit theorem says that if the sample size is
sufficiently large, the distribution of sample averages will
be approximately normal irrespective of the underlying population
distribution. When the sample size is small, we use the t-distribution instead.
• So if the sample size is less than
30, we will use t.dist to calculate the P-value.
• The t-distribution is also a continuous probability distribution.
• [Diagram] As the sample size increases to 30, the t-distribution
approximates a normal distribution.
T-distance
In order to use the t-distribution we need the
t-distance, i.e.
the test statistic:
t = (sample mean - population mean) / (S / √N)
where
S = standard deviation and N = sample size
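As a minimal sketch, the t-distance can be written as a small Python function. The function name `t_statistic` and the numbers in the usage lines are hypothetical, picked so the arithmetic is easy to follow.

```python
import math


def t_statistic(sample_mean: float, population_mean: float,
                sample_sd: float, n: int) -> float:
    """t = (sample mean - population mean) / (S / sqrt(N))."""
    if n < 2:
        raise ValueError("need at least two observations")
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


# Hypothetical values: S / sqrt(N) = 4 / 4 = 1, so t = (52 - 50) / 1.
t = t_statistic(sample_mean=52.0, population_mean=50.0, sample_sd=4.0, n=16)
print(t)
```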
Steps for T-distribution
• Select
Null hypothesis (Ho):
Alternative hypothesis (H1):
• Significance level: 5%
• Test distribution: t-distribution (calculate P-value)
• Conclusion: reject the null hypothesis or fail to reject
the null hypothesis.
Example
• The seller of a manufacturing company claims that
an average fluorescent light lasts for 320 days. The
inspector randomly selects 10 fluorescent lights for
inspection. The sampled lights last an average of 280
days with a standard deviation of 95 days. What is
the likelihood that a randomly selected sample of 10
fluorescent lights would have an average life of no
more than 280 days?
Here, sample mean = 280
population mean = 320
sample std. deviation = 95
sample size = 10
• In Excel:
1) Calculate the t-distance
t = (280-320)/(95/SQRT(10))
Alternatively, (280-320)/(95/(10^0.5))
t = -1.331
2) Use the t-distance value in Excel with the following
formula
= t.dist(t-distance, degrees of freedom, TRUE)
= t.dist(-1.331, 9, TRUE) = 0.10788 = 11%
Therefore there is an 11% likelihood that the average life for 10 randomly selected bulbs is less
than 280 days.
ALTERNATIVELY,
= 1 - t.dist(t-distance, degrees of freedom, TRUE)
= 1 - t.dist(-1.331, 9, TRUE) = 1 - 0.1078 = 0.89 = 89%
Therefore there is an 89% likelihood that the average life for 10 randomly selected bulbs is
more than 280 days.
Note:
Df = degrees of freedom = N - 1 (here in the example N (sample size) = 10, so Df = 9)
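The fluorescent light calculation can also be reproduced in Python. The standard library has no t-distribution function, so the sketch below integrates the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_cdf` are assumptions made here and stand in for Excel's t.dist(..., TRUE). A statistics package would normally be used instead.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry around 0.

    steps must be even for Simpson's rule.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area


n = 10                                      # sample size from the example
t = (280 - 320) / (95 / math.sqrt(n))       # t-distance
p_below = t_cdf(t, df=n - 1)                # like t.dist(t, 9, TRUE)
p_above = 1 - p_below                       # like 1 - t.dist(t, 9, TRUE)

print(f"t = {t:.3f}, P(average < 280) = {p_below:.3f}, P(average > 280) = {p_above:.3f}")
```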
• Note:
Why do we sometimes use
1 - normal.distribution
1 - t.distribution?
If we look at any distribution function, e.g. the
normal distribution
= norm.dist(…., cumulative) where
cumulative is TRUE / FALSE
TRUE means the cumulative probability (<) and FALSE means the point
probability (for a continuous distribution, the probability density at that point).
And what if we want >? There is no function for it, so we
have to compute it manually as
1 - appropriate.distribution
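The three cases in this note can be sketched with Python's `statistics.NormalDist`. The standard normal distribution and the cutoff 1.96 are illustrative assumptions made here, not values from the slides.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)  # standard normal, chosen for illustration
x = 1.96                              # hypothetical cutoff

less_than = dist.cdf(x)         # like norm.dist(x, 0, 1, TRUE): P(X < x)
density = dist.pdf(x)           # like norm.dist(x, 0, 1, FALSE): density at x
greater_than = 1.0 - less_than  # no dedicated function: use the complement

print(f"P(X < {x}) = {less_than:.4f}")
print(f"density at {x} = {density:.4f}")
print(f"P(X > {x}) = {greater_than:.4f}")
```

Because the total probability is 1, the "less than" and "greater than" results always add up to 1.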
What if population Std. deviation is not available
• If the population standard deviation is not known,
the sample standard deviation can be substituted for the
population standard deviation.
• Therefore, standard error = S / √N, where S = sample standard deviation and N = sample size
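As a rough illustration of substituting the sample standard deviation, the sketch below estimates the standard error from raw data. The data values are made up for this example; `statistics.stdev` computes the sample standard deviation (dividing by N - 1).

```python
import math
from statistics import mean, stdev

# Hypothetical measurements, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
sample_mean = mean(data)
sample_sd = stdev(data)                  # sample standard deviation (N - 1 in the denominator)
standard_error = sample_sd / math.sqrt(n)

print(f"mean = {sample_mean}, S = {sample_sd:.4f}, standard error = {standard_error:.4f}")
```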
What if population distribution is not
normal?
• We are using the normal distribution to calculate the
p-value for hypothesis testing, but not every
hypothesis test
must use a normal distribution.
• If we already know the type of distribution,
then it's better to use the right
distribution directly for hypothesis testing.
• Remember the example from our previous
slide “Stages of Hypothesis”, where in point
number 2 we mentioned that we can
choose any appropriate type of distribution.
Recap:
“Stages of Hypothesis”
1. Select
Null Hypothesis (Ho): there is no difference in mileage if we
replace the battery in the car.
Alternative Hypothesis (Ha): there is a difference in mileage if we
replace the battery in the car.
2. Test Distribution: select an appropriate distribution like
norm.dist or binom.dist, with significance level: alpha (a)
5%
3. P-value (example: p = 1 - norm.dist(………) = 0.09)
4. Result: since 0.09 > 0.05, we fail to reject the null
hypothesis.
We conclude that there is not enough evidence of a difference in
mileage when we replace the battery of the car.
Next
Directional hypothesis tests,
like the one-tailed test, i.e. when you have strong reason to
believe in the direction of your hypothesis.
And more.
• To be continued.