a111chen @2023
Using Google Sheets For Statistics
Objective
• Practice using Google Sheets functions, especially LET and ARRAYFORMULA
• For the LET function:
• It names the variables used in a formula; a variable can be a single value, an array, or a range
• Named variables make it easier to compare different statistical formulas
• To reuse a formula, copy it and change only the variable assignments
• For the ARRAYFORMULA function:
• In my opinion, mistakes in a formula are easier to spot with Google Sheets ARRAYFORMULA than in Microsoft Excel
Table of Contents
• Construct a Frequency Table with Normal Distribution
• Normality Testing
• Normal Approximation to the Binomial Distribution
• Sample Size
• Hypothesis Testing
• One Sample
• Two Samples
• Correlation between Two Samples
• K Independent Samples
Construct Frequency Table
• Lowest
• =MIN(Dataset)
• Highest
• =MAX(Dataset)
• Number of classes, TotalNo (smallest k with 2^k >= n)
• =ROUNDUP(LN(COUNT(Dataset))/LN(2),0)
• Class width
• =ROUNDUP((Highest-Lowest)/TotalNo,0)
• Frequency (Lower is inclusive, Upper is exclusive)
• =COUNTIFS(Dataset,">="&Lower,Dataset,"<"&Upper)
https://www.youtube.com/watch?v=YfVu7xGHgnA
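As a cross-check outside the spreadsheet, the frequency-table steps can be sketched in Python. This is only an illustration: the function name `frequency_table` and the `scores` data are hypothetical, and the class boundaries start at the minimum value, which is an assumption about how Lower and Upper are laid out.

```python
import math

def frequency_table(data):
    """Return (lower, upper, count) rows following the slide's formulas."""
    n = len(data)
    lowest, highest = min(data), max(data)
    total_no = math.ceil(math.log(n) / math.log(2))    # ROUNDUP(LN(COUNT(Dataset))/LN(2),0)
    width = math.ceil((highest - lowest) / total_no)   # ROUNDUP((Highest-Lowest)/TotalNo,0)
    rows = []
    for i in range(total_no):
        lower = lowest + i * width
        upper = lower + width
        # COUNTIFS(Dataset,">="&Lower,Dataset,"<"&Upper): lower inclusive, upper exclusive
        count = sum(1 for v in data if lower <= v < upper)
        rows.append((lower, upper, count))
    return rows

scores = [70, 72, 75, 78, 81, 84, 88, 90, 93, 99]  # hypothetical data
table = frequency_table(scores)
```

Because the upper bound is exclusive, a maximum that lands exactly on the last upper boundary would not be counted, so it is worth checking that the frequencies add up to n.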
Frequency Graph
[Figure: chart of Frequency by Class Boundaries (70 - 75 through 95 - 100), with markers for Min, Max, Mean, Median, Q1, and Q3]
https://www.youtube.com/watch?v=39lsUsJsc2c
Which Graphs Can Show Normality?
[Figure: Frequency in % with Normal Distribution; Count in % of 8.3%, 5.0%, 20.0%, 21.7%, 38.3%, 6.7% for Class Boundaries 70 - 75, 75 - 80, 80 - 85, 85 - 90, 90 - 95, 95 - 100; Skew: -0.75, Kurt: 0.06]
[Figure: QQ Plot; Data quantiles (Z-score) against Normal theoretical quantile (Z-score), both axes from -3 to 3]
https://www.youtube.com/watch?v=g5DTW2IQwxk
Construct Normal Distribution
• Normal Distribution
• =NORM.DIST(Midpoint,Mean,SD,FALSE)*ClassWidth
• Normal theoretical quantile (Z-score)
• =NORM.S.INV((RANK.AVG(x,Dataset,1)-0.5)/n)
• Data quantiles (Z-score)
• =STANDARDIZE(x,Mean,SD)
• Normality test
• Skew
• =SKEW(Dataset)
• Kurt
• =KURT(Dataset)
• Positive skewness: the tail extends toward more positive values.
• Negative skewness: the tail extends toward more negative values.
• Positive kurtosis indicates a relatively peaked distribution.
• Negative kurtosis indicates a relatively flat distribution.
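To make the skewness and kurtosis figures less of a black box, here is a small Python sketch of the sample formulas that spreadsheet SKEW and KURT functions are documented to use (KURT being excess kurtosis, so a normal distribution scores about 0). The function names and test data are hypothetical.

```python
import statistics

def sample_skew(data):
    """Sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

def sample_excess_kurtosis(data):
    """Sample excess kurtosis, corrected for sample size."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    fourth = sum(((x - mean) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

symmetric = [1, 2, 3, 4, 5]     # hypothetical, symmetric and flat
right_tailed = [1, 2, 3, 10]    # hypothetical, long right tail
```

A symmetric sample gives a skew of zero, a long right tail gives a positive skew, and a flat sample such as 1 to 5 gives a negative kurtosis.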
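Commonly Used Formulas
• Sample size, n
• =COUNT(Dataset)
• =let(f,F2:F6, SUM(f))
• Mean
• =AVERAGE(Dataset)
• =let(n,F7,x,$E$2:$E$6,f,F2:F6, SUMPRODUCT(x,f)/n)
• Standard deviation, SD
• =STDEV.S(Dataset)
• =let(n,F7, x,$E$2:$E$6,f,F2:F6, Mean,F8, sqrt(SUMPRODUCT(f,x^2)/n-Mean^2))
• Note: the second formula in each pair works from a frequency table (x = class values, f = frequencies). The grouped SD formula divides by n, so it gives the population SD, whereas STDEV.S divides by n-1.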
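The frequency-table versions of n, mean, and SD can be sketched in Python as follows. The name `grouped_stats` and the sample values are hypothetical; the SD divides by n, mirroring the LET formula rather than STDEV.S.

```python
import math

def grouped_stats(x, f):
    """Return (n, mean, sd) from class values x and frequencies f."""
    n = sum(f)                                           # SUM(f)
    mean = sum(xi * fi for xi, fi in zip(x, f)) / n      # SUMPRODUCT(x,f)/n
    # SUMPRODUCT(f,x^2)/n - Mean^2 is the population variance
    variance = sum(fi * xi ** 2 for xi, fi in zip(x, f)) / n - mean ** 2
    return n, mean, math.sqrt(variance)

n, mean, sd = grouped_stats([1, 2, 3], [1, 2, 1])  # hypothetical table
```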
Binomial Distribution
• Number of trials, n
• probability of success, p
• P(X=x)
• = BINOM.DIST(x,n,p,FALSE)
• Probability that X falls between two limits (from a table)
• =PROB(RangeX,RangeP(X=x),LowerLimit,UpperLimit)
• =SUMIFS(RangeP(X=x), RangeX,">="& LowerLimit, RangeX,"<="& UpperLimit)
PROB returns an error value if:
• any value in prob_range ≤ 0 or any value in prob_range > 1,
• the sum of the values in prob_range is not equal to 1, or
• x_range and prob_range contain a different number of data points.
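For readers who want to verify a probability table by hand, the following Python sketch computes P(X=x) and the probability between two inclusive limits, which is what the PROB and SUMIFS formulas return from the table. The function names are hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for n trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def prob_between(n, p, lower, upper):
    """P(lower <= X <= upper), both limits inclusive."""
    return sum(binom_pmf(x, n, p) for x in range(lower, upper + 1))

example = prob_between(4, 0.5, 1, 2)  # P(1 <= X <= 2) with n = 4, p = 0.5
```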
Normal Approximation to the Binomial Distribution
• Rough guideline:
• np >=10 and n(1-p) >=10
• Example
• n = 75, p = 0.6
• Mean = 45
• Std Dev. = 4.24
[Figure: Normal Approximation to the Binomial Distribution (n = 75, p = 0.6, Mean = 45, Std Dev. = 4.24); x from 0 to 75 on the horizontal axis, with series P(X=x) and Norm]
https://www.youtube.com/watch?v=CCqWkJ_pqNU&t=313s
https://www.youtube.com/watch?v=A2sd09qCvcg
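The example's mean and standard deviation follow from Mean = np and SD = sqrt(np(1-p)). A short Python sketch, using only the standard library, checks the rough guideline and compares the exact binomial probability with the normal density at x = 45; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 75, 0.6
mean = n * p                       # 45
sd = sqrt(n * p * (1 - p))         # sqrt(18), about 4.24

# Rough guideline from the slide: np >= 10 and n(1-p) >= 10
guideline_ok = n * p >= 10 and n * (1 - p) >= 10

approx = NormalDist(mean, sd)
exact_at_45 = comb(n, 45) * p ** 45 * (1 - p) ** (n - 45)   # P(X = 45)
normal_at_45 = approx.pdf(45)                                # Norm curve at 45
```

The two values at x = 45 agree closely, which is what the overlaid chart series show.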
Simulation with Random Number
• Random number between 0 and 1
• =RAND()
• Random number within a specific range
• =RAND()*([UpperLimit]-[LowerLimit])+[LowerLimit]
• =RANDBETWEEN([LowerLimit],[UpperLimit])
• Typically simulate about 1,000 data points
• Example
• =let(Price,$B$2,VariableCost,RANDBETWEEN($D$6*100,$E$6*100)/100,DemandQty,RANDBETWEEN($D$7*100,$E$7*100)/100,FixedCost,$B$3,(Price-VariableCost)*DemandQty-FixedCost)
https://www.indeed.com/career-advice/career-development/how-to-randomize-numbers-in-excel
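The profit example can be mirrored in Python to see the whole simulation in one place. This is a sketch under assumptions: the function name, the price, the cost and demand ranges, and the fixed cost are all hypothetical stand-ins for the cells referenced in the formula.

```python
import random

def simulate_profit(price, fixed_cost, cost_range, demand_range, trials=1000, seed=None):
    """Draw variable cost and demand at two-decimal resolution and return the profits."""
    rng = random.Random(seed)
    profits = []
    for _ in range(trials):
        # RANDBETWEEN(low*100, high*100)/100 gives a random value with two decimals
        variable_cost = rng.randint(round(cost_range[0] * 100), round(cost_range[1] * 100)) / 100
        demand_qty = rng.randint(round(demand_range[0] * 100), round(demand_range[1] * 100)) / 100
        profits.append((price - variable_cost) * demand_qty - fixed_cost)
    return profits

# Hypothetical inputs: price 10, fixed cost 300, variable cost 4 to 6, demand 100 to 200
profits = simulate_profit(10, 300, (4, 6), (100, 200), trials=1000, seed=1)
average_profit = sum(profits) / len(profits)
```

Passing a seed makes a run repeatable, which a sheet with RAND or RANDBETWEEN does not do on its own because it recalculates on every edit.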
Sample Size
https://www.checkmarket.com/blog/how-to-estimate-your-population-and-survey-sample-size/
Hypothesis Testing
Type I Error
• Critical value of the sample mean (1), upper
• =let(n,B4,mean,B2,SD,B3,Alpha,B7, mean+CONFIDENCE.NORM(Alpha,SD,n))
• Critical value of the sample mean (2), lower
• =let(n,B4,mean,B2,SD,B3,Alpha,B7, mean-CONFIDENCE.NORM(Alpha,SD,n))
Type II Error
• Power of the test
• =let(n,B4, mean,B12, SD,B3, cvalue,B13, SE,SD/SQRT(n),Zscore,(cvalue-mean)/SE,Beta,NORM.S.DIST(Zscore)-NORM.S.DIST(0)+0.5,1-Beta)
• Decision: reject if the power of the test < 0.8; otherwise accept
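The two calculations can be sketched in Python with the standard library. It assumes CONFIDENCE.NORM(alpha, SD, n) equals the two-sided z value times SD/sqrt(n), and that Beta is the cumulative standard normal probability at Zscore (the slide's expression reduces to that when NORM.S.DIST is cumulative, since the value at 0 is 0.5). All names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def critical_values(mean, sd, n, alpha):
    """Lower and upper critical values of the sample mean."""
    margin = NormalDist().inv_cdf(1 - alpha / 2) * sd / sqrt(n)  # CONFIDENCE.NORM
    return mean - margin, mean + margin

def power_of_test(true_mean, sd, n, cvalue):
    """Power against an alternative mean lying above the upper critical value."""
    se = sd / sqrt(n)
    z = (cvalue - true_mean) / se
    beta = NormalDist().cdf(z)   # probability of a Type II error
    return 1 - beta

lower, upper = critical_values(50, 10, 25, 0.05)   # hypothetical inputs
power = power_of_test(56, 10, 25, upper)
```

With these inputs the margin is about 3.92 and the power is a little above the 0.8 threshold used in the decision rule.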
T-test for One Sample (Parametric)
• T-test
• =let(n,B25,Meanp,B26,Means,B27,SDs,B28, SE,SDs/SQRT(n),(Means-Meanp)/SE)
• T-test critical
• =let(n,B25, alpha,B24,DF,n-1,abs(T.INV(alpha,DF)))
• Decision: reject if t > t-test critical; otherwise accept
Note:
• Parametric: the data are normally distributed
• Nonparametric: the data are not normally distributed
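The one-sample t statistic itself is a single line of arithmetic, sketched below in Python with hypothetical numbers. The critical value is left to the spreadsheet's T.INV, because the Python standard library has no inverse t distribution.

```python
from math import sqrt

def one_sample_t(sample_mean, population_mean, sample_sd, n):
    """t = (Means - Meanp) / (SDs / sqrt(n)), with n - 1 degrees of freedom."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - population_mean) / standard_error

t_value = one_sample_t(52, 50, 4, 16)  # hypothetical: SE = 1, so t = 2
```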
Observed Data and Expected Data: Which Test?
• Independent
• Chi-Square: Σ (Oi - Ei)^2/Ei
• =let(OA,B82:B84,OB,C82:C84,EA,E82:E84,EB,F82:F84,sum(ARRAYFORMULA((OA-EA)^2/EA),ARRAYFORMULA((OB-EB)^2/EB)))
• Chi-Square Critical
• =let(nCol,B74,nRow,B75,alpha,B77,DF,(nCol-1)*(nRow-1),CHISQ.INV.RT(alpha,DF))
• Decision: reject if Chi-Square > Chi-Square critical; otherwise accept
• P-Value
• =let(nCol,B74,nRow,B75,chisquare,B88,df,(nCol-1)*(nRow-1),CHISQ.DIST.RT(chisquare,df))
• =Let(Oi,B82:C84,Ei,E82:F84,CHISQ.TEST(Oi,Ei))
[Figure: Observed (Yes (O1), No (O2)) and Expected (Yes (E1), No (E2)) counts for Heavy smoker, Moderate, and Nonsmoker]
Eij = Sum of Col(i) * Sum of Row(j) / (Total sum of Col or Row)
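The expected counts and the chi-square statistic can be sketched together in Python. The function name and the observed tables are hypothetical; the p-value and critical value are left to CHISQ.DIST.RT and CHISQ.INV.RT in the sheet.

```python
def chi_square(observed):
    """Return (statistic, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total   # Eij = row sum * column sum / total
            statistic += (o - e) ** 2 / e               # (Oi - Ei)^2 / Ei
    df = (len(observed) - 1) * (len(observed[0]) - 1)   # (nRow-1)*(nCol-1)
    return statistic, df

stat, df = chi_square([[10, 20], [20, 10]])             # hypothetical counts
independent_stat, _ = chi_square([[10, 20], [20, 40]])  # proportional rows
```

When the rows are exactly proportional, observed and expected counts coincide and the statistic is zero.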
Chi-square Test
Related
• McNemar test
• =let(FF,B130,NFNF,C131,(abs(FF-NFNF)-1)^2/(FF+NFNF))
• Chi-square critical
• =let(nCol,B122,nRow,B123,alpha,B124,DF,(nCol-1)*(nRow-1),CHISQ.INV.RT(alpha,DF))
• Decision: reject if McNemar > Chi-square critical; otherwise accept
[Table: Before/After cross-tabulation of Favor and Not Favor; cell values 10, 60, 90, 40 (cell positions not recoverable from the extracted text)]
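The McNemar statistic with continuity correction depends only on the two cells where respondents changed their answer. In the Python sketch below, the counts 10 and 40 are assumed to be those two cells; the extracted table does not make the cell positions certain, so treat the inputs as illustrative.

```python
def mcnemar_statistic(changed_one_way, changed_other_way):
    """(|b - c| - 1)^2 / (b + c) for the two change cells of a before/after table."""
    b, c = changed_one_way, changed_other_way
    return (abs(b - c) - 1) ** 2 / (b + c)

statistic = mcnemar_statistic(10, 40)   # assumed change cells
chi_square_critical = 3.841             # tabulated value for alpha = 0.05, df = 1
reject = statistic > chi_square_critical
```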
Chi-Square Related Statistics
• Statistic, Phi
• =let(ChiSquare,B62,N,B60,sqrt(ChiSquare/N))
• Statistic, Cramer’s V
• =let(ChiSquare,B62,N,B60,K,B61,sqrt(ChiSquare/(N*(K-1))))
• Statistic, Contingency coefficient C
• =let(ChiSquare,B62,N,B60,sqrt(ChiSquare/(ChiSquare+N)))
• Probability of error, P(A)
• =let(n,D74,totalCol,B74:C74,PercentCol,B75:C75,1-SUMPRODUCT(totalCol,PercentCol)/n)
• Probability of error, P(B)
• =let(OA,B71:B73,OB,C71:C73,TotalRow,D71:D73,n,D74,1-sum(ARRAYFORMULA(OA^2/TotalRow+OB^2/TotalRow))/n)
• Goodman & Kruskal tau
• =Let(PA,B81,PB,B82,(PA-PB)/PA)
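The three chi-square based measures of association differ only in what the statistic is divided by, as the Python sketch below shows. The function name and inputs are hypothetical, and K is taken as given in the sheet (cell B61).

```python
from math import sqrt

def association_measures(chi_square, n, k):
    """Return (phi, Cramer's V, contingency coefficient C)."""
    phi = sqrt(chi_square / n)
    cramers_v = sqrt(chi_square / (n * (k - 1)))
    contingency_c = sqrt(chi_square / (chi_square + n))
    return phi, cramers_v, contingency_c

phi, cramers_v, contingency_c = association_measures(10, 100, 3)  # hypothetical inputs
```

When K = 2, Cramer’s V reduces to Phi.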
Correlation Graph: How to Test?
[Figure: correlation graph of Y (Price) against X (Temperature)]
T-Test Correlation Test
• Pearson correlation (Parametric)
• =let(n,F4,y,B3:B12,x,C3:C12,r,PEARSON(y,x), r/sqrt((1-r^2)/(n-2)))
• =let(n,F4,y,B3:B12,x,C3:C12,r, CORREL(y,x), r/sqrt((1-r^2)/(n-2)))
• Pearson r computed from sums (returns r itself, not the t statistic)
• =let(y,B3:B12,x,C3:C12,n,count(x),a,n*SUMPRODUCT(x,y)-(sum(x)*sum(y)),b,n*SUMPRODUCT(x^2)-sum(x)^2,c,n*SUMPRODUCT(y^2)-sum(y)^2,a/SQRT(b*c))
• Spearman’s Rho (Non Parametric)
• =let(x,E114:E123,y,F114:F123,n,count(x),RankX,MAP(x,LAMBDA(r,RANK.AVG(r,x))),RankY,MAP(y,LAMBDA(r,RANK.AVG(r,y))),rs,1-6*SUMXMY2(RankX,RankY)/(n^3-n),rs*sqrt((n-2)/(1-rs^2)))
• T-Test Critical
• =let(n,F4, alpha,F7,df,n-2,abs(T.INV(alpha,df)))
• Decision: reject if the Pearson t statistic > T-Test critical; otherwise accept
• Decision: reject if the Spearman’s Rho t statistic > T-Test critical; otherwise accept
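Both correlation tests convert a coefficient into a t statistic with n - 2 degrees of freedom. The Python sketch below follows the same steps; the function names and data are hypothetical, and the Spearman shortcut formula is exact only when there are no tied ranks.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from sums, as in the SUMPRODUCT version of the formula."""
    n = len(x)
    a = n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)
    b = n * sum(xi ** 2 for xi in x) - sum(x) ** 2
    c = n * sum(yi ** 2 for yi in y) - sum(y) ** 2
    return a / sqrt(b * c)

def average_ranks(values):
    """Ranks with ties averaged, comparable to RANK.AVG."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2 for x in values]

def spearman_rho(x, y):
    """1 - 6 * sum(d^2) / (n^3 - n), where d is the rank difference."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * d2 / (n ** 3 - n)

def t_from_correlation(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2))."""
    return r * sqrt((n - 2) / (1 - r ** 2))

x = [1, 2, 3, 4, 5]                      # hypothetical data
r = pearson_r(x, [2, 4, 5, 4, 5])
t_pearson = t_from_correlation(r, len(x))
rho = spearman_rho(x, [1, 3, 2, 5, 4])
```

The ranks here ascend while RANK.AVG descends by default, but the squared rank differences are the same either way.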
F-Test Correlation Test
• F-Ratio (Parametric)
• =let(n,B41,k,B42,x,C23:C32,y,B23:B32,Intercept,INTERCEPT(y,x),Slope,SLOPE(y,x),RegMeanSq,devsq(ARRAYFORMULA(x*Slope+Intercept))/(k-1),ResidualMeanSq,SUMXMY2(ARRAYFORMULA(x*Slope+Intercept),y)/(n-k),RegMeanSq/ResidualMeanSq)
• F-test Critical
• =let(n,B41,k,B42,dfA,k-1,dfB,n-k,F.INV.RT(0.05,dfA,dfB))
• Decision: reject if F-Ratio > F-Test critical; otherwise accept
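The F-ratio compares the variation explained by the fitted line with the variation left in the residuals. A Python sketch of the same steps follows; the function name and data are hypothetical, and k = 2 is assumed for a simple regression (intercept plus one slope).

```python
def regression_f_ratio(x, y, k=2):
    """F = (regression SS / (k - 1)) / (residual SS / (n - k)) for a fitted line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    intercept = mean_y - slope * mean_x
    predicted = [xi * slope + intercept for xi in x]
    mean_predicted = sum(predicted) / n
    regression_ss = sum((p - mean_predicted) ** 2 for p in predicted)  # DEVSQ of predictions
    residual_ss = sum((p - yi) ** 2 for p, yi in zip(predicted, y))    # SUMXMY2(predictions, y)
    return (regression_ss / (k - 1)) / (residual_ss / (n - k))

f_ratio = regression_f_ratio([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])  # hypothetical data
```

For a simple regression this F-ratio equals the square of the Pearson t statistic on the same data, so the two tests agree.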
T-Test Two Sample Test, Parametric
• Related
• T-test
• =let(YA,B103:B112,YB,C103:C112,n,B97,df,n-1,sd,sqrt((SUMXMY2(YA,YB)-sum(ARRAYFORMULA(YA-YB))^2/n)/df),(sum(ARRAYFORMULA(YA-YB))/n)/(sd/sqrt(n)))
• T-test critical
• =let(n,B97, alpha,B98,df,n-1, abs(T.INV.2T(alpha,df)))
• Decision: reject if T-Test > T-Test critical; otherwise accept
• P-Value
• =let(n,B97,df,n-1,Ttest,B115,T.DIST.2T(abs(Ttest),df))
• =let(YA,B103:B112,YB,C103:C112,T.TEST(YA,YB,2,1))
T.TEST arguments:
tails - Specifies the number of distribution tails.
• If 1: uses a one-tailed distribution.
• If 2: uses a two-tailed distribution.
type - Specifies the type of t-Test.
• If 1: a paired test is performed.
• If 2: a two-sample equal variance (homoscedastic) test is performed.
• If 3: a two-sample unequal variance (heteroscedastic) test is performed.
One-tailed and two-tailed critical values (for alpha < 0.5; T.INV alone returns the negative left-tail value):
ABS(T.INV(alpha,df)) = T.INV.2T(2*alpha,df)
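The paired t statistic works on the differences between the two related samples. The Python sketch below follows the same computational form as the LET formula; the function name and data are hypothetical.

```python
from math import sqrt

def paired_t(ya, yb):
    """Return (t, df) for two related samples of equal length."""
    d = [a - b for a, b in zip(ya, yb)]
    n = len(d)
    df = n - 1
    sum_d = sum(d)                       # sum(ARRAYFORMULA(YA-YB))
    sum_d2 = sum(di ** 2 for di in d)    # SUMXMY2(YA,YB)
    sd = sqrt((sum_d2 - sum_d ** 2 / n) / df)
    return (sum_d / n) / (sd / sqrt(n)), df

t_value, df = paired_t([5, 7, 8, 6], [3, 6, 5, 4])  # hypothetical before/after scores
```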
K-Independent Sample
[Figure: chart of the example design; k (number of groups) = 1, 2, 3; b (number of replicate) = 1, 2; n (Total number of group interaction)]
Analysis of variance (One Way)
• F Value
• =let(K,B140,GA,H145:H164,GB,I145:I164,GC,J145:J164,N,count(GA,GB,GC),nG,count(GA),sum2n,sum(GA,GB,GC)^2/N,sumx2n,sum(sum(GA)^2,sum(GB)^2,sum(GC)^2)/nG,sumSQn,sum(SUMSQ(GA),SUMSQ(GB),SUMSQ(GC)),SSBetween,(sumx2n-sum2n)/(k-1),SSWithin,(sumSQn-sumx2n)/(n-k),SSBetween/SSWithin)
• Note: SSBetween and SSWithin are already divided by their degrees of freedom, so they are mean squares; the formula assumes equal group sizes (nG).
• F-test Critical value
• =let(n,B139,k,B140,dfA,k-1,dfB,n-k,F.INV.RT(0.05,dfA,dfB))
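The one-way F value can be sketched in Python with the same computational terms (sum2n, sumx2n, sumSQn). The function name and groups are hypothetical; unlike the sheet formula, each squared group sum is divided by its own group size, which gives the same result when the groups are equal in size.

```python
def one_way_anova_f(groups):
    """Return the F value for a list of groups (each a list of numbers)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    sum2n = sum(sum(g) for g in groups) ** 2 / n_total    # (grand sum)^2 / N
    sumx2n = sum(sum(g) ** 2 / len(g) for g in groups)    # sum of (group sum)^2 / group size
    sum_sq = sum(v ** 2 for g in groups for v in g)       # sum of squared observations
    mean_sq_between = (sumx2n - sum2n) / (k - 1)
    mean_sq_within = (sum_sq - sumx2n) / (n_total - k)
    return mean_sq_between / mean_sq_within

f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])  # hypothetical groups
```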
ANOVA: Two-Factor With Replication
• [A]
• =let(GA,H145:H164,GB,I145:I164,GC,J145:J164,n,count(GA),sum(sum(GA)^2,sum(GB)^2,sum(GC)^2)/n)
• [BA]
• =let(A,H145:J154,B,H155:J164,n,count(A),sum(sum(A)^2,sum(B)^2)/n)
• [AB]
• =let(GAA,H145:H154,GBA,I145:I154,GCA,J145:J154,GAB,H155:H164,GBB,I155:I164,GCB,J155:J164,n,count(GAA),sum(sum(GAA)^2,sum(GBA)^2,sum(GCA)^2,sum(GAB)^2,sum(GBB)^2,sum(GCB)^2)/n)
• [Y]
• =let(A,H145:J154,B,H155:J164,SUMSQ(A,B))
• [T]
• =let(A,H145:J154,B,H155:J164,SUM(A,B)^2/count(A,B))
ANOVA: Two-Factor With Replication (continued)
• Within Group (S/AB)
• =Let(Y,E178,AB,E177,k,B168,b,B169,n,B170,(Y-AB)/(k*b*(n-1)))
• F Value - Between Group A
• =Let(Y,E178,AB,E177,k,B168,b,B169,n,B170,A,E175,T,E179,Within,D187,(A-T)/(k-1)/Within)
• F Value - Between Group B
• =Let(Y,E178,AB,E177,k,B168,b,B169,n,B170,BA,E176,T,E179,Within,D187,(BA-T)/(b-1)/Within)
• F Value - Interaction (AxB)
• =Let(Y,E178,AB,E177,k,B168,b,B169,n,B170,BA,E176,A,E175,T,E179,Within,D187,(AB-A-BA+T)/(k-1)/(b-1)/Within)
• F-test Critical value
• =let(k,$B$168,b,$B$169,n,$B$170,dfWithin,(k*b*(n-1)),df,C184,F.INV.RT(0.05,df,dfWithin))
• Decision: reject if F Value > F-Test critical; otherwise accept
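The bracket terms [A], [BA], [AB], [Y], and [T] and the three F values can be followed end to end in a Python sketch. It assumes a balanced design in which n is the number of replicates per cell; the function name, the nested-list layout, and the data are hypothetical.

```python
def two_factor_anova(cells):
    """cells[i][j] holds the replicates for level i of factor A and level j of factor B.

    Returns (F for A, F for B, F for the AxB interaction).
    """
    k = len(cells)           # levels of factor A
    b = len(cells[0])        # levels of factor B
    n = len(cells[0][0])     # replicates per cell (balanced design assumed)
    values = [v for row in cells for cell in row for v in cell]
    t_term = sum(values) ** 2 / len(values)                                      # [T]
    y_term = sum(v ** 2 for v in values)                                         # [Y]
    a_term = sum(sum(sum(cell) for cell in row) ** 2 for row in cells) / (b * n)  # [A]
    b_term = sum(
        sum(sum(cells[i][j]) for i in range(k)) ** 2 for j in range(b)
    ) / (k * n)                                                                  # [BA]
    ab_term = sum(sum(cell) ** 2 for row in cells for cell in row) / n           # [AB]
    within = (y_term - ab_term) / (k * b * (n - 1))                              # Within Group (S/AB)
    f_a = (a_term - t_term) / (k - 1) / within
    f_b = (b_term - t_term) / (b - 1) / within
    f_ab = (ab_term - a_term - b_term + t_term) / ((k - 1) * (b - 1)) / within
    return f_a, f_b, f_ab

# Hypothetical 2 x 2 design with 2 replicates per cell
data = [[[1, 2], [2, 3]], [[3, 4], [6, 7]]]
f_a, f_b, f_ab = two_factor_anova(data)
```

Each F value is then compared with F.INV.RT using its own numerator degrees of freedom and the within-group degrees of freedom k*b*(n-1).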
Conclusion
• This deck showed some examples of the LET and ARRAYFORMULA functions in statistics
• For a deeper understanding of statistics for business use, refer to:
• Main Reference: Business Research Methods, Pamela Schindler, 14th Edition.
• Beyond the examples above, these techniques can also be:
• Applied to business use, for example marketing research and organizational behavior research
• Combined with the LAMBDA function, as in the Spearman’s Rho example
• Used to create new formulas
