2. COMPARISONS OF MAXIMUM LIKELIHOOD AND BIASED ESTIMATION METHODS USING
GENERALIZED LINEAR MODELS
M.Phil. Scholar: Syeda Salma Kazmi
Supervised by: Dr. Atif Abbasi
Department of Statistics
King Abdullah Campus Chatter Kalas
The University of Azad Jammu and Kashmir
03/04/2025 2
4. INTRODUCTION
Normal linear models are based on certain assumptions, such as that the mean of Y
(the explained variable) is a linear combination of the explanatory variables X.
Additionally, the distribution of Y is presumed to be normal with constant variance.
However, there are numerous experimental conditions in which the linearity and/or the
normality of the explained variable Y may not be applicable. For instance, the response
may be a discrete binary random variable with two possible outcomes, success or failure.
GLMs offer a method of unifying several different statistical models, such as
linear regression, binary regression, Poisson regression, gamma, and negative binomial
regressions. GLMs can handle both discrete and continuous responses, and the
standard assumptions of normality and homoscedasticity are not imposed on the
explained variable.
5. CONTINUE
"Generalized linear model" (GLM) is the name of the broad class of models that
McCullagh and Nelder (1983, 2nd edition 1989) proposed. In these models, the
response variable is assumed to follow an exponential family distribution whose
mean μi = E(yi) is a specified (often nonlinear) function of the covariates, which
is why some would refer to the models as "nonlinear." Although the mean may
depend nonlinearly on the covariates, McCullagh and Nelder regard the model as
linear because the covariates affect the distribution of yi only through the linear
combination ηi = xi′β.
6. COMPONENTS OF GLMs
Random component
The random component represents the conditional distribution of the dependent
variable Yi (for the i-th of n independently sampled observations), given the values
of the independent variables in the model. The distribution of Yi, according to
Nelder and Wedderburn's original formulation, is a member of an exponential
family, such as the binomial, normal, gamma, Poisson, or inverse-Gaussian
distributions. The probability density function of exponential family distributions
is commonly represented as
f(yi; θi, φ) = exp{ [yi θi − b(θi)] / a(φ) + c(yi, φ) }.
Systematic component
The linear predictor is constructed as a linear combination of the explanatory
variables,
ηi = β0 + β1 xi1 + β2 xi2 + … + βk xik,
which relates η to a set of independent variables X1, X2, …, Xk.
7. Link Function
A smooth and invertible linearizing link function g(·) transforms the expected
value of the response variable, µi = E(Yi), to the linear predictor (Fox, 2015):
g(µi) = ηi = xi′β,
where β is a q × 1 vector of unknown parameters and xi = (x1, x2, …, xk)′ is the
vector of explanatory variables. The association between the linear predictor and
the expected value of the random component is described by the link function as
µi = g⁻¹(ηi).
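As a sketch of how a link function operates in practice, the logit link of a binomial GLM can be coded in a few lines. This is illustrative code with made-up numbers, not part of the study; the design matrix and coefficients are invented for the example.

```python
import numpy as np

def logit(mu):
    """Link function g: mean in (0, 1) -> linear predictor on the real line."""
    return np.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link g^{-1}: linear predictor -> mean."""
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical design matrix (with intercept column) and coefficient vector.
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])
beta = np.array([0.3, 1.1])

eta = X @ beta        # systematic component: eta_i = x_i' beta
mu = inv_logit(eta)   # expected response mu_i = g^{-1}(eta_i)
```

Because g is invertible, applying `logit` to `mu` recovers `eta` exactly, which is what "linearizing" means here.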
9. OBJECTIVES
The present research is conducted to:
Estimate the parameters of GLMs by considering binomial and Poisson regression
models.
Compare the maximum likelihood estimator with biased estimators such as the
ridge and principal component regression estimators.
Explore the effects of multicollinearity on the performance of the estimators.
10. MATERIALS AND METHODS
Various methods used in GLMs for estimating the parameters are described.
Maximum Likelihood Estimation
Newton-Raphson method
Method of Fisher's scoring
BIASED ESTIMATION METHODS
Ridge regression
Generalized principal component regression
Mean squared error
03/04/2025 10
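The iterative ML schemes listed above can be sketched for a logistic model. The code below is a minimal, assumed implementation of Fisher scoring / iteratively reweighted least squares (for the logit link the two coincide); the variable names, tolerance, and synthetic data are our own choices, not the study's code.

```python
import numpy as np

def irls_logistic(X, y, tol=1e-6, max_iter=50):
    """Fisher scoring / IRLS for a logistic regression model."""
    beta = np.zeros(X.shape[1])               # initial estimate
    for _ in range(max_iter):
        eta = X @ beta                        # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))       # inverse logit link
        w = mu * (1.0 - mu)                   # working weights w_i = mu_i(1 - mu_i)
        z = eta + (y - mu) / w                # working response
        Xtw = X.T * w                         # X'W with W diagonal
        beta_new = np.linalg.solve(Xtw @ X, Xtw @ z)   # WLS step
        if np.linalg.norm(beta_new - beta) < tol:      # convergence criterion
            return beta_new
        beta = beta_new
    return beta

# Tiny synthetic demonstration: recover coefficients from simulated data.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
true_beta = np.array([0.5, -1.0])
p = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (rng.random(200) < p).astype(float)
beta_hat = irls_logistic(X, y)   # should land near true_beta
```

Each pass solves a weighted least squares problem, which is why the method is called iteratively reweighted least squares.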
11. Non-Linear Regression Models
We describe non-linear regression models such as the Poisson and binomial
regression models.
Binomial Regression
It is assumed that p(x) depends on the regressor values x solely through a linear
combination β′x, where β is an unknown parameter vector. The pdf of the binomial
distribution is
P(Y = y) = C(n, y) p^y (1 − p)^(n−y), y = 0, 1, …, n,
where n is the number of trials and p is the probability of success.
Poisson Regression
When the data are to be treated as Poisson counts, the rate parameter λ is assumed
to depend on the regressors through the linear predictor β′x via the link function.
The pdf of the Poisson distribution is
P(Y = y) = e^(−λ) λ^y / y!, y = 0, 1, 2, ….
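The two pmfs above can be evaluated directly; the following is a small illustrative check using only the standard formulas, with parameter values invented for the example.

```python
import math

def binom_pmf(y, n, p):
    """P(Y = y) = C(n, y) p^y (1 - p)^(n - y)."""
    return math.comb(n, y) * p**y * (1.0 - p)**(n - y)

def poisson_pmf(y, lam):
    """P(Y = y) = e^{-lam} lam^y / y!."""
    return math.exp(-lam) * lam**y / math.factorial(y)

# Each pmf sums to 1 over its support (the Poisson sum is truncated
# far into the tail, where the remaining mass is negligible).
total_binom = sum(binom_pmf(y, 10, 0.3) for y in range(11))
total_pois = sum(poisson_pmf(y, 4.0) for y in range(100))
```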
12. Method for detecting multicollinearity
Multicollinearity
The term multicollinearity was first introduced by Frisch (1934). The full rank
condition in a multiple linear regression model denotes the independence of the
regressors. If this assumption does not hold, multicollinearity becomes a problem
and the least squares estimation breaks down. Multicollinearity is a high degree of
correlation among several independent variables.
There are many criteria for the detection of multicollinearity in GLMs; two of
them are described in the following section.
Condition number
In mathematical analysis, a function's condition number (CN) indicates how much
a small change in the input argument can change the function's output. A high CN
value indicates that multicollinearity is an issue. The CN has the following
mathematical definition:
CN = λmax / λmin,
where λmax and λmin are the largest and smallest eigenvalues. If CN > 30, then
there exists a multicollinearity problem.
Variance inflation factor
One method for the detection of multicollinearity is to calculate the variance
inflation factor (VIF) for each explanatory variable,
VIFj = 1 / (1 − Rj²),
where Rj² is the coefficient of determination from regressing the j-th explanatory
variable on the remaining ones. The range of the VIF signifies the level of
multicollinearity: a VIF of 1 indicates no collinearity, which is considered
negligible; values between 1 and 5 indicate a moderate, medium level of
collinearity; and values greater than 5 indicate high collinearity.
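Both diagnostics can be computed with a few lines of linear algebra. The sketch below uses a deliberately collinear synthetic design; all data here are illustrative assumptions, and only the CN > 30 rule and the VIF bands quoted above come from the text.

```python
import numpy as np

# Synthetic design: x2 is nearly a copy of x1, so collinearity is built in.
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.05, size=100)
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

# Condition number: ratio of extreme eigenvalues of the standardized X'X.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
eig = np.linalg.eigvalsh(Xs.T @ Xs)
cn = eig.max() / eig.min()

def vif(X, j):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the other columns."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    resid = X[:, j] - A @ np.linalg.lstsq(A, X[:, j], rcond=None)[0]
    r2 = 1.0 - resid.var() / X[:, j].var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]
```

On this design the CN is far above 30 and the VIFs of the two near-duplicate columns far exceed 5, while the independent column stays near 1.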
13. Example: Apple juice data
4.1. Numerical Example 1: Binomial Regression (Apple Juice data)
• In this section we illustrate the use of the maximum likelihood (ML),
principal component regression (PCR) and ridge estimators on a real-life
data set. To fit a binomial regression model we considered the apple juice
data used by Pena et al. (2011) and Özkale (2016). There are four explanatory
variables in this data set: pH (x1), nisin concentration (IU/ml) (x2), incubation
temperature (°C) (x3), and soluble solids concentration (°Brix) (x4). The
response variable is the growth of Alicyclobacillus acidoterrestris in apple
juice, where 1 indicates growth and 0 indicates no growth. Prior to calculating
the results, we standardized the independent variables and subsequently
incorporated the intercept term into the model. The logistic regression model
is then
log(pi / (1 − pi)) = β0 + β1 xi1 + β2 xi2 + β3 xi3 + β4 xi4,
• where xij denotes the i-th observation of the j-th explanatory variable and pi
is the probability of growth. The eigenvalues of the weighted cross-product
matrix X′ŴX are obtained as λ1 = 4.2143, λ2 = 0.1774, λ3 = 0.1145, λ4 =
0.0718 and λ5 = 0.0303.
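The condition number implied by these eigenvalues can be checked directly; only the five eigenvalues themselves are taken from the example, and the code is a simple illustrative check.

```python
# Eigenvalues reported for the apple juice example.
eigenvalues = [4.2143, 0.1774, 0.1145, 0.0718, 0.0303]

# Condition number: ratio of largest to smallest eigenvalue.
cn = max(eigenvalues) / min(eigenvalues)
print(round(cn, 2))   # about 139.09, well above the CN > 30 threshold
```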
14. Continue
• The condition number is computed as CN = λmax/λmin = 4.2143/0.0303 ≈ 139.09. As the value
of the condition number is very large, it indicates that there is a multicollinearity issue within
this dataset. First, we obtain the ML estimator by an iterative procedure. For iterative ML
algorithms, we generally choose a small tolerance ε (sufficiently close to zero) as the
convergence criterion: the iteration ends if the norm of the difference in the parameter estimates
between iterations is less than ε, i.e. ||β̂(m+1) − β̂(m)|| < ε, where m represents the iteration step.
The ordinary least squares (OLS) estimator is considered as the initial estimate; the initial
working response variable is defined as ẑ = Xβ̂(0) + Ŵ⁻¹(y − μ̂) and the initial weight matrix is
computed as Ŵ = diag(π̂i(1 − π̂i)). The ridge regression parameter is computed following
Abbasi and Özkale (2021), which gives k1 = 8.0874; another value of k is also selected
randomly. For choosing the number of principal components (PCs) we used the percentage of
total variation (PTV) method, defined as
PTV(r) = (λ1 + λ2 + … + λr) / (λ1 + λ2 + … + λp+1),
• where r denotes the number of PCs retained in the model. In this example the number of PCs
retained in the model is r = 2.
• Table 4.1 shows the results of the iteratively obtained estimators along with their SMSE values.
It is seen that the ridge estimator attains the smallest SMSE, compared to the ML and PCR
estimators, for k1. PCR has the next smallest SMSE value, while the SMSE value of the ML
estimator is the largest, which shows that multicollinearity affects the performance of the ML
estimator. The results show that the performance of the ridge estimator is the best among the
estimators in countering the multicollinearity problem for k1. However, for k2 the PCR
estimator has a smaller SMSE value than the ridge estimator. Thus, the ridge estimator performs
better for large values of k, whereas for small k values PCR performs better than the ridge
estimator. Table 4.1 shows the results only for two values of k, while Figure 4.1 assesses the
performance of the estimators for the remaining values of k. From Figure 4.1 it is clear that
when the k values fall below approximately 0.14, the PCR estimator performs better than the
ridge estimator; when k is greater than approximately 0.14, the ridge estimator outperforms its
counterparts.
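A single ridge step and the PTV rule described above can be sketched as follows. The working response and weights are synthetic placeholders standing in for one IRLS iteration; k1 = 8.0874 is the only value taken from the text, and the 90% PTV cut-off is an assumed illustration.

```python
import numpy as np

# Synthetic stand-ins for one IRLS iteration (not the apple-juice data).
rng = np.random.default_rng(2)
n, p = 50, 5
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
W = np.diag(rng.uniform(0.1, 0.25, size=n))   # working weights, e.g. mu(1 - mu)
z = rng.normal(size=n)                        # working response from IRLS

# Ridge step: beta_ridge = (X'WX + kI)^{-1} X'Wz, with k from the text.
k = 8.0874
XtWX = X.T @ W @ X
beta_ridge = np.linalg.solve(XtWX + k * np.eye(p), X.T @ W @ z)

# PTV rule: retain the smallest r whose leading eigenvalues explain,
# say, 90% of the total variation (the 90% threshold is our assumption).
lam = np.sort(np.linalg.eigvalsh(XtWX))[::-1]
ptv = np.cumsum(lam) / lam.sum()
r = int(np.searchsorted(ptv, 0.90) + 1)
```

Adding kI to X′ŴX before solving is what stabilizes the system when the eigenvalues are nearly zero, at the cost of the bias the study measures through the SMSE.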
15. Table 4.1 Iteratively obtained estimators and their SMSE values for k1 and k2, binomial
response (Apple Juice data)

Coefficient        ML      Ridge (k1 = 8.0874)      PCR      Ridge (k2)
β0             -1.3159         -0.3187            38.5127     -1.2817
β1              8.9941          0.1184            -7.1565      8.7376
β2            -10.7939         -0.1598            -9.0721    -10.4668
β3              6.1903          0.0671            -3.1447      5.9583
β4             -5.8053         -0.0911            -6.2584     -5.6525
SMSE           61.4822          4.2419            10.0057     58.8241
16. Figure 4.1: SMSE values of the estimators for different k values
17. Table 4.2 Iteratively obtained estimators and their SMSE values for k1 and k2

Coefficient        ML      Ridge (k1 = 0.07776)     PCR      Ridge (k2 = 0.255)
β0              0.1262          0.1450            0.9250       0.1811
β1              1.5576          1.5226            0.1673       1.4448
β2              2.6709          2.5805            0.3033       2.4004
β3             -1.4157         -1.3522           -0.1210      -1.2281
β4              3.8847          3.0314           25.8305       2.5819
SMSE            0.1262          0.1450            0.9250       0.1811
18. Figure 4.2: SMSE values of the estimators for different k values
25. Summary and Conclusions
This study aims to give a practical approach to dealing with the problem of
multicollinearity, specifically in generalized linear models, where the response
variable may not be normally distributed. For the detection of this problem, two
methods were discussed in this study, the condition number (CN) and the variance
inflation factor (VIF), which suggest the level of multicollinearity. This problem
can be overcome by using biased estimation methods; the methods considered here
are the ridge and PCR estimators, which are compared with ML estimation.
The study includes two non-linear regression models for the estimation of
parameters, the Poisson and binomial regression models. For ML estimation the
iteratively reweighted least squares technique is used, with two iterative
procedures, Newton-Raphson and Fisher's scoring, used for estimation.
26. Continue
A Monte Carlo simulation experiment was also used in the study, for binomial and
Poisson responses with different sample sizes and different numbers of
independent variables. The performance evaluation criterion of this study is the
expected mean square error (EMSE).
The results show that the ridge estimator obtains the smallest SMSE compared to
the ML and PCR estimators, for the numerical examples as well as for the
simulation studies. It is concluded that the ridge estimator is the best among the
three for large values of k, while for smaller values of k PCR performs better.
27. REFERENCES
Abbasi, A., & Özkale, M. R. (2021). The r-k class estimator in generalized linear models
applicable with simulation and empirical study using a Poisson and Gamma
responses. Hacettepe Journal of Mathematics and Statistics, 50(2), 594-611.
Abdulkabir, M., Edem, U., Tunde, R., & Kemi, B. (2015). An empirical study of
generalized linear model for count data. Journal of Applied and Computational
Mathematics, 4, 253.
Agresti, A. (2015). Foundations of linear and generalized linear models. John Wiley
& Sons.
Akay, K. U., & Ertan, E. (2022). A new improved Liu-type estimator for Poisson
regression models. Hacettepe Journal of Mathematics and Statistics, 1-20.
Le Cessie, S., & Van Houwelingen, J. C. (1992). Ridge estimators in logistic
regression. Journal of the Royal Statistical Society Series C: Applied Statistics, 41(1),
191-201.
Ertan, E., & Akay, K. U. (2022). A new Liu-type estimator in binary logistic
regression models. Communications in Statistics-Theory and Methods, 51(13), 4370-
4394.
Fox, J. (2015). Applied regression analysis and generalized linear models.
Sage Publications.
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased
estimation for nonorthogonal problems. Technometrics, 12(1), 55-
67.
Hubert, M. H., & Wijekoon, P. (2006). Improvement of Liu estimator in
linear regression model. Statistical Papers, 47(3), 471-479.
Hussein, S. M., & Yousaf, H. M. (2015). A comparison among some
biased estimators in generalized linear regression model in presence
of multicollinearity. Al-Qadisiyah Journal for Administrative and
Economic Sciences, 17(2).
Kurtoglu, F., & Özkale, M. R. (2016). Liu estimation in generalized linear
models: application on gamma distributed response variable.
Statistical Papers, 57(4), 911-928.
Mackinnon, M. J., & Puterman, M. L. (1989). Collinearity in generalized linear models.
Communications in Statistics - Theory and Methods, 18(9), 3463-3472.
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). Chapman &
Hall/CRC Monographs on Statistics & Applied Probability.
McDonald, G. C., & Galarneau, D. I. (1975). A Monte Carlo evaluation of some ridge-
type estimators. Journal of the American Statistical Association, 70(350), 407-
416.
Nelder, J. A., & Wedderburn, R. W. (1972). Generalized linear models. Journal of the
Royal Statistical Society: Series A (General), 135(3), 370-384.
Sellers, K. F., & Shmueli, G. (2010). A flexible regression model for count data. The
Annals of Applied Statistics, 943-961.
Smith, E. P., & Marx, B. D. (1990). Ill-conditioned information matrices, generalized
linear models and estimation of the effects of acid rain. Environmetrics,
1(1), 57-71.
Weissfeld, L. A., & Sereika, S. M. (1991). A multicollinearity diagnostic for generalized
linear models. Communications in Statistics - Theory and Methods, 20, 1183-1198.