Logistic Ordinal Regression

Wendy C Wong

Michal K and Nidhi M

Table of Contents
• Ordinal Regression

• Building Linear Models for Ordinal Regression

• Linear models used

• Model parameter updates

• Model predictions

• H2O implementations

• Example and results

What is Ordinal Regression?
• Ordinal regression (also called ordinal classification or ranking learning) is a
regression analysis used to predict an ordinal variable, i.e. a
variable where the relative ordering between different
values is significant.

• Ordinal regression is used most often in the social sciences
to model human levels of preference or satisfaction (e.g. levels
1-5 for very poor, poor, average, good, excellent).

Linear Models used for Ordinal Regression
• Let $x_i$ be our predictor of size $p$ and $y_i$ be the associated
ordinal response. Note: $y_i$ takes values from 1 to $K$.

• A GLM is used to fit ONE coefficient vector $\beta$ for all classes of
the ordinal response variable, plus a set of thresholds $\theta_j$, to a data
set.

• Model the CUMULATIVE PROBABILITY as the logistic function:
$$P(y \le j \mid x_i) = \sigma(\beta^T x_i + \theta_j) = \frac{1}{1 + \exp(-\beta^T x_i - \theta_j)} = \gamma_{ij}$$

• Note that the separating hyperplanes are parallel for all
classes. The increasing vector of thresholds $\theta_1 < \theta_2 < \dots < \theta_{K-1}$ is
used to separate all the classes.

• Other link functions: Ordered Probit (standard normal distribution) and Proportional
Hazards, whose cumulative probability is
$$1 - \exp(-\exp(\beta^T x_i + \theta_j))$$
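To make the cumulative-logit model concrete, the sketch below computes the cumulative probabilities $\gamma_{ij}$ and then the per-class probabilities by differencing them. It is only an illustration under stated assumptions: the function names (`sigmoid`, `cumulative_probs`, `class_probs`) and the numeric values of `beta`, `thetas` and `x` are hypothetical and do not come from the slides or from H2O.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def cumulative_probs(beta, thetas, x):
    """Return [P(y <= 1 | x), ..., P(y <= K-1 | x)] = sigma(beta^T x + theta_j)."""
    eta = sum(b * v for b, v in zip(beta, x))
    return [sigmoid(eta + t) for t in thetas]

def class_probs(beta, thetas, x):
    """Return [P(y = 1 | x), ..., P(y = K | x)] by differencing cumulative probabilities."""
    gammas = cumulative_probs(beta, thetas, x) + [1.0]  # P(y <= K) = 1
    probs, prev = [], 0.0
    for g in gammas:
        probs.append(g - prev)
        prev = g
    return probs

# Hypothetical example values: p = 2 predictors, K = 4 classes.
beta = [0.8, -0.5]
thetas = [-1.0, 0.0, 1.5]   # increasing thresholds theta_1 < theta_2 < theta_3
x = [0.2, 0.4]
probs = class_probs(beta, thetas, x)
```

Because the thresholds are increasing, the cumulative probabilities are increasing too, so every class probability is positive and they sum to one.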

Model Parameters Updates
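• The likelihood function:
$$\prod_{i=0}^{N-1} \mathrm{pdf}(y_i = yresp_i)$$

• The log-likelihood function is
$$\sum_{i=0}^{N-1} \log\left(\sigma(\beta^T x_i + \theta_{yresp_i}) - \sigma(\beta^T x_i + \theta_{yresp_i - 1})\right)$$

• The pdfs are:

• for 1 < j < K: $\mathrm{pdf}(y_i = j) = \sigma(\beta^T x_i + \theta_j) - \sigma(\beta^T x_i + \theta_{j-1})$

• for j = 1: $\mathrm{pdf}(y_i = 1) = \sigma(\beta^T x_i + \theta_1)$

• for j = K: $\mathrm{pdf}(y_i = K) = 1 - P(y_i \le K-1 \mid x_i) = 1 - \sigma(\beta^T x_i + \theta_{K-1})$

• To find the model parameters, maximize the log-likelihood
function minus your favorite regularization penalties: take
the derivatives and update each model parameter by adding the
learning rate times the derivative with respect to that parameter.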
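The update described above can be sketched as plain gradient ascent on the log-likelihood. This is a minimal illustration, not H2O's implementation: it omits the regularization penalties, does not enforce the ordering of the thresholds, and the names `log_likelihood_and_grad` and `gradient_ascent_step` as well as the toy data are assumptions made for the example. Responses are taken to be integers in 1..K.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood_and_grad(beta, thetas, X, y):
    """Log-likelihood of the cumulative-logit model and its gradient.

    y holds responses in 1..K; thetas holds the K-1 thresholds.
    """
    K = len(thetas) + 1
    ll = 0.0
    g_beta = [0.0] * len(beta)
    g_theta = [0.0] * len(thetas)
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        # gamma[j] = P(y <= j | xi), with gamma[0] = 0 and gamma[K] = 1
        gamma = [0.0] + [sigmoid(eta + t) for t in thetas] + [1.0]
        p = gamma[yi] - gamma[yi - 1]                 # pdf(y_i = yresp_i)
        ll += math.log(p)
        d_hi = gamma[yi] * (1.0 - gamma[yi])          # sigma' at the upper threshold
        d_lo = gamma[yi - 1] * (1.0 - gamma[yi - 1])  # sigma' at the lower threshold
        for k, v in enumerate(xi):
            g_beta[k] += (d_hi - d_lo) * v / p
        if yi <= K - 1:
            g_theta[yi - 1] += d_hi / p               # derivative w.r.t. theta_{yi}
        if yi >= 2:
            g_theta[yi - 2] -= d_lo / p               # derivative w.r.t. theta_{yi-1}
    return ll, g_beta, g_theta

def gradient_ascent_step(beta, thetas, X, y, lr=0.01):
    """One update: parameter += learning rate * derivative."""
    _, g_beta, g_theta = log_likelihood_and_grad(beta, thetas, X, y)
    new_beta = [b + lr * g for b, g in zip(beta, g_beta)]
    new_thetas = [t + lr * g for t, g in zip(thetas, g_theta)]
    return new_beta, new_thetas

# Hypothetical toy data: one predictor, K = 3 classes.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1, 2, 2, 3]
beta, thetas = [0.0], [-0.5, 0.5]
ll_before, _, _ = log_likelihood_and_grad(beta, thetas, X, y)
beta, thetas = gradient_ascent_step(beta, thetas, X, y)
ll_after, _, _ = log_likelihood_and_grad(beta, thetas, X, y)
```

With a small enough learning rate, `ll_after` should exceed `ll_before`; in practice the learning rate would be tuned, as the slides suggest for H2O.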

Model Predictions
• The log proportional odds is:
$$\log\left(\frac{\gamma_{ij}}{1 - \gamma_{ij}}\right) = \log\left(\frac{\frac{1}{1 + \exp(-\beta^T x_i - \theta_j)}}{1 - \frac{1}{1 + \exp(-\beta^T x_i - \theta_j)}}\right) = \beta^T x_i + \theta_j$$

• When the proportional odds > 1 (log(.) > 0), it implies that
it is more probable that the data point $x_i$ belongs to class
j or lower than to classes j+1 and beyond.

• This implies that a data point $x_i$ is classified as:

• class K: $\beta^T x_i + \theta_{K-1} \le 0$

• class j (>=1 and <= K-1): $\beta^T x_i + \theta_j > 0$ and
$\beta^T x_i + \theta_{j-1} \le 0$ (the second condition is dropped for j = 1)
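Since the thresholds are increasing and $\beta^T x_i + \theta_j$ is the log odds of "class j or lower", the classification rule can be read as: pick the lowest class j whose log odds is positive, and class K when none is. A minimal sketch of that reading follows; the function name `predict_class` and the example values are hypothetical.

```python
def predict_class(beta, thetas, x):
    """Return the predicted class in 1..K.

    The prediction is the lowest j with beta^T x + theta_j > 0,
    or K if no threshold gives a positive log odds.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    for j, theta in enumerate(thetas, start=1):
        if eta + theta > 0:      # "class j or lower" is more probable than not
            return j
    return len(thetas) + 1       # no threshold crossed: class K

# Hypothetical example: one predictor, K = 4 classes.
beta = [1.0]
thetas = [-1.0, 0.0, 1.0]
predictions = [predict_class(beta, thetas, [v]) for v in (2.0, 0.5, -0.5, -2.0)]
```

With this sign convention, larger values of $\beta^T x_i$ push the prediction toward the lower classes.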

Alternate Model Parameters Optimization
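• I decided to modify the model parameters to directly
increase the probability of correct predictions.

• Hence, I will minimize the error function
$$\sum_{i=0}^{N-1} L(\beta, \theta, x_i, yresp_i)$$
where, for a threshold $j$:

• for correct prediction, $L(\beta, \theta, x_i, yresp_i) = 0$. This is the case when
$\beta^T x_i + \theta_j \le 0$ for $j < yresp_i$, or
$\beta^T x_i + \theta_j > 0$ for $j \ge yresp_i$

• for incorrect prediction, $L(\beta, \theta, x_i, yresp_i) = (\beta^T x_i + \theta_j)^2$. This is the case when
$\beta^T x_i + \theta_j > 0$ for $j < yresp_i$, or
$\beta^T x_i + \theta_j \le 0$ for $j \ge yresp_i$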
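One way to read this error function is sketched below: for each data point, every threshold that sits on the wrong side of the true response contributes the square of $\beta^T x_i + \theta_j$, and thresholds on the correct side contribute zero. Summing over the thresholds j is an assumption of this sketch (the slides do not spell it out), and the names `sqerr_loss` and `total_error` are hypothetical.

```python
def sqerr_loss(beta, thetas, x, yresp):
    """Squared-error loss for one data point with true response yresp in 1..K.

    Assumption: contributions are summed over all thresholds j = 1..K-1.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    loss = 0.0
    for j, theta in enumerate(thetas, start=1):
        s = eta + theta
        # Thresholds below the true class should give s <= 0,
        # thresholds at or above it should give s > 0.
        wrong = (s > 0) if j < yresp else (s <= 0)
        if wrong:
            loss += s * s
    return loss

def total_error(beta, thetas, X, y):
    """Error function: sum of the per-point losses over the data set."""
    return sum(sqerr_loss(beta, thetas, xi, yi) for xi, yi in zip(X, y))

# Hypothetical example: one predictor, K = 4 classes.
beta = [1.0]
thetas = [-1.0, 0.0, 1.0]
loss_correct = sqerr_loss(beta, thetas, [0.5], 2)   # all thresholds on the right side
loss_wrong = sqerr_loss(beta, thetas, [0.5], 4)     # thresholds 2 and 3 violated
```

A loss of zero for a data point means the classification rule already assigns it to its true class.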

H2O Implementation
• To use ordinal regression, set family="ordinal".

• To update model parameters using the likelihood function, do not set solver or
set solver to "GRADIENT_DESCENT_LH".

• To update model parameters using the other (squared-error) loss function, set solver to
"GRADIENT_DESCENT_SQERR".

• Gradient descent is a first-order method; use grid search to find a good learning rate
and regularization values (lambda, alpha).

• In R:
ordinal.fit <- h2o.glm(y=Y, x=X, training_frame=Dtrain,
                       family="ordinal",
                       solver="GRADIENT_DESCENT_SQERR")

• In Python:
from h2o.estimators.glm import H2OGeneralizedLinearEstimator
ordinal_fit = H2OGeneralizedLinearEstimator(family="ordinal",
                                            solver="GRADIENT_DESCENT_LH")
ordinal_fit.train(y=Y, x=X, training_frame=Dtrain)

Summary/Results
Table 1
Dataset                  LH performance   SQERR performance   R ordinal
5 columns with enum      0.9959           0.99751
5 numerical columns      0.99968          0.999445
20 columns with enums    0.998            0.999155
Multinomial dataset      0.47372          0.45527
nidhi dataset            0.5675           0.58                0.5775
