Applied Logistic Regression
Third Edition
DAVID W. HOSMER, JR.
Professor of Biostatistics (Emeritus)
Division of Biostatistics and Epidemiology
Department of Public Health
School of Public Health and Health Sciences
University of Massachusetts
Amherst, Massachusetts
STANLEY LEMESHOW
Dean, College of Public Health
Professor of Biostatistics
College of Public Health
The Ohio State University
Columbus, Ohio
RODNEY X. STURDIVANT
Colonel, U.S. Army
Academy and Associate Professor
Department of Mathematical Sciences
United States Military Academy
West Point, New York
To our wives, Trina, Elaine, and Mandy,
and our sons, daughters,
and grandchildren
Contents
Preface to the Third Edition xiii
1 Introduction to the Logistic Regression Model 1
1.1 Introduction, 1
1.2 Fitting the Logistic Regression Model, 8
1.3 Testing for the Significance of the Coefficients, 10
1.4 Confidence Interval Estimation, 15
1.5 Other Estimation Methods, 20
1.6 Data Sets Used in Examples and Exercises, 22
1.6.1 The ICU Study, 22
1.6.2 The Low Birth Weight Study, 24
1.6.3 The Global Longitudinal Study of Osteoporosis
in Women, 24
1.6.4 The Adolescent Placement Study, 26
1.6.5 The Burn Injury Study, 27
1.6.6 The Myopia Study, 29
1.6.7 The NHANES Study, 31
1.6.8 The Polypharmacy Study, 31
Exercises, 32
2 The Multiple Logistic Regression Model 35
2.1 Introduction, 35
2.2 The Multiple Logistic Regression Model, 35
2.3 Fitting the Multiple Logistic Regression Model, 37
2.4 Testing for the Significance of the Model, 39
2.5 Confidence Interval Estimation, 42
2.6 Other Estimation Methods, 45
Exercises, 46
3 Interpretation of the Fitted Logistic Regression Model 49
3.1 Introduction, 49
3.2 Dichotomous Independent Variable, 50
3.3 Polychotomous Independent Variable, 56
3.4 Continuous Independent Variable, 62
3.5 Multivariable Models, 64
3.6 Presentation and Interpretation of the Fitted Values, 77
3.7 A Comparison of Logistic Regression and Stratified Analysis
for 2 × 2 Tables, 82
Exercises, 87
4 Model-Building Strategies and Methods for Logistic Regression 89
4.1 Introduction, 89
4.2 Purposeful Selection of Covariates, 89
4.2.1 Methods to Examine the Scale of a Continuous
Covariate in the Logit, 94
4.2.2 Examples of Purposeful Selection, 107
4.3 Other Methods for Selecting Covariates, 124
4.3.1 Stepwise Selection of Covariates, 125
4.3.2 Best Subsets Logistic Regression, 133
4.3.3 Selecting Covariates and Checking their Scale
Using Multivariable Fractional Polynomials, 139
4.4 Numerical Problems, 145
Exercises, 150
5 Assessing the Fit of the Model 153
5.1 Introduction, 153
5.2 Summary Measures of Goodness of Fit, 154
5.2.1 Pearson Chi-Square Statistic, Deviance,
and Sum-of-Squares, 155
5.2.2 The Hosmer–Lemeshow Tests, 157
5.2.3 Classification Tables, 169
5.2.4 Area Under the Receiver Operating Characteristic
Curve, 173
5.2.5 Other Summary Measures, 182
5.3 Logistic Regression Diagnostics, 186
5.4 Assessment of Fit via External Validation, 202
5.5 Interpretation and Presentation of the Results from a Fitted
Logistic Regression Model, 212
Exercises, 223
6 Application of Logistic Regression with Different Sampling
Models 227
6.1 Introduction, 227
6.2 Cohort Studies, 227
6.3 Case-Control Studies, 229
6.4 Fitting Logistic Regression Models to Data from Complex
Sample Surveys, 233
Exercises, 242
7 Logistic Regression for Matched Case-Control Studies 243
7.1 Introduction, 243
7.2 Methods For Assessment of Fit in a 1–M Matched
Study, 248
7.3 An Example Using the Logistic Regression Model in a 1–1
Matched Study, 251
7.4 An Example Using the Logistic Regression Model in a 1–M
Matched Study, 260
Exercises, 267
8 Logistic Regression Models for Multinomial and Ordinal
Outcomes 269
8.1 The Multinomial Logistic Regression Model, 269
8.1.1 Introduction to the Model and Estimation of Model
Parameters, 269
8.1.2 Interpreting and Assessing the Significance of the
Estimated Coefficients, 272
8.1.3 Model-Building Strategies for Multinomial Logistic
Regression, 278
8.1.4 Assessment of Fit and Diagnostic Statistics for the
Multinomial Logistic Regression Model, 283
8.2 Ordinal Logistic Regression Models, 289
8.2.1 Introduction to the Models, Methods for Fitting, and
Interpretation of Model Parameters, 289
8.2.2 Model Building Strategies for Ordinal Logistic
Regression Models, 305
Exercises, 310
9 Logistic Regression Models for the Analysis of Correlated Data 313
9.1 Introduction, 313
9.2 Logistic Regression Models for the Analysis of Correlated
Data, 315
9.3 Estimation Methods for Correlated Data Logistic Regression
Models, 318
9.4 Interpretation of Coefficients from Logistic Regression
Models for the Analysis of Correlated Data, 323
9.4.1 Population Average Model, 324
9.4.2 Cluster-Specific Model, 326
9.4.3 Alternative Estimation Methods for the
Cluster-Specific Model, 333
9.4.4 Comparison of Population Average and
Cluster-Specific Model, 334
9.5 An Example of Logistic Regression Modeling with
Correlated Data, 337
9.5.1 Choice of Model for Correlated Data Analysis, 338
9.5.2 Population Average Model, 339
9.5.3 Cluster-Specific Model, 344
9.5.4 Additional Points to Consider when Fitting Logistic
Regression Models to Correlated Data, 351
9.6 Assessment of Model Fit, 354
9.6.1 Assessment of Population Average Model Fit, 354
9.6.2 Assessment of Cluster-Specific Model Fit, 365
9.6.3 Conclusions, 374
Exercises, 375
10 Special Topics 377
10.1 Introduction, 377
10.2 Application of Propensity Score Methods in Logistic
Regression Modeling, 377
10.3 Exact Methods for Logistic Regression Models, 387
10.4 Missing Data, 395
10.5 Sample Size Issues when Fitting Logistic Regression
Models, 401
10.6 Bayesian Methods for Logistic Regression, 408
10.6.1 The Bayesian Logistic Regression Model, 410
10.6.2 MCMC Simulation, 411
10.6.3 An Example of a Bayesian Analysis and Its
Interpretation, 419
10.7 Other Link Functions for Binary Regression Models, 434
10.8 Mediation, 441
10.8.1 Distinguishing Mediators from Confounders, 441
10.8.2 Implications for the Interpretation of an Adjusted
Logistic Regression Coefficient, 443
10.8.3 Why Adjust for a Mediator? 444
10.8.4 Using Logistic Regression to Assess Mediation:
Assumptions, 445
10.9 More About Statistical Interaction, 448
10.9.1 Additive versus Multiplicative Scale–Risk
Difference versus Odds Ratios, 448
10.9.2 Estimating and Testing Additive Interaction, 451
Exercises, 456
References 459
Index 479
Preface to the Third Edition
This third edition of Applied Logistic Regression comes 12 years after the 2000
publication of the second edition. During this interval there has been considerable
effort researching statistical aspects of the logistic regression model—particularly
when the outcomes are correlated. At the same time, capabilities of computer soft-
ware packages to fit models grew impressively to the point where they now provide
access to nearly every aspect of model development a researcher might need. As is
well-recognized in the statistical community, the inherent danger of this easy-to-use
software is that investigators have at their disposal powerful computational tools,
about which they may have only limited understanding. It is our hope that this third
edition will help bridge the gap between the outstanding theoretical developments
and the need to apply these methods to diverse fields of inquiry.
As was the case in the first two editions, the primary objective of the third edition
is to provide an introduction to the underlying theory of the logistic regression
model, with a major focus on the application, using real data sets, of the available
methods to explore the relationship between a categorical outcome variable and a
set of covariates. The materials in this book have evolved over the past 12 years
as a result of our teaching and consulting experiences. We have used this book to
teach parts of graduate level survey courses, quarter- or semester-long courses, as
well as focused short courses to working professionals. We assume that students
have a solid foundation in linear regression methodology and contingency table
analysis. The positive feedback we have received from students or professionals
taking courses using this book, or using it for self-learning or reference, provides
us with some assurance that the approach we used in the first two editions worked
reasonably well; therefore, we have followed that approach in this new edition.
The approach we take is to develop the logistic regression model from a regres-
sion analysis point of view. This is accomplished by approaching logistic regression
in a manner analogous to what would be considered good statistical practice for
linear regression. This differs from the approach used by other authors who have
begun their discussion from a contingency table point of view. While the contin-
gency table approach may facilitate the interpretation of the results, we believe
that it obscures the regression aspects of the analysis. Thus, discussion of the inter-
pretation of the model is deferred until the regression approach to the analysis is
firmly established.
To a large extent, there are no major differences between the many software
packages that include logistic regression modeling. When a particular approach
is available in a limited number of packages, it will be noted in this text. In
general, analyses in this book have been performed using STATA [Stata Corp.
(2011)]. This easy-to-use package combines excellent graphics and analysis rou-
tines; is fast; is compatible across Macintosh, Windows and UNIX platforms; and
interacts well with Microsoft Word. Other major statistical packages employed
at various points during the preparation of this text include SAS [SAS Institute
Inc. (2009)], OpenBUGS [Lunn et al. (2009)] and R [R Development Core Team
(2010)]. For all intents and purposes the results produced were the same regard-
less of which package we used. Reported numeric results have been rounded from
figures obtained from computer output and thus may differ slightly from those that
would be obtained in a replication of our analyses or from calculations based on
the reported results. When features or capabilities of the programs differed in an
important way, we noted them by the names given rather than by their bibliographic
citation.
We feel that this new edition benefits greatly from the addition of a number of
key topics. These include the following:
1. An expanded presentation of numerous new techniques for model-building,
including methods for determining the scale of continuous covariates and
assessing model performance.
2. An expanded presentation of regression modeling of complex sample survey
data.
3. An expanded development of the use of logistic regression modeling in
matched studies, as well as with multinomial and ordinal scaled responses.
4. A new chapter dealing with models and methods for correlated categorical
response data.
5. A new chapter developing a number of important applications either miss-
ing or expanded from the previous editions. These include propensity score
methods, exact methods for logistic regression, sample size issues, Bayesian
logistic regression, and other link functions for binary outcome regression
models. This chapter concludes with sections dealing with the epidemiologic
concepts of mediation and additive interaction.
As was the case for the second edition, all of the data sets used in the text are
available at a web site at John Wiley & Sons, Inc.
http://wiley.mpstechnologies.com/wiley/BOBContent/searchLPBobContent.do
In addition, the data may also be found, by permission of John Wiley &
Sons Inc., in the archive of statistical data sets maintained at the University of
Massachusetts at http://www.umass.edu/statdata/statdata in the logistic regression
section.
We would like to express our sincere thanks and appreciation to our colleagues,
students, and staff at all of the institutions we have been fortunate to have been
affiliated with since the first edition was conceived more than 25 years ago. This
includes not only our primary university affiliations but also the locations where we
spent extended sabbatical leaves and special research assignments. For this edition
we would like to offer special thanks to Sharon Schwartz and Melanie Wall from
Columbia University who took the lead in writing the two final sections of the book
dealing with mediation and additive interaction. We benefited greatly from their
expertise in applying these methods in epidemiologic settings. We greatly appreci-
ate the efforts of Danielle Sullivan, a PhD candidate in biostatistics at Ohio State,
for assisting in the preparation of the index for this book. Colleagues in the Division
of Biostatistics and the Division of Epidemiology at Ohio State were helpful in
their review of selected sections of the book. These include Bo Lu for his insights
on propensity score methods and David Murray, Sigrún Alba Jóhannesdóttir, and
Morten Schmidt for their thoughts concerning the sections on mediation analysis
and additive interaction. Data sets form the basis for the way we present our mate-
rials and these are often hard to come by. We are very grateful to Karla Zadnik,
Donald O. Mutti, Loraine T. Sinnott, and Lisa A. Jones-Jordan from The Ohio
State University College of Optometry as well as to the Collaborative Longitudinal
Evaluation of Ethnicity and Refractive Error (CLEERE) Study Group for making
the myopia data available to us. We would also like to acknowledge Cynthia A.
Fontanella from the College of Social Work at Ohio State for making both the
Adolescent Placement and the Polypharmacy data sets available to us. A special
thank you to Gary Phillips from the Center for Biostatistics at OSU for helping
us identify these valuable data sets (that he was the first one to analyze) as well
as for his assistance with some programming issues with Stata. We thank Gordon
Fitzgerald of the Center for Outcomes Research (COR) at the University of Mas-
sachusetts / Worcester for his help in obtaining the small subset of data used in
this text from the Global Longitudinal Study of Osteoporosis in Women (GLOW)
Study’s main data set. In addition, we thank him for his many helpful comments
on the use of propensity scores in logistic regression modeling. We thank Turner
Osler for providing us with the small subset of data obtained from a large data set
he abstracted from the National Burn Repository 2007 Report, that we used for the
burn injury analyses. In many instances the data sets we used were modified from
the original data sets in ways to allow us to illustrate important modeling tech-
niques. As such, we issue a general disclaimer here, and do so again throughout
the text, that results presented in this text do not apply to the original data.
Before we began this revision, numerous individuals reviewed our proposal
anonymously and made many helpful suggestions. They confirmed that what we
planned to include in this book would be of use to them in their research and teach-
ing. We thank these individuals and, for the most part, addressed their comments.
Many of these reviewers suggested that we include computer code to run logistic
regression in a variety of packages, especially R. We decided not to do this for
two reasons: we are not statistical computing specialists and did not want to have
to spend time responding to email queries on our code. Also, capabilities of com-
puter packages change rapidly and we realized that whatever we decided to include
here would likely be out of date before the book was even published. We refer
readers interested in code specific to various packages to a web site maintained
by Academic Technology Services (ATS) at UCLA where they use a variety of
statistical packages to replicate the analyses for the examples in the second edition
of this text as well as numerous other statistical texts. The link to this web site is
http://www.ats.ucla.edu/stat/.
Finally, we would like to thank Steve Quigley, Susanne Steitz-Filler, Sari Fried-
man and the production staff at John Wiley & Sons Inc. for their help in bringing
this project to completion.
David W. Hosmer, Jr.
Stanley Lemeshow
Rodney X. Sturdivant∗
Stowe, Vermont
Columbus, Ohio
West Point, New York
January 2013
∗The views expressed in this book are those of the author and do not reflect the official
policy or position of the Department of the Army, Department of Defense, or the
U.S. Government.
Example 1: Table 1.1 lists the age in years (AGE), and presence or absence of
evidence of significant coronary heart disease (CHD) for 100 subjects in a hypo-
thetical study of risk factors for heart disease. The table also contains an identifier
variable (ID) and an age group variable (AGEGRP). The outcome variable is CHD,
which is coded with a value of “0” to indicate that CHD is absent, or “1” to indicate
that it is present in the individual. In general, any two values could be used, but
we have found it most convenient to use zero and one. We refer to this data set as
the CHDAGE data.
It is of interest to explore the relationship between AGE and the presence or
absence of CHD in this group. Had our outcome variable been continuous rather
than binary, we probably would begin by forming a scatterplot of the outcome
versus the independent variable. We would use this scatterplot to provide an impres-
sion of the nature and strength of any relationship between the outcome and the
independent variable. A scatterplot of the data in Table 1.1 is given in Figure 1.1.
In this scatterplot, all points fall on one of two parallel lines representing the
absence of CHD (y = 0) or the presence of CHD (y = 1). There is some tendency
for the individuals with no evidence of CHD to be younger than those with evidence
of CHD. While this plot does depict the dichotomous nature of the outcome variable
quite clearly, it does not provide a clear picture of the nature of the relationship
between CHD and AGE.
The main problem with Figure 1.1 is that the variability in CHD at all ages is
large. This makes it difficult to see any functional relationship between AGE and
CHD. One common method of removing some variation, while still maintaining
the structure of the relationship between the outcome and the independent variable,
is to create intervals for the independent variable and compute the mean of the
outcome variable within each group. We use this strategy by grouping age into the
categories (AGEGRP) defined in Table 1.1. Table 1.2 contains, for each age group,
the frequency of occurrence of each outcome, as well as the percent with CHD
present.
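The grouping-and-averaging strategy described above can be sketched in a few lines. This is an illustrative fragment, not the book's actual CHDAGE data or code: the small data frame, its column names, and the bin edges are assumptions chosen to mirror the age groups in Table 1.1.

```python
import pandas as pd

# Hypothetical mini data set standing in for the CHDAGE data.
chdage = pd.DataFrame({
    "AGE": [25, 32, 38, 43, 47, 52, 57, 63, 28, 45],
    "CHD": [0, 0, 1, 0, 1, 1, 1, 1, 0, 0],
})

# Bin AGE into the intervals used for AGEGRP in Table 1.1, then compute,
# per group, the number of subjects and the proportion (mean) with CHD.
bins = [19, 29, 34, 39, 44, 49, 54, 59, 69]
labels = ["20-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-69"]
chdage["AGEGRP"] = pd.cut(chdage["AGE"], bins=bins, labels=labels)
summary = chdage.groupby("AGEGRP", observed=True)["CHD"].agg(["count", "sum", "mean"])
print(summary)
```

Because the outcome is 0/1, the group mean of CHD is exactly the proportion with CHD present, which is what Table 1.2 tabulates.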
By examining this table, a clearer picture of the relationship begins to emerge. It
shows that as age increases, the proportion (mean) of individuals with evidence of
CHD increases. Figure 1.2 presents a plot of the percent of individuals with CHD
versus the midpoint of each age interval. This plot provides considerable insight
into the relationship between CHD and AGE in this study, but the functional form
for this relationship needs to be described. The plot in this figure is similar to what
one might obtain if this same process of grouping and averaging were performed
in a linear regression. We note two important differences.
The first difference concerns the nature of the relationship between the outcome
and independent variables. In any regression problem the key quantity is the mean
value of the outcome variable, given the value of the independent variable. This
quantity is called the conditional mean and is expressed as “E(Y|x)” where Y
denotes the outcome variable and x denotes a specific value of the independent
variable. The quantity E(Y|x) is read “the expected value of Y, given the value x”.
In linear regression we assume that this mean may be expressed as an equation
Table 1.1 (Continued)

ID    AGE   AGEGRP   CHD
91    60    8        0
92    60    8        1
93    61    8        1
94    62    8        1
95    62    8        1
96    63    8        1
97    64    8        0
98    64    8        1
99    65    8        1
100   69    8        1
Figure 1.1 Scatterplot of presence or absence of coronary heart disease (CHD) by AGE for 100
subjects. [Plot: CHD (0 to 1) versus Age in years (20 to 70).]
linear in x (or some transformation of x or Y), such as
E(Y|x) = β0 + β1x.
This expression implies that it is possible for E(Y|x) to take on any value as x
ranges between −∞ and +∞.
The column labeled “Mean” in Table 1.2 provides an estimate of E(Y|x). We
assume, for purposes of exposition, that the estimated values plotted in Figure 1.2
are close enough to the true values of E(Y|x) to provide a reasonable assessment of
the functional relationship between CHD and AGE. With a dichotomous outcome
variable, the conditional mean must be greater than or equal to zero and less than
Table 1.2 Frequency Table of Age Group by CHD

                     Coronary Heart Disease
Age Group     n      Absent   Present   Mean
20–29        10       9        1        0.100
30–34        15      13        2        0.133
35–39        12       9        3        0.250
40–44        15      10        5        0.333
45–49        13       7        6        0.462
50–54         8       3        5        0.625
55–59        17       4       13        0.765
60–69        10       2        8        0.800
Total       100      57       43        0.430
Figure 1.2 Plot of the percentage of subjects with CHD in each AGE group. [Plot: mean CHD
(0 to 1) versus Age in years (20 to 70).]
or equal to one (i.e., 0 ≤ E(Y|x) ≤ 1). This can be seen in Figure 1.2. In addition,
the plot shows that this mean approaches zero and one “gradually”. The change in
the E(Y|x) per unit change in x becomes progressively smaller as the conditional
mean gets closer to zero or one. The curve is said to be S-shaped and resembles a
plot of the cumulative distribution of a continuous random variable. Thus, it should
not seem surprising that some well-known cumulative distributions have been used
to provide a model for E(Y|x) in the case when Y is dichotomous. The model we
use is based on the logistic distribution.
Many distribution functions have been proposed for use in the analysis of a
dichotomous outcome variable. Cox and Snell (1989) discuss some of these. There
are two primary reasons for choosing the logistic distribution. First, from a mathe-
matical point of view, it is an extremely flexible and easily used function. Second,
its model parameters provide the basis for clinically meaningful estimates of effect.
A detailed discussion of the interpretation of the model parameters is given in
Chapter 3.
In order to simplify notation, we use the quantity π(x) = E(Y|x) to represent
the conditional mean of Y given x when the logistic distribution is used. The
specific form of the logistic regression model we use is:
π(x) = e^(β0 + β1x) / (1 + e^(β0 + β1x)).   (1.1)
A transformation of π(x) that is central to our study of logistic regression is the
logit transformation. This transformation is defined, in terms of π(x), as:
g(x) = ln[π(x)/(1 − π(x))] = β0 + β1x.
The importance of this transformation is that g(x) has many of the desirable prop-
erties of a linear regression model. The logit, g(x), is linear in its parameters, may
be continuous, and may range from −∞ to +∞, depending on the range of x.
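A minimal sketch of equation (1.1) and the logit transformation, with made-up parameter values (the function names and the values of β0, β1, and x are illustrative, not from the text):

```python
import math

def pi(x, b0, b1):
    """Equation (1.1): pi(x) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    e = math.exp(b0 + b1 * x)
    return e / (1.0 + e)

def logit(p):
    """The logit transformation g = ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Applying the logit to pi(x) recovers the linear predictor b0 + b1*x,
# which is exactly why g(x) behaves like a linear regression model.
b0, b1, x = -5.0, 0.1, 50.0
p = pi(x, b0, b1)
print(logit(p))  # equals b0 + b1*x
```

Note that pi(x) is always strictly between 0 and 1, while logit(pi(x)) is unbounded, mirroring the discussion above.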
The second important difference between the linear and logistic regression
models concerns the conditional distribution of the outcome variable. In the linear
regression model we assume that an observation of the outcome variable may be
expressed as y = E(Y|x) + ε. The quantity ε is called the error and expresses an
observation’s deviation from the conditional mean. The most common assumption
is that ε follows a normal distribution with mean zero and some variance that is
constant across levels of the independent variable. It follows that the conditional
distribution of the outcome variable given x is normal with mean E(Y|x), and a
variance that is constant. This is not the case with a dichotomous outcome vari-
able. In this situation, we may express the value of the outcome variable given x
as y = π(x) + ε. Here the quantity ε may assume one of two possible values. If
y = 1 then ε = 1 − π(x) with probability π(x), and if y = 0 then ε = −π(x) with
probability 1 − π(x). Thus, ε has a distribution with mean zero and variance equal
to π(x)[1 − π(x)]. That is, the conditional distribution of the outcome variable
follows a binomial distribution with probability given by the conditional mean,
π(x).
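The two-point error distribution just described can be checked by simulation. This is an illustrative experiment under an assumed value of π(x), not an analysis from the text: draws of y with Pr(y = 1) = π give errors ε = y − π(x) whose mean is near zero and whose variance is near π(x)[1 − π(x)].

```python
import random

random.seed(1)
pi_x = 0.3          # assumed conditional mean pi(x)
n = 200_000

# e = y - pi(x) equals 1 - pi(x) with probability pi(x), else -pi(x).
errors = [(1 if random.random() < pi_x else 0) - pi_x for _ in range(n)]

mean_e = sum(errors) / n
var_e = sum(e * e for e in errors) / n
print(mean_e, var_e)  # near 0 and near pi_x * (1 - pi_x) = 0.21
```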
In summary, we have shown that in a regression analysis when the outcome
variable is dichotomous:
1. The model for the conditional mean of the regression equation must be
bounded between zero and one. The logistic regression model, π(x), given
in equation (1.1), satisfies this constraint.
2. The binomial, not the normal, distribution describes the distribution of the
errors and is the statistical distribution on which the analysis is based.
3. The principles that guide an analysis using linear regression also guide us in
logistic regression.
1.2 FITTING THE LOGISTIC REGRESSION MODEL
Suppose we have a sample of n independent observations of the pair (xi, yi),
i = 1, 2, . . . , n, where yi denotes the value of a dichotomous outcome variable and
xi is the value of the independent variable for the ith subject. Furthermore, assume
that the outcome variable has been coded as 0 or 1, representing the absence or the
presence of the characteristic, respectively. This coding for a dichotomous outcome
is used throughout the text. Fitting the logistic regression model in equation (1.1)
to a set of data requires that we estimate the values of β0 and β1, the unknown
parameters.
In linear regression, the method used most often for estimating unknown param-
eters is least squares. In that method we choose those values of β0 and β1 that
minimize the sum-of-squared deviations of the observed values of Y from the pre-
dicted values based on the model. Under the usual assumptions for linear regression
the method of least squares yields estimators with a number of desirable statistical
properties. Unfortunately, when the method of least squares is applied to a model
with a dichotomous outcome, the estimators no longer have these same properties.
The general method of estimation that leads to the least squares function under
the linear regression model (when the error terms are normally distributed) is
called maximum likelihood. This method provides the foundation for our approach
to estimation with the logistic regression model throughout this text. In a general
sense, the method of maximum likelihood yields values for the unknown parameters
that maximize the probability of obtaining the observed set of data. In order to apply
this method we must first construct a function, called the likelihood function. This
function expresses the probability of the observed data as a function of the unknown
parameters. The maximum likelihood estimators of the parameters are the values
that maximize this function. Thus, the resulting estimators are those that agree most
closely with the observed data. We now describe how to find these values for the
logistic regression model.
If Y is coded as 0 or 1 then the expression for π(x) given in equation (1.1)
provides (for an arbitrary value of β = (β0, β1), the vector of parameters) the
conditional probability that Y is equal to 1 given x. This is denoted as π(x).
It follows that the quantity 1 − π(x) gives the conditional probability that Y is
equal to zero given x, Pr(Y = 0|x). Thus, for those pairs (xi, yi), where yi = 1,
the contribution to the likelihood function is π(xi), and for those pairs where
yi = 0, the contribution to the likelihood function is 1 − π(xi), where the quantity
π(xi) denotes the value of π(x) computed at xi. A convenient way to express the
contribution to the likelihood function for the pair (xi, yi) is through the expression
π(xi)^yi [1 − π(xi)]^(1−yi).   (1.2)
As the observations are assumed to be independent, the likelihood function is
obtained as the product of the terms given in equation (1.2) as follows:
l(β) = ∏(i=1 to n) π(xi)^yi [1 − π(xi)]^(1−yi).   (1.3)
The principle of maximum likelihood states that we use as our estimate of β
the value that maximizes the expression in equation (1.3). However, it is easier
mathematically to work with the log of equation (1.3). This expression, the log-
likelihood, is defined as
L(β) = ln[l(β)] = Σ(i=1 to n) {yi ln[π(xi)] + (1 − yi) ln[1 − π(xi)]}.   (1.4)
To find the value of β that maximizes L(β) we differentiate L(β) with respect to
β0 and β1 and set the resulting expressions equal to zero. These equations, known
as the likelihood equations, are
Σ [yi − π(xi)] = 0   (1.5)

and

Σ xi[yi − π(xi)] = 0.   (1.6)
In equations (1.5) and (1.6) it is understood that the summation is over i varying
from 1 to n. (The practice of suppressing the index and range of summation, when
these are clear, is followed throughout this text.)
In linear regression, the likelihood equations, obtained by differentiating the
sum-of-squared deviations function with respect to β are linear in the unknown
parameters and thus are easily solved. For logistic regression the expressions in
equations (1.5) and (1.6) are nonlinear in β0 and β1, and thus require special
methods for their solution. These methods are iterative in nature and have been
programmed into logistic regression software. For the moment, we need not be
concerned about these iterative methods and view them as a computational detail
that is taken care of for us. The interested reader may consult the text by McCullagh
and Nelder (1989) for a general discussion of the methods used by most programs.
In particular, they show that the solution to equations (1.5) and (1.6) may be
obtained using an iterative weighted least squares procedure.
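To make that computational detail slightly less of a black box, here is a sketch of the Newton-Raphson iteration for equations (1.5) and (1.6); each step is algebraically a weighted least squares solve, which is why the procedure is also called iteratively reweighted least squares. The function name, starting values, and stopping rule are our own choices, not any particular package's implementation:

```python
import numpy as np

def fit_logistic_newton(x, y, tol=1e-10, max_iter=50):
    """Solve the likelihood equations (1.5) and (1.6) by Newton-Raphson."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])  # design matrix
    beta = np.zeros(2)
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - pi)              # left-hand sides of (1.5) and (1.6)
        W = pi * (1.0 - pi)                 # the IRLS weights
        info = X.T @ (X * W[:, None])       # information matrix
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta  # (beta0_hat, beta1_hat)
```

At convergence the score vector is numerically zero, so both likelihood equations are satisfied.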
The value of β given by the solution to equations (1.5) and (1.6) is called
the maximum likelihood estimate and is denoted as β̂. In general, the use of the
symbol “ˆ” (hat) denotes the maximum likelihood estimate of the respective quantity.
For example, π̂(xi) is the maximum likelihood estimate of π(xi). This quantity
provides an estimate of the conditional probability that Y is equal to 1, given that
x is equal to xi. As such, it represents the fitted or predicted value for the logistic
regression model. An interesting consequence of equation (1.5) is that
\[ \sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat{\pi}(x_i). \]
Table 1.3 Results of Fitting the Logistic Regression Model
to the CHDAGE Data, n = 100
Variable Coeff. Std. Err. z p
Age 0.111 0.0241 4.61 <0.001
Constant −5.309 1.1337 −4.68 <0.001
Log-likelihood = −53.676546.
That is, the sum of the observed values of y is equal to the sum of the predicted
(expected) values. We use this property in later chapters when we discuss assessing
the fit of the model.
As an example, consider the data given in Table 1.1. Use of a logistic regres-
sion software package, with continuous variable AGE as the independent variable,
produces the output in Table 1.3.
The maximum likelihood estimates of β0 and β1 are β̂0 = −5.309 and β̂1 =
0.111. The fitted values are given by the equation
\[ \hat{\pi}(x) = \frac{e^{-5.309 + 0.111 \times \mathrm{AGE}}}{1 + e^{-5.309 + 0.111 \times \mathrm{AGE}}} \quad (1.7) \]
and the estimated logit, ĝ(x), is given by the equation
ĝ(x) = −5.309 + 0.111 × AGE. (1.8)
The log-likelihood given in Table 1.3 is the value of equation (1.4) computed using
β̂0 and β̂1.
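Equation (1.7) is simple to evaluate in code; a minimal sketch using the Table 1.3 estimates (the helper name is ours):

```python
import math

def pi_hat(age, b0=-5.309, b1=0.111):
    """Fitted probability from equation (1.7) for the CHDAGE model."""
    g = b0 + b1 * age                   # estimated logit, equation (1.8)
    return math.exp(g) / (1.0 + math.exp(g))
```

For example, pi_hat(50) is about 0.56, the fitted value computed later in equation (1.20).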
Three additional columns are present in Table 1.3. One contains estimates of the
standard errors of the estimated coefficients, the next column displays the ratios of
the estimated coefficients to their estimated standard errors, and the last column
displays a p-value. These quantities are discussed in the next section.
Following the fitting of the model we begin to evaluate its adequacy.
1.3 TESTING FOR THE SIGNIFICANCE OF THE COEFFICIENTS
In practice, the modeling of a set of data, as we show in Chapters 4, 7, and 8, is
a much more complex process than one of simply fitting and testing. The methods
we present in this section, while simplistic, do provide essential building blocks
for the more complex process.
After estimating the coefficients, our first look at the fitted model commonly
concerns an assessment of the significance of the variables in the model. This
usually involves formulation and testing of a statistical hypothesis to determine
whether the independent variables in the model are “significantly” related to the
outcome variable. The method for performing this test is quite general, and differs
from one type of model to the next only in the specific details. We begin by
discussing the general approach for a single independent variable. The multivariable
case is considered in Chapter 2.
One approach to testing for the significance of the coefficient of a variable in
any model relates to the following question. Does the model that includes the vari-
able in question tell us more about the outcome (or response) variable than a model
that does not include that variable? This question is answered by comparing the
observed values of the response variable to those predicted by each of two models;
the first with, and the second without, the variable in question. The mathematical
function used to compare the observed and predicted values depends on the partic-
ular problem. If the predicted values with the variable in the model are better, or
more accurate in some sense, than when the variable is not in the model, then we
feel that the variable in question is “significant”. It is important to note that we are
not considering the question of whether the predicted values are an accurate rep-
resentation of the observed values in an absolute sense (this is called goodness of
fit). Instead, our question is posed in a relative sense. The assessment of goodness
of fit is a more complex question that is discussed in detail in Chapter 5.
The general method for assessing significance of variables is easily illustrated
in the linear regression model, and its use there motivates the approach used for
logistic regression. A comparison of the two approaches highlights the differences
between modeling continuous and dichotomous response variables.
In linear regression, one assesses the significance of the slope coefficient by
forming what is referred to as an analysis of variance table. This table partitions
the total sum-of-squared deviations of observations about their mean into two parts:
(1) the sum-of-squared deviations of observations about the regression line SSE
(or residual sum-of-squares) and (2) the sum-of-squares of predicted values, based
on the regression model, about the mean of the dependent variable SSR (or due
regression sum-of-squares). This is just a convenient way of displaying the com-
parison of observed to predicted values under two models. In linear regression, the
comparison of observed and predicted values is based on the square of the distance
between the two. If yi denotes the observed value and ŷi denotes the predicted
value for the ith individual under the model, then the statistic used to evaluate this
comparison is
\[ \mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2. \]
Under the model not containing the independent variable in question the only
parameter is β0, and β̂0 = ȳ, the mean of the response variable. In this case, ŷi = ȳ
and SSE is equal to the total sum-of-squares. When we include the independent
variable in the model, any decrease in SSE is due to the fact that the slope coefficient
for the independent variable is not zero. The change in the value of SSE is due to
the regression source of variability, denoted SSR. That is,
\[ \mathrm{SSR} = \sum_{i=1}^{n} (y_i - \bar{y})^2 - \sum_{i=1}^{n} (y_i - \hat{y}_i)^2. \]
In linear regression, interest focuses on the size of SSR. A large value suggests
that the independent variable is important, whereas a small value suggests that the
independent variable is not helpful in predicting the response.
The guiding principle with logistic regression is the same: compare observed
values of the response variable to predicted values obtained from models, with and
without the variable in question. In logistic regression, comparison of observed to
predicted values is based on the log-likelihood function defined in equation (1.4).
To better understand this comparison, it is helpful conceptually to think of an
observed value of the response variable as also being a predicted value resulting
from a saturated model. A saturated model is one that contains as many parameters
as there are data points. (A simple example of a saturated model is fitting a linear
regression model when there are only two data points, n = 2.)
The comparison of observed to predicted values using the likelihood function is
based on the following expression:
\[ D = -2\ln\left[\frac{\text{(likelihood of the fitted model)}}{\text{(likelihood of the saturated model)}}\right]. \quad (1.9) \]
The quantity inside the large brackets in the expression above is called the likelihood
ratio. Using minus twice its log is necessary to obtain a quantity whose distribution
is known and can therefore be used for hypothesis testing purposes. Such a test is
called the likelihood ratio test. Using equation (1.4), equation (1.9) becomes
\[ D = -2\sum_{i=1}^{n} \left\{ y_i \ln\left(\frac{\hat{\pi}_i}{y_i}\right) + (1 - y_i)\ln\left(\frac{1 - \hat{\pi}_i}{1 - y_i}\right) \right\}, \quad (1.10) \]
where π̂i = π̂(xi).
The statistic, D, in equation (1.10) is called the deviance, and for logistic
regression, it plays the same role that the residual sum-of-squares plays in lin-
ear regression. In fact, the deviance as shown in equation (1.10), when computed
for linear regression, is identically equal to the SSE.
Furthermore, in a setting as shown in Table 1.1, where the values of the outcome
variable are either 0 or 1, the likelihood of the saturated model is identically equal
to 1.0. Specifically, it follows from the definition of a saturated model that π̂i = yi
and the likelihood is
\[ l(\text{saturated model}) = \prod_{i=1}^{n} y_i^{y_i} \times (1 - y_i)^{(1 - y_i)} = 1.0. \]
Thus it follows from equation (1.9) that the deviance is
D = −2 ln(likelihood of the fitted model). (1.11)
Some software packages report the value of the deviance in equation (1.11) rather
than the log-likelihood for the fitted model. In the context of testing for the signif-
icance of a fitted model, we want to emphasize that we think of the deviance in
the same way that we think of the residual sum-of-squares in linear regression.
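For 0/1 outcome data, then, the deviance is just minus twice the fitted log-likelihood, and it can be computed directly from the fitted probabilities. A minimal sketch (function name ours):

```python
import numpy as np

def deviance(pi_hat, y):
    """Deviance D from equation (1.11) for 0/1 outcome data,
    where the saturated likelihood is identically 1.0."""
    ll = np.sum(y * np.log(pi_hat) + (1 - y) * np.log(1 - pi_hat))
    return float(-2.0 * ll)
```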
In particular, to assess the significance of an independent variable we compare
the value of D with and without the independent variable in the equation. The
change in D due to the inclusion of the independent variable in the model is:
G = D(model without the variable) − D(model with the variable).
This statistic, G, plays the same role in logistic regression that the numerator of
the partial F-test does in linear regression. Because the likelihood of the satu-
rated model is always common to both values of D being differenced, G can be
expressed as
\[ G = -2\ln\left[\frac{\text{(likelihood without the variable)}}{\text{(likelihood with the variable)}}\right]. \quad (1.12) \]
For the specific case of a single independent variable, it is easy to show that
when the variable is not in the model, the maximum likelihood estimate of β0 is
ln(n1/n0) where n1 = Σyi and n0 = Σ(1 − yi), and the predicted probability for
all subjects is constant, and equal to n1/n. In this setting, the value of G is:
\[ G = -2\ln\left[\frac{\left(\dfrac{n_1}{n}\right)^{n_1}\left(\dfrac{n_0}{n}\right)^{n_0}}{\prod_{i=1}^{n} \hat{\pi}_i^{y_i}(1 - \hat{\pi}_i)^{(1 - y_i)}}\right], \quad (1.13) \]
or
\[ G = 2\left\{ \sum_{i=1}^{n} \left[ y_i \ln(\hat{\pi}_i) + (1 - y_i)\ln(1 - \hat{\pi}_i) \right] - \left[ n_1 \ln(n_1) + n_0 \ln(n_0) - n\ln(n) \right] \right\}. \quad (1.14) \]
Under the hypothesis that β1 is equal to zero, the statistic G follows a chi-square
distribution with 1 degree of freedom. Additional mathematical assumptions are
needed; however, for the above case they are rather nonrestrictive, and involve
having a sufficiently large sample size, n, and enough subjects with both y = 0
and y = 1. We discuss in later chapters that, as far as sample size is concerned,
the key determinant is min(n0, n1).
As an example, we consider the model fit to the data in Table 1.1, whose
estimated coefficients and log-likelihood are given in Table 1.3. For these data the
sample size is sufficiently large as n1 = 43 and n0 = 57. Evaluating G as shown
in equation (1.14) yields
G = 2{−53.677 − [43 ln(43) + 57 ln(57) − 100 ln(100)]}
= 2[−53.677 − (−68.331)] = 29.31.
The first term in this expression is the log-likelihood from the model contain-
ing age (see Table 1.3), and the remainder of the expression simply substitutes
n1 and n0 into the second part of equation (1.14). We use the symbol χ2(ν) to
denote a chi-square random variable with ν degrees of freedom. Using this notation, the p-value associated with this test is P[χ²(1) ≥ 29.31] < 0.001; thus, we
have convincing evidence that AGE is a significant variable in predicting CHD.
This is merely a statement of the statistical evidence for this variable. Other impor-
tant factors to consider before concluding that the variable is clinically important
would include the appropriateness of the fitted model, as well as inclusion of other
potentially important variables.
As all logistic regression software packages report either the value of the log-likelihood
or the value of D, it is easy to check for the significance of the addition of new
terms to the model or to verify a reported value of G. In the simple case of a
single independent variable, we first fit a model containing only the constant term.
Next, we fit a model containing the independent variable along with the constant.
This gives rise to another log-likelihood. The likelihood ratio test is obtained by
multiplying the difference between these two values by −2.
In the current example, the log-likelihood for the model containing only a con-
stant term is −68.331. Fitting a model containing the independent variable (AGE)
along with the constant term results in the log-likelihood shown in Table 1.3 of
−53.677. Multiplying the difference in these log-likelihoods by −2 gives
−2 × [−68.331 − (−53.677)] = −2 × (−14.655) = 29.31.
This result, along with the associated p-value for the chi-square distribution, is
commonly reported in logistic regression software packages.
There are two other statistically equivalent tests: the Wald test and the Score test.
The assumptions needed for each of these is the same as those of the likelihood
ratio test in equation (1.14). A more complete discussion of these three tests and
their assumptions may be found in Rao (1973).
The Wald test is equal to the ratio of the maximum likelihood estimate of the
slope parameter, β̂1, to an estimate of its standard error. Under the null hypothesis
and the sample size assumptions, this ratio follows a standard normal distribution.
While we have not yet formally discussed how the estimates of the standard errors
of the estimated parameters are obtained, they are routinely printed out by computer
software. For example, the Wald test for the coefficient for AGE in Table 1.3 is
provided in the column headed z and is
\[ W = \frac{\hat{\beta}_1}{\widehat{\mathrm{SE}}(\hat{\beta}_1)} = \frac{0.111}{0.024} = 4.61. \]
The two-tailed p-value, provided in the last column of Table 1.3, is P(|z| > 4.61) < 0.001, where z denotes a random variable following the standard normal distribution. Some software packages display the statistic W² = z², which is distributed
as chi-square with 1 degree of freedom. Hauck and Donner (1977) examined the
performance of the Wald test and found that it behaved in an aberrant manner, often
failing to reject the null hypothesis when the coefficient was significant using the
likelihood ratio test. Thus, they recommended (and we agree) that the likelihood
ratio test is preferred. We note that while the assertions of Hauck and Donner are true, we have never seen huge differences in the values of G and W². In practice, the more troubling situation is when the values are close, and one test has p < 0.05 and the other has p > 0.05. When this occurs, we use the p-value from the likelihood ratio test.
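As a quick numerical check, the Wald statistic and its two-tailed p-value from the Table 1.3 values (again using erfc for the standard normal tail):

```python
import math

beta1_hat, se_hat = 0.111, 0.0241      # estimate and standard error, Table 1.3
W = beta1_hat / se_hat                 # Wald statistic
p_two_tailed = math.erfc(abs(W) / math.sqrt(2.0))  # 2 * P(Z > |W|)
```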
A test for the significance of a variable that does not require computing the
estimate of the coefficient is the score test. Proponents of the score test cite this
reduced computational effort as its major advantage. Use of the test is limited by
the fact that it is not available in many software packages. The score test is based
on the distribution theory of the derivatives of the log-likelihood. In general, this
is a multivariate test requiring matrix calculations that are discussed in Chapter 2.
In the univariate case, this test is based on the conditional distribution of
the derivative in equation (1.6), given the derivative in equation (1.5). In this
case, we can write down an expression for the Score test. The test uses the
value of equation (1.6) computed using β0 = ln(n1/n0) and β1 = 0. As noted
earlier, under these parameter values, π̂ = n1/n = ȳ and the left-hand side of equation (1.6) becomes Σxi(yi − ȳ). It may be shown that the estimated variance is ȳ(1 − ȳ)Σ(xi − x̄)². The test statistic for the score test (ST) is
\[ \mathrm{ST} = \frac{\displaystyle\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sqrt{\bar{y}(1 - \bar{y}) \displaystyle\sum_{i=1}^{n} (x_i - \bar{x})^2}}. \]
As an example of the score test, consider the model fit to the data in Table 1.1.
The value of the test statistic for this example is
\[ \mathrm{ST} = \frac{296.66}{\sqrt{3333.742}} = 5.14 \]
and the two-tailed p-value is P(|z| > 5.14) < 0.001. We note that, for this example, the values of the three test statistics are nearly the same (note: √G = 5.41).
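The closed form of ST requires no model fitting at all, which is the computational advantage its proponents cite. A minimal sketch (function name ours):

```python
import numpy as np

def score_test(x, y):
    """Score statistic ST for H0: beta1 = 0 in simple logistic regression."""
    ybar, xbar = y.mean(), x.mean()
    num = np.sum(x * (y - ybar))                        # equation (1.6) at the null
    var = ybar * (1 - ybar) * np.sum((x - xbar) ** 2)   # its estimated variance
    return float(num / np.sqrt(var))
```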
In summary, the method for testing the significance of the coefficient of a
variable in logistic regression is similar to the approach used in linear regression;
however, it is based on the likelihood function for a dichotomous outcome variable
under the logistic regression model.
1.4 CONFIDENCE INTERVAL ESTIMATION
An important adjunct to testing for significance of the model, discussed in
Section 1.3, is calculation and interpretation of confidence intervals for parameters
of interest. As is the case in linear regression we can obtain these for the slope,
intercept and the “line” (i.e., the logit). In some settings it may be of interest to
provide interval estimates for the fitted values (i.e., the predicted probabilities).
The basis for construction of the interval estimators is the same statistical theory
we used to formulate the tests for significance of the model. In particular, the confi-
dence interval estimators for the slope and intercept are, most often, based on their
respective Wald tests and are sometimes referred to as Wald-based confidence inter-
vals. The endpoints of a 100(1 − α)% confidence interval for the slope coefficient
are
\[ \hat{\beta}_1 \pm z_{1-\alpha/2}\,\widehat{\mathrm{SE}}(\hat{\beta}_1) \quad (1.15) \]
and for the intercept they are
\[ \hat{\beta}_0 \pm z_{1-\alpha/2}\,\widehat{\mathrm{SE}}(\hat{\beta}_0) \quad (1.16) \]
where z1−α/2 is the upper 100(1 − α/2)% point from the standard normal dis-
tribution and
SE(·) denotes a model-based estimator of the standard error of the
respective parameter estimator. We defer discussion of the actual formula used for
calculating the estimators of the standard errors to Chapter 2. For the moment, we
use the fact that estimated values are provided in the output following the fit of a
model and, in addition, many packages also provide the endpoints of the interval
estimates.
As an example, consider the model fit to the data in Table 1.1 regressing
AGE on the presence or absence of CHD. The results are presented in Table 1.3.
The endpoints of a 95 percent confidence interval for the slope coefficient from
equation (1.15) are 0.111 ± 1.96 × 0.0241, yielding the interval (0.064, 0.158). We
defer a detailed discussion of the interpretation of these results to Chapter 3. Briefly,
the results suggest that the change in the log-odds of CHD per one year increase
in age is 0.111 and the change could be as little as 0.064 or as much as 0.158 with
95 percent confidence.
As is the case with any regression model, the constant term provides an estimate
of the response at x = 0 unless the independent variable has been centered at some
clinically meaningful value. In our example, the constant provides an estimate of
the log-odds ratio of CHD at zero years of age. As a result, the constant term, by
itself, has no useful clinical interpretation. In any event, from equation (1.16), the
endpoints of a 95 percent confidence interval for the constant are −5.309 ± 1.96 ×
1.1337, yielding the interval (−7.531, −3.087).
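Equations (1.15) and (1.16) translate directly into a few lines of code; the standard-library NormalDist class supplies the normal quantile z₁₋α/₂ (the helper name is ours):

```python
from statistics import NormalDist

def wald_ci(estimate, se, conf=0.95):
    """Wald-based confidence interval, equations (1.15) and (1.16)."""
    z = NormalDist().inv_cdf(0.5 + conf / 2.0)   # z_{1 - alpha/2}
    return estimate - z * se, estimate + z * se
```

For example, wald_ci(0.111, 0.0241) reproduces the slope interval (0.064, 0.158) and wald_ci(-5.309, 1.1337) the intercept interval (−7.531, −3.087).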
The logit is the linear part of the logistic regression model and, as such, is most
similar to the fitted line in a linear regression model. The estimator of the logit is
ĝ(x) = β̂0 + β̂1x. (1.17)
The estimator of the variance of the estimator of the logit requires obtaining the
variance of a sum. In this case it is
\[ \widehat{\mathrm{Var}}[\hat{g}(x)] = \widehat{\mathrm{Var}}(\hat{\beta}_0) + x^2\,\widehat{\mathrm{Var}}(\hat{\beta}_1) + 2x\,\widehat{\mathrm{Cov}}(\hat{\beta}_0, \hat{\beta}_1). \quad (1.18) \]
In general, the variance of a sum is equal to the sum of the variance of each
term and twice the covariance of each possible pair of terms formed from the
Table 1.4 Estimated Covariance Matrix of the Estimated
Coefficients in Table 1.3
Age Constant
Age 0.000579
Constant −0.026677 1.28517
components of the sum. The endpoints of a 100(1 − α)% Wald-based confidence
interval for the logit are
\[ \hat{g}(x) \pm z_{1-\alpha/2}\,\widehat{\mathrm{SE}}[\hat{g}(x)], \quad (1.19) \]
where SE[ĝ(x)] is the positive square root of the variance estimator in equation (1.18).
The estimated logit for the fitted model in Table 1.3 is shown in equation (1.8). In
order to evaluate equation (1.18) for a specific age we need the estimated covariance
matrix. This matrix can be obtained from the output from all logistic regression
software packages. How it is displayed varies from package to package, but the
triangular form shown in Table 1.4 is a common one.
The estimated logit from equation (1.8) for a subject of age 50 is
ĝ(50) = −5.31 + 0.111 × 50 = 0.240,
the estimated variance, using equation (1.18) and the results in Table 1.4, is
\[ \widehat{\mathrm{Var}}[\hat{g}(50)] = 1.28517 + (50)^2 \times 0.000579 + 2 \times 50 \times (-0.026677) = 0.0650 \]
and the estimated standard error is SE[ĝ(50)] = 0.2549. Thus the endpoints of a 95 percent confidence interval for the logit at age 50 are
\[ 0.240 \pm 1.96 \times 0.2550 = (-0.260, 0.740). \]
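This calculation is easy to reproduce from the entries of Table 1.4; a sketch:

```python
import math

var_b1, var_b0 = 0.000579, 1.28517       # Table 1.4 diagonal entries
cov_b0_b1 = -0.026677                    # Table 1.4 off-diagonal entry
age = 50
g_hat = -5.31 + 0.111 * age              # estimated logit, 0.240
# equation (1.18): Var(b0) + x^2 Var(b1) + 2x Cov(b0, b1)
var_g = var_b0 + age**2 * var_b1 + 2 * age * cov_b0_b1
se_g = math.sqrt(var_g)
ci = (g_hat - 1.96 * se_g, g_hat + 1.96 * se_g)
```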
We discuss the interpretation and use of the estimated logit in providing estimates
of odds ratios in Chapter 3.
The estimator of the logit and its confidence interval provide the basis for the
estimator of the fitted value, in this case the logistic probability, and its associated
confidence interval. In particular, using equation (1.7) at age 50 the estimated
logistic probability is
\[ \hat{\pi}(50) = \frac{e^{\hat{g}(50)}}{1 + e^{\hat{g}(50)}} = \frac{e^{-5.31 + 0.111 \times 50}}{1 + e^{-5.31 + 0.111 \times 50}} = 0.560 \quad (1.20) \]
and the endpoints of a 95 percent confidence interval are obtained from the
respective endpoints of the confidence interval for the logit. The endpoints of the
100(1 − α)% Wald-based confidence interval for the fitted value are
\[ \frac{e^{\hat{g}(x) \pm z_{1-\alpha/2}\widehat{\mathrm{SE}}[\hat{g}(x)]}}{1 + e^{\hat{g}(x) \pm z_{1-\alpha/2}\widehat{\mathrm{SE}}[\hat{g}(x)]}}. \quad (1.21) \]
Using the example at age 50 to demonstrate the calculations, the lower limit is
\[ \frac{e^{-0.260}}{1 + e^{-0.260}} = 0.435, \]
and the upper limit is
\[ \frac{e^{0.740}}{1 + e^{0.740}} = 0.677. \]
We have found that a major mistake often made by data analysts new to logis-
tic regression modeling is to try and apply estimates on the probability scale to
individual subjects. The fitted value computed in equation (1.20) is analogous to a
particular point on the line obtained from a linear regression. In linear regression
each point on the fitted line provides an estimate of the mean of the dependent
variable in a population of subjects with covariate value “x”. Thus the value of
0.56 in equation (1.20) is an estimate of the mean (i.e., proportion) of 50-year-old
subjects in the population sampled that have evidence of CHD. An individual 50-
year-old subject either does or does not have evidence of CHD. The confidence
interval suggests that this mean could be between 0.435 and 0.677 with 95 percent
confidence. We discuss the use and interpretation of fitted values in greater detail
in Chapter 3.
One application of fitted logistic regression models that has received a lot of
attention in the subject matter literature is using model-based fitted values similar
to the one in equation (1.20) to predict the value of a binary dependent variable in
individual subjects. This process is called classification and has a long history in
statistics where it is referred to as discriminant analysis. We discuss the classifica-
tion problem in detail in Chapter 4. We also discuss discriminant analysis within
the context of a method for obtaining estimators of the coefficients in the next
section.
The coverage∗†
of the Wald-based confidence interval estimators in
equations (1.15) and (1.16) depends on the assumption that the distribution of the
maximum likelihood estimators is normal. Potential sensitivity to this assumption
is the main reason that the likelihood ratio test is recommended over the Wald test
for assessing the significance of individual coefficients, as well as for the overall
model. In settings where the number of events (y = 1) and/or the sample size
is small the normality assumption is suspect and a log-likelihood function-based
confidence interval can have better coverage. Until recently routines to compute
these intervals were not available in most software packages. Cox and Snell
(1989, p. 179–183) discuss the theory behind likelihood intervals, and Venzon
and Moolgavkar (1988) describe an efficient way to calculate the end points.
∗The remainder of this section is more advanced material that can be skipped on first reading of the text.
†The term coverage of an interval estimator refers to the percent of time confidence intervals computed
in a similar manner contain the true parameter value. Research has shown that when the normality
assumption does not hold, Wald-based confidence intervals can be too narrow and thus contain the true
parameter with a smaller percentage than the stated confidence coefficient.
Royston (2007) describes a STATA [StataCorp (2011)] routine that implements
the Venzon and Moolgavkar method that we use for the examples in this text. The
SAS package’s logistic regression procedure [SAS Institute Inc. (2009)] has the
option to obtain likelihood confidence intervals.
The likelihood-based confidence interval estimator for a coefficient can be con-
cisely described as the interval of values, β∗, for which the likelihood ratio test
would fail to reject the hypothesis, H0 : β = β∗, at the stated 1 − α percent significance level. The two end points, βlower and βupper, of this interval for a coefficient
are defined as follows:
\[ 2[l(\hat{\beta}) - l_p(\beta_{\mathrm{upper}})] = 2[l(\hat{\beta}) - l_p(\beta_{\mathrm{lower}})] = \chi^2_{1-\alpha}(1), \quad (1.22) \]
where l(β̂) is the value of the log-likelihood of the fitted model and lp(β) is the
value of the profile log-likelihood. A value of the profile log-likelihood is computed
by first specifying/fixing a value for the coefficient of interest, for example the slope
coefficient for age, and then finding the value of the intercept coefficient, using the
Venzon and Moolgavkar method, that maximizes the log-likelihood. This process
is repeated over a grid of values of the specified coefficient, for example, values of
β∗, until the solutions to equation (1.22) are found. The results can be presented
graphically or in standard interval form. We illustrate both in the example below.
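The profile log-likelihood itself is straightforward to sketch in code: hold the slope fixed and maximize over the intercept. This is a simplified illustration of the idea behind equation (1.22), not the Venzon and Moolgavkar algorithm (the function name and one-dimensional Newton step are our own):

```python
import numpy as np

def profile_loglik(beta1_fixed, x, y, tol=1e-12):
    """Profile log-likelihood l_p(beta1): hold beta1 fixed and
    maximize over beta0 by one-dimensional Newton-Raphson."""
    b0 = 0.0
    for _ in range(200):
        pi = 1.0 / (1.0 + np.exp(-(b0 + beta1_fixed * x)))
        score = np.sum(y - pi)            # equation (1.5), beta1 held fixed
        info = np.sum(pi * (1.0 - pi))
        step = score / info
        b0 += step
        if abs(step) < tol:
            break
    pi = 1.0 / (1.0 + np.exp(-(b0 + beta1_fixed * x)))
    return float(np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi)))
```

Evaluating this over a grid of slope values and locating where it falls χ²₀.₉₅(1)/2 = 1.9208 below the maximized log-likelihood gives the two endpoints in equation (1.22).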
As an example, we show in Figure 1.3 a plot of the profile log-likelihood for the
coefficient for AGE using the CHDAGE data in Table 1.1. The end points of the
95 percent likelihood interval are βlower = 0.067 and βupper = 0.162 and are shown
in the figure where the two vertical lines intersect the “x” axis. The horizontal line
in the figure is drawn at the value
\[ -55.5964 = -53.6756 - \frac{3.8416}{2}, \]
where −53.6756 is the value of the log-likelihood of the fitted model from Table 1.3
and 3.8416 is the 95th percentile of the chi-square distribution with 1 degree of
freedom.
The quantity “Asymmetry” in Figure 1.3 is a measure of asymmetry of the
profile log-likelihood: the difference between the length of the upper part of the interval, βupper − β̂, and that of the lower part, β̂ − βlower, as a percent of the total length, βupper − βlower. In the example the value is
\[ A = 100 \times \frac{(0.162 - 0.111) - (0.111 - 0.067)}{0.162 - 0.067} \cong 7.5\%. \]
As the upper and lower endpoints of the Wald-based confidence interval in
equation (1.15) are equidistant from the maximum likelihood estimator, it has
asymmetry A = 0.
In this example, the Wald-based confidence interval for the coefficient for age
is (0.064, 0.158). The likelihood interval is (0.067, 0.162), which is only 1.1%
wider than the Wald-based interval. So there is not a great deal of pure numeric
difference in the two intervals and the asymmetry is small. In settings where there
[Figure 1.3: Plot of the profile log-likelihood function (vertical axis) against the coefficient for AGE (horizontal axis) in the CHDAGE data; annotation: Asymmetry = 7.5%.]
is greater asymmetry in the likelihood-based interval there can be more substantial
differences between the two intervals. We return to this point in Chapter 3 where
we discuss the interpretation of estimated coefficients. In addition, we include an
exercise at the end of this chapter where there is a pronounced difference between
the Wald and likelihood confidence interval estimators.
Methods to extend the likelihood intervals to functions of more than one coef-
ficient such as the estimated logit function and probability are not available in
current software packages.
1.5 OTHER ESTIMATION METHODS
The method of maximum likelihood described in Section 1.2 is the estimation
method used in the logistic regression routines of the major software packages.
However, two other methods have been and may still be used for estimating the
coefficients. These methods are: (1) noniterative weighted least squares, and (2)
discriminant function analysis.
A linear models approach to the analysis of categorical data proposed by Grizzle
et al. (1969) [Grizzle, Starmer, and Koch (GSK) method] uses estimators based on
noniterative weighted least squares. They demonstrate that the logistic regression
model is an example of a general class of models that can be handled by their meth-
ods. We should add that the maximum likelihood estimators are usually calculated
using an iterative reweighted least squares algorithm, and are also technically “least
squares” estimators. The GSK method requires one iteration and is used in SAS’s
GENMOD procedure to fit a logistic regression model containing only categorical
covariates.
A major limitation of the GSK method is that we must have an estimate of π(x)
that is not zero or 1 for most values of x. An example where we could use both
maximum likelihood and GSK’s noniterative weighted least squares is the data in
Table 1.2. In cases such as this, the two methods are asymptotically equivalent,
meaning that as n gets large, the distributional properties of the two estimators
become identical. The GSK method could not be used with the data in Table 1.1.
The discriminant function approach to estimation of the coefficients is of histor-
ical importance as it was popularized by Cornfield (1962) in some of the earliest
work on logistic regression. These estimators take their name from the fact that the
posterior probability in the usual discriminant function model is the logistic regres-
sion function given in equation (1.1). More precisely, if the independent variable,
X, follows a normal distribution within each of two groups (subpopulations) defined
by the two values of Y and has different means and the same variance, then the
conditional distribution of Y given X = x is the logistic regression model. That is,
if
\[ X \mid Y \sim N(\mu_j, \sigma^2), \quad j = 0, 1, \]
then P(Y = 1|x) = π(x). The symbol “∼” is read “is distributed” and the
“N(μ, σ2)” denotes the normal distribution with mean equal to μ and variance
equal to σ2. Under these assumptions it is easy to show [Lachenbruch (1975)]
that the logistic coefficients are
\[ \beta_0 = \ln\left(\frac{\theta_1}{\theta_0}\right) - 0.5(\mu_1^2 - \mu_0^2)/\sigma^2 \quad (1.23) \]
and
\[ \beta_1 = (\mu_1 - \mu_0)/\sigma^2, \quad (1.24) \]
where θj = P(Y = j), j = 0, 1. The discriminant function estimators of β0 and β1
are obtained by substituting estimators for μj, θj, j = 0, 1, and σ² into the above
equations. The estimators usually used are μ̂j = x̄j, the mean of x in the subgroup
defined by y = j, j = 0, 1, θ̂1 = n1/n, the mean of y, with θ̂0 = 1 − θ̂1, and
\[ \hat{\sigma}^2 = \frac{(n_0 - 1)s_0^2 + (n_1 - 1)s_1^2}{n_0 + n_1 - 2}, \]
where s₀² and s₁² are the unbiased estimators of σ² computed within the subgroups of the data defined by y = 0 and y = 1, respectively. The above expressions are for a single variable x and
multivariable expressions are presented in Chapter 2.
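Under the stated normality assumptions, equations (1.23) and (1.24) amount to plugging sample quantities into closed forms. A sketch for the single-covariate case (function name ours):

```python
import numpy as np

def discriminant_estimates(x, y):
    """Discriminant function estimators of (beta0, beta1),
    equations (1.23) and (1.24), single covariate case."""
    x0, x1 = x[y == 0], x[y == 1]
    n0, n1 = len(x0), len(x1)
    mu0, mu1 = x0.mean(), x1.mean()
    # pooled unbiased variance estimate
    s2 = ((n0 - 1) * x0.var(ddof=1) + (n1 - 1) * x1.var(ddof=1)) / (n0 + n1 - 2)
    theta1 = n1 / (n0 + n1)
    b1 = (mu1 - mu0) / s2
    b0 = np.log(theta1 / (1.0 - theta1)) - 0.5 * (mu1**2 - mu0**2) / s2
    return float(b0), float(b1)
```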
It is natural to ask why, if the discriminant function estimators are so easy to
compute, they are not used in place of the maximum likelihood estimators? Halpern
et al. (1971) and Hosmer et al. (1983) compared the two methods when the model
contains a mixture of continuous and discrete variables, with the general conclusion
that the discriminant function estimators are sensitive to the assumption of normal-
ity. In particular, the estimators of the coefficients for non-normally distributed vari-
ables are biased away from zero when the coefficient is, in fact, different from zero.
The practical implication of this is that for dichotomous independent variables (that
40. 22 introduction to the logistic regression model
occur in many situations), the discriminant function estimators overestimate the
magnitude of the coefficient. Lyles et al. (2009) describe a clever linear regression-
based approach to compute the discriminant function estimator of the coefficient
for a single continuous variable that, when their assumptions of normality hold,
has better statistical properties than the maximum likelihood estimator. We discuss
their multivariable extension and some of its practical limitations in Chapter 2.
At this point it may be helpful to delineate more carefully the various uses
of the term maximum likelihood, as it applies to the estimation of the logistic
regression coefficients. Under the assumptions of the discriminant function model
stated above, the estimators obtained from equations (1.23) and (1.24) are maximum
likelihood estimators. The estimators obtained from equations (1.5) and (1.6) are
based on the conditional distribution of Y given X and, as such, are technically
“conditional maximum likelihood estimators”. It is common practice to drop the
word “conditional” when describing the estimators given in equations (1.5) and
(1.6). In this text, we use the word conditional to describe estimators in logistic
regression with matched data as discussed in Chapter 7.
In summary, there are alternative methods of estimation for some data configurations
that are computationally quicker; however, we use the maximum likelihood
method described in Section 1.2 throughout the rest of this text.
1.6 DATA SETS USED IN EXAMPLES AND EXERCISES
A number of different data sets are used in the examples as well as the exercises
for the purpose of demonstrating various aspects of logistic regression modeling.
Six of the data sets used throughout the text are described below. Other data sets
are introduced as needed in later chapters. Some of the data sets were used in
the previous editions of this text, for example the ICU and Low Birth Weight
data, while others are new to this edition. All data sets used in this text may be
obtained from links to web sites at John Wiley & Sons, Inc. and the University of
Massachusetts given in the Preface.
1.6.1 The ICU Study
The ICU study data set consists of a sample of 200 subjects who were part of a
much larger study on survival of patients following admission to an adult intensive
care unit (ICU). The major goal of this study was to develop a logistic regression
model to predict the probability of survival to hospital discharge of these patients.
A number of publications have appeared that have focused on various facets of
this problem. The reader wishing to learn more about the clinical aspects of this
study should start with Lemeshow et al. (1988). For a more up-to-date discussion
of modeling the outcome of ICU patients the reader is referred to Lemeshow and
Le Gall (1994) and to Lemeshow et al. (1993). The actual observed variable values
have been modified to protect subject confidentiality. A code sheet for the variables
to be considered in this text is given in Table 1.5. We refer to this data set as the
ICU data.
Table 1.5 Code Sheet for the Variables in the ICU Data
Variable Description Codes/Values Name
1 Identification code ID number ID
2 Vital status at hospital discharge 1 = Lived
0 = Died
STA
3 Age Years AGE
4 Gender 0 = Male
1 = Female
GENDER
5 Race 1 = White
2 = Black
3 = Other
RACE
6 Service at ICU admission 0 = Medical
1 = Surgical
SER
7 Cancer part of present problem 0 = No
1 = Yes
CAN
8 History of chronic renal failure 0 = No
1 = Yes
CRN
9 Infection probable at ICU
admission
0 = No
1 = Yes
INF
10 CPR prior to ICU admission 0 = No
1 = Yes
CPR
11 Systolic blood pressure at ICU
admission
mm Hg SYS
12 Heart rate at ICU admission Beats/min HRA
13 Previous admission to an ICU
within 6 months
0 = No
1 = Yes
PRE
14 Type of admission 0 = Elective
1 = Emergency
TYPE
15 Long bone, multiple, neck, single
area, or hip fracture
0 = No
1 = Yes
FRA
16 PO2 from initial blood gases 0 = >60
1 = ≤60
PO2
17 PH from initial blood gases 0 = ≥7.25
1 = <7.25
PH
18 PCO2 from initial blood gases 0 = ≤45
1 = >45
PCO
19 Bicarbonate from initial blood
gases
0 = ≥18
1 = <18
BIC
20 Creatinine from initial blood gases 0 = ≤2.0
1 = >2.0
CRE
21 Level of consciousness at ICU
admission
0 = No coma or
deep stupor
1 = Deep stupor
2 = Coma
LOC
Table 1.6 Code Sheet for the Variables in the Low Birth Weight Data
Variable Description Codes/Values Name
1 Identification code 1–189 ID
2 Low birth weight 0 = ≥2500 g
1 = <2500 g
LOW
3 Age of mother Years AGE
4 Weight of mother at last menstrual period Pounds LWT
5 Race 1 = White
2 = Black
3 = Other
RACE
6 Smoking status during pregnancy 0 = No
1 = Yes
SMOKE
7 History of premature labor 0 = None
1 = One
2 = Two, etc.
PTL
8 History of hypertension 0 = No
1 = Yes
HT
9 Presence of uterine irritability 0 = No
1 = Yes
UI
10 Number of physician visits during the first
trimester
0 = None
1 = One
2 = Two, etc.
FTV
11 Recorded birth weight Grams BWT
1.6.2 The Low Birth Weight Study
Low birth weight, defined as birth weight less than 2500 grams, is an outcome
that has been of concern to physicians for years. This is because infant mortality
rates and birth defect rates are higher for low birth weight babies.
A woman’s behavior during pregnancy (including diet, smoking habits, and receiv-
ing prenatal care) can greatly alter the chances of carrying the baby to term, and,
consequently, of delivering a baby of normal birth weight.
Data were collected as part of a larger study at Baystate Medical Center in
Springfield, Massachusetts. This data set contains information on 189 births to
women seen in the obstetrics clinic. Fifty-nine of these births were low birth weight.
The variables identified in the code sheet given in Table 1.6 have been shown
to be associated with low birth weight in the obstetrical literature. The goal of
the current study was to determine whether these variables were risk factors in
the clinic population being served by Baystate Medical Center. Actual observed
variable values have been modified to protect subject confidentiality. We refer to
this data set as the LOWBWT data.
1.6.3 The Global Longitudinal Study of Osteoporosis in Women
The Global Longitudinal Study of Osteoporosis in Women (GLOW) is an interna-
tional study of osteoporosis in women over 55 years of age being coordinated at the
Table 1.7 Code Sheet for Variables in the GLOW Study
Variable Description Codes/Values Name
1 Identification code 1–n SUB_ID
2 Study site 1–6 SITE_ID
3 Physician ID code 128 unique codes PHY_ID
4 History of prior fracture 1 = Yes
0 = No
PRIORFRAC
5 Age at enrollment Years AGE
6 Weight at enrollment Kilograms WEIGHT
7 Height at enrollment Centimeters HEIGHT
8 Body mass index kg/m2
BMI
9 Menopause before age 45 1 = Yes
0 = No
PREMENO
10 Mother had hip fracture 1 = Yes
0 = No
MOMFRAC
11 Arms are needed to stand from
a chair
1 = Yes
0 = No
ARMASSIST
12 Former or current smoker 1 = Yes
0 = No
SMOKE
13 Self-reported risk of fracture 1 = Less than others of the
same age
2 = Same as others of the same
age
3 = Greater than others of the
same age
RATERISK
14 Fracture risk score Composite risk scorea FRACSCORE
15 Any fracture in first year 1 = Yes
0 = No
FRACTURE
aFRACSCORE = 0 × (AGE ≤ 60) + 1 × (60 < AGE ≤ 65) + 2 × (65 < AGE ≤ 70) + 3 × (70 <
AGE ≤ 75) + 4 × (75 < AGE ≤ 80) + 5 × (80 < AGE ≤ 85) + 6 × (AGE > 85) + (PRIORFRAC
= 1) + (MOMFRAC = 1) + (WEIGHT < 56.8) + 2 × (ARMASSIST = 1) + (SMOKE = 1).
Center for Outcomes Research (COR) at the University of Massachusetts/Worcester
by its Director, Dr. Frederick Anderson, Jr. The study has enrolled over 60,000
women aged 55 and older in ten countries. The major goals of the study are to use
the data to provide insights into the management of fracture risk, patient experience
with prevention and treatment of fractures and distribution of risk factors among
older women on an international scale over the follow up period. Complete details
on the study as well as a list of GLOW publications may be found at the Center
for Outcomes Research web site, www.outcomes-umassmed.org/glow.
Data used here come from six sites in the United States and include a few
selected potential risk factors for fracture from the baseline questionnaire. The
outcome variable is any fracture in the first year of follow up. The incident first-
year fracture rate among the 21,000 subjects enrolled in these six sites is about 4
percent. In order to have a data set of a manageable size, n = 500, for this text
we have oversampled the fractures and undersampled the non-fractures. As a
result associations and conclusions from modeling these data do not apply to the
study cohort as a whole. Data have been modified to protect subject confidentiality.
We thank Dr. Gordon Fitzgerald of COR for his help in obtaining these data sets.
A code sheet for the variables is shown in Table 1.7. This data set is named the
GLOW500 data.
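The FRACSCORE footnote in Table 1.7 translates directly into code. The sketch below (Python; the function name and argument conventions are ours, not the study's, and binary inputs follow the 1 = yes / 0 = no coding of the code sheet) scores a single subject:

```python
def frac_score(age, priorfrac, momfrac, weight, armassist, smoke):
    """Composite fracture risk score from the Table 1.7 footnote.
    Binary inputs are coded 1 = yes / 0 = no; weight is in kilograms."""
    # Age contributes 0-6 points, one per band boundary exceeded:
    # (AGE <= 60) -> 0, (60 < AGE <= 65) -> 1, ..., (AGE > 85) -> 6
    age_points = sum(age > cut for cut in (60, 65, 70, 75, 80, 85))
    return (age_points
            + (priorfrac == 1)
            + (momfrac == 1)
            + (weight < 56.8)
            + 2 * (armassist == 1)
            + (smoke == 1))
```

For example, a 72-year-old weighing 55 kg with a prior fracture who needs her arms to stand scores 3 + 1 + 1 + 2 = 7.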
1.6.4 The Adolescent Placement Study
Fontanella et al. (2008) present results from a study of determinants of aftercare
placement for psychiatrically hospitalized adolescents and have made the data, suit-
ably modified to protect confidentiality, available to us. It is not our intent to repeat
Table 1.8 Code Sheet for Variables in the Adolescent Placement Study
Variable Description Codes/Values Name
1 Identification code 1–508 ID
2 Placement 0 = Outpatient
1 = Day treatment
2 = Intermediate residential
3 = Residential
PLACE
3 Placement combined 0 = Outpatient or day treatment
1 = Intermediate residential
2 = Residential
PLACE3
4 Age at admission Years AGE
5 Race 0 = White
1 = Nonwhite
RACE
6 Gender 0 = Female
1 = Male
GENDER
7 Neuropsychiatric disturbance 0 = None
1 = Mild
2 = Moderate
3 = Severe
NEURO
8 Emotional disturbance 0 = Not severe
1 = Severe
EMOT
9 Danger to others 0 = Unlikely
1 = Possible
2 = Probable
3 = Likely
DANGER
10 Elopement risk 0 = No risk
1 = At risk
ELOPE
11 Length of hospitalization Days LOS
12 Behavioral symptoms scorea 0–9 BEHAV
13 State custody 0 = No
1 = Yes
CUSTD
14 History of violence 0 = No
1 = Yes
VIOL
aBehavioral symptom score is based on the sum of three symptom subscales (oppositional behavior,
impulsivity, and conduct disorder) from the CSPI.
the detailed analyses reported in their paper, but rather to use the data to motivate
and describe methods for modeling a multinomial or ordinal scaled outcome using
logistic regression models. As such, we selected a subset of variables, which are
described in Table 1.8. This data set is referred to as the APS data.
1.6.5 The Burn Injury Study
The April 2008 release (Version 4.0) of the National Burn Repository research
dataset (National Burn Repository 2007 Report, Dataset Version 4.0 accessed on
12/05/2008 at: http://www.ameriburn.org/2007NBRAnnualReport.pdf) includes
information on a total of 306,304 burn related hospitalizations that occurred
between 1973 and 2007. Available information included patient demographics,
total burn surface area, presence of inhalation injury, and blinded trauma center
identifiers. The outcome of interest is survival to hospital discharge. Osler et al.
(2010) selected a subset of approximately 40,000 subjects treated between 2000
and 2007 at 40 different burn facilities to develop a new predictive logistic
regression model (see the paper for the details on how this subset was selected).
To obtain a much smaller data set for use in this text, we oversampled subjects
who died in hospital and undersampled subjects who lived, yielding a data set
with n = 1000 and 15 percent in-hospital mortality. As
such, all analyses and inferences contained in this text do not apply to the sample
of 40,000, the original data from the registry or the population of burn injury
patients as a whole. These data are used here to illustrate methods when prediction
is the final goal as well as to demonstrate various model building techniques. The
variables are described in Table 1.9 and the data are referred to as the BURN1000
data.
Table 1.9 Code Sheet for Variables in the Burn Study
Variable Description Codes/Values Name
1 Identification code 1–1000 ID
2 Burn facility 1–40 FACILITY
3 Hospital discharge status 0 = Alive
1 = Dead
DEATH
4 Age at admission Years AGE
5 Gender 0 = Female
1 = Male
GENDER
6 Race 0 = Non-White
1 = White
RACE
7 Total burn surface area 0–100% TBSA
8 Burn involved inhalation injury 0 = No
1 = Yes
INH_INJ
9 Flame involved in burn injury 0 = No
1 = Yes
FLAME
Table 1.10 Code Sheet for Variables in the Myopia Study
Variable Variable Description Values/Labels Variable Name
1 Subject identifier Integer (range 1–1503) ID
2 Year subject entered the study Year STUDYYEAR
3 Myopia within the first 5 yr of
follow upa
0 = No
1 = Yes
MYOPIC
4 Age at first visit Years AGE
5 Gender 0 = Male
1 = Female
GENDER
6 Spherical equivalent refractionb
Diopter SPHEQ
7 Axial lengthc
mm AL
8 Anterior chamber depthd mm ACD
9 Lens thicknesse
mm LT
10 Vitreous chamber depthf
mm VCD
11 How many hours per week
outside of school the child
spent engaging in
sports/outdoor activities
Hours per week SPORTHR
12 How many hours per week
outside of school the child
spent reading for pleasure
Hours per week READHR
13 How many hours per week
outside of school the child
spent playing video/computer
games or working on the
computer
Hours per week COMPHR
14 How many hours per week
outside of school the child
spent reading or studying for
school assignments
Hours per week STUDYHR
15 How many hours per week
outside of school the child
spent watching television
Hours per week TVHR
16 Composite of near-work
activities
Hours per week DIOPTERHR
17 Was the subject’s mother
myopic?g
0 = No
1 = Yes
MOMMY
18 Was the subject’s father
myopic?
0 = No
1 = Yes
DADMY
aMYOPIC is defined as SPHEQ ≤ −0.75D.
bA measure of the eye’s effective focusing power. Eyes that are “normal” (don’t require glasses or
contact lenses) have spherical equivalents between −0.25 diopters (D) and +1.00 D. The more negative
the spherical equivalent, the more myopic the subject.
cThe length of eye from front to back.
d The length from front to back of the aqueous-containing space of the eye between the cornea and the
iris.
eThe length from front to back of the crystalline lens.
f The length from front to back of the aqueous-containing space of the eye in front of the retina.
gDIOPTERHR = 3 × (READHR + STUDYHR) + 2 × COMPHR + TVHR.
Table 1.11 Variables in the Modified NHANES Data Set
Variable Description Code/values Name
1 Identification code 1–6482 ID
2 Gender 0 = Male,
1 = Female
GENDER
3 Age at screening Years AGE
4 Marital status 1 = Married
2 = Widowed
3 = Divorced
4 = Separated
5 = Never married
6 = Living together
MARSTAT
5 Statistical weight 4084.478–153810.3 SAMPLEWT
6 Pseudo-PSU 1, 2 PSU
7 Pseudo-stratum 1–15 STRATA
8 Total cholesterol mg/dl TCHOL
9 HDL-cholesterol mg/dl HDL
10 Systolic blood pressure mm Hg SYSBP
11 Diastolic blood pressure mm Hg DBP
12 Weight kg WT
13 Standing height cm HT
14 Body mass index kg/m2
BMI
15 Vigorous work activity 0 = Yes,
1 = No
VIGWRK
16 Moderate work activity 0 = Yes,
1 = No
MODWRK
17 Walk or bicycle 0 = Yes,
1 = No
WLKBIK
18 Vigorous recreational activities 0 = Yes,
1 = No
VIGRECEXR
19 Moderate recreational activities 0 = Yes,
1 = No
MODRECEXR
20 Minutes of sedentary activity
per week
Minutes SEDMIN
21 BMI > 35 0 = No,
1 = Yes
OBESE
1.6.6 The Myopia Study
Myopia, more commonly referred to as nearsightedness, is an eye condition in which
an individual has difficulty seeing things at a distance. The condition arises primarily
because the eyeball is too long. In an eye that sees normally, the image of what is
being viewed is transmitted to the back portion of the eye, or retina, and hits the
retina to form a clear picture. In the myopic eye, the image focuses in front of the
retina, so the resultant image on the retina itself is blurry. The blurry image creates
problems with a variety of distance viewing tasks (e.g., reading the blackboard,
Table 1.12 Code Sheet for the Variables in the Polypharmacy Data Set
Variable Description Codes/Values Name
1 Subject ID ID number 1–500 ID
2 Outcome; taking drugs from
more than three different
classes
0 = Not taking drugs
from more than three
classes
1 = Taking drugs from
more than three classes
POLYPHARMACY
3 Number of outpatient mental
health visits (MHV)
0 = None
1 = One to five
2 = Six to fourteen
3 = Greater than 14
MHV4
4 Number of inpatient mental
health visits (MHV)
0 = None
1 = One
2 = More than one
INPTMHV3
5 Year 2002–2008 YEAR
6 Group 1 = Covered families and
children (CFC)
2 = Aged, blind or
disabled (ABD)
3 = Foster care (FOS)
GROUP
7 Location 0 = Urban
1 = Rural
URBAN
8 Comorbidity 0 = No
1 = Yes
COMORBID
9 Any primary diagnosis (bipolar,
depression, etc.)
0 = No
1 = Yes
ANYPRIM
10 Number of primary diagnosis 0 = None
1 = One
2 = More than one
NUMPRIMRC
11 Gender 0 = Female
1 = Male
GENDER
12 Race 0 = White
1 = Black
2 = Other
RACE
13 Ethnic category 0 = NonHispanic
1 = Hispanic
ETHNIC
14 Age Years and months (two
decimal places)
AGE
doing homework, driving, playing sports) and requires wearing glasses or contact
lenses to correct the problem. Myopia onset is typically between the ages of 8 and
12 years with cessation of the underlying eye growth that causes it by age 15–16
years.
The risk factors for the development of myopia have been debated for a long
time and include genetic factors (e.g., family history of myopia) and the amount
and type of visual activity that a child performs (e.g., studying, reading, TV watch-
ing, computer or video game playing, and sports/outdoor activity). There is strong
evidence that having myopic parents increases the chance that a child will become
myopic, and weaker evidence that certain types of visual activities (called near
work, e.g., reading) increase the chance that a child will become myopic.
These data are a subset of data from the Orinda Longitudinal Study of Myopia
(OLSM), a cohort study of ocular component development and risk factors for
the onset of myopia in children, which evolved into the Collaborative Longitudinal
Evaluation of Ethnicity and Refractive Error (CLEERE) Study, and both OLSM and
CLEERE were funded by the National Institutes of Health/National Eye Institute.
OLSM was based at the University of California, Berkeley [see Zadnik et al. (1993,
1994)]. Data collection began in the 1989–1990 school year and continued annually
through the 2000–2001 school year. All data about the parts that make up the eye
(the ocular components) were collected during an examination during the school
day. Data on family history and visual activities were collected yearly in a survey
completed by a parent or guardian.
The dataset used in this text is from 618 of the subjects who had at least five
years of follow-up and were not myopic when they entered the study. All data are
from their initial exam and include 17 variables. In addition to the ocular data
there is information on age at entry, year of entry, family history of myopia, and
hours of various visual activities. The ocular data come from a subject’s right eye.
A subject was coded as myopic if they became myopic at any time during the first
five years of follow-up. We refer to this data set, in Table 1.10, as the MYOPIA data.
1.6.7 The NHANES Study
The National Health and Nutrition Examination Survey (NHANES), a major effort
of the National Center for Health Statistics, was conceived in the early 1960s to
provide nationally representative and reliable data on the health and nutritional
status of adults and children in the United States. NHANES has since evolved
into an ongoing survey program that provides the best available national estimates
of the prevalence of, and risk factors for, targeted diseases in the United States
population. The survey collects interview and physical exam data on a nationally
representative, multistage probability sample of about 5,000 persons each year, who
are chosen to be representative of the civilian, non-institutionalized, population in
the US.
For purposes of illustrating fitting logistic regression models to sample sur-
vey data in Section 6.4 we chose selected variables, shown in Table 1.11, from
the 2009–2010 cycle of the National Health and Nutrition Examination Study
[NHANES III Reference Manuals and Reports (2012)] and made some modifica-
tions to the data. We refer to this data set as the NHANES data.
1.6.8 The Polypharmacy Study
In Chapter 9, we illustrate model building with correlated data using data on
polypharmacy described in Table 1.12. The outcome of interest is whether the
patient is taking drugs from three or more different classes (POLYPHARMACY),
and researchers were interested in identifying factors associated with this outcome.
We selected a sample of 500 subjects from among only those subjects with
observations in each of the seven years in which data were collected. Based on the suggestions of
the principal investigator, we initially treated the covariates for number of inpatient
and outpatient mental health visits (MHVs) with categories described in Table 1.12.
In addition we added a random number of months to the age, which was recorded
only in terms of the year in the original data set. As our data set is a sample, the
results in this section do not apply to the original study. We refer to this data set
as the POLYPHARM data.
EXERCISES
1. In the ICU data described in Section 1.6.1 the primary outcome variable is
vital status at hospital discharge, STA. Clinicians associated with the study felt
that a key determinant of survival was the patient’s age at admission, AGE.
(a) Write down the equation for the logistic regression model of STA on
AGE. Write down the equation for the logit transformation of this logistic
regression model. What characteristic of the outcome variable, STA, leads
us to consider the logistic regression model as opposed to the usual linear
regression model to describe the relationship between STA and AGE?
(b) Form a scatterplot of STA versus AGE.
(c) Using the intervals (15, 24), (25, 34), (35, 44), (45, 54), (55, 64), (65, 74),
(75, 84), (85, 94) for age, compute the STA mean over subjects within
each age interval. Plot these values of mean STA versus the midpoint of
the age interval using the same set of axes as was used in 1(b). Note: this
plot may be done “by hand” on a printed copy of the plot from 1(b).
(d) Write down an expression for the likelihood and log-likelihood for the
logistic regression model in Exercise 1(a) using the ungrouped, n = 200,
data. Obtain expressions for the two likelihood equations.
(e) Using a logistic regression package of your choice obtain the maximum
likelihood estimates of the parameters of the logistic regression model in
Exercise 1(a). These estimates should be based on the ungrouped, n = 200,
data. Using these estimates, write down the equation for the fitted values,
that is, the estimated logistic probabilities. Plot the equation for the fitted
values on the axes used in the scatterplots in 1(b) and 1(c).
(f) Using the results of the output from the logistic regression package used
for 1(e), assess the significance of the slope coefficient for AGE using the
likelihood ratio test, the Wald test, and if possible, the score test. What
assumptions are needed for the p-values computed for each of these tests
to be valid? Are the results of these tests consistent with one another?
What is the value of the deviance for the fitted model?
(g) Using the results from 1(e) compute 95 percent confidence intervals for the
slope coefficient for AGE. Write a sentence interpreting this confidence interval.
(h) Obtain from the package used to fit the model in 1(e) the estimated covari-
ance matrix. Compute the logit and estimated logistic probability for a
60-year-old subject. Evaluate the endpoints of the 95 percent confidence
intervals for the logit and estimated logistic probability. Write a sentence
interpreting the estimated probability and its confidence interval.
2. In the Myopia Study described in Section 1.6.6, one variable that is clearly
important is the initial value of spherical equivalent refraction (SPHEQ).
Repeat steps (a)–(g) of Exercise 1, but for 2(c) use eight intervals containing
approximately equal numbers of subjects (i.e., cut points at 12.5%, 25%, . . . ,
etc.).
3. Using the data from the ICU study create a dichotomous variable NONWHITE
(NONWHITE = 1 if RACE = 2 or 3 and NONWHITE = 0 if RACE = 1).
Fit the logistic regression of STA on NONWHITE and show that the 95 per-
cent profile likelihood confidence interval for the coefficient for nonwhite has
asymmetry of −13% and that this interval is 26% wider than the Wald-based
interval. This example points out that even when the sample size and number
of events are large (n = 200 and n1 = 40), there can be substantial asymmetry
and differences between the two interval estimators. Explain why this is the
case in this example.
53. 36 the multiple logistic regression model
where, for the multiple logistic regression model,

π(x) = e^g(x) / (1 + e^g(x)).   (2.2)
If some of the independent variables are discrete, nominal scale variables such
as race, sex, treatment group, and so forth, it is inappropriate to include them in
the model as if they were interval scale variables. The numbers used to represent
the various levels of these nominal scale variables are merely identifiers, and have
no numeric significance. In this situation, the method of choice is to use a col-
lection of design variables (or dummy variables). Suppose, for example, that one
of the independent variables is race, which has been coded as “white,” “black,”
and “other.” In this case, two design variables are necessary. One possible coding
strategy is that when the respondent is “white,” the two design variables, D1 and
D2, would both be set equal to zero; when the respondent is “black,” D1 would
be set equal to 1 while D2 would still equal 0; when the race of the respondent is
“other,” we would use D1 = 0 and D2 = 1. Table 2.1 illustrates this coding of the
design variables.
Every logistic regression software package we use has the capability to generate
design variables, and some provide a choice of several different methods. We
discuss different strategies for creation and interpretation of the coefficients for the
design variables in detail in Chapter 3.
In general, if a nominal scaled variable has k possible values, then k − 1 design
variables are needed. The reason for using one less than the number of values
is that, unless stated otherwise, our models have a constant term. To illustrate the
notation used for design variables in this text, suppose that the jth independent
variable xj has kj levels. The kj − 1 design variables will be denoted as Djl and the
coefficients for these design variables will be denoted as βjl , l = 1, 2, . . . , kj − 1.
Thus, the logit for a model with p variables, with the jth variable being
discrete is
g(x) = β0 + β1x1 + · · · + Σ_{l=1}^{kj−1} βjl Djl + βpxp.
With a few exceptions, we suppress the summation and double subscripting
needed to indicate when design variables are being used when discussing the
multiple logistic regression model.
Table 2.1 An Example of the Coding of the
Design Variables for Race, Coded at Three
Levels
RACE D1 D2
White 0 0
Black 1 0
Other 0 1
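The Table 2.1 coding can be sketched in a few lines of Python (the level list and function name below are ours; statistical packages generate the same design variables automatically, as noted above):

```python
# Reference-cell coding of RACE (three levels) into two design variables,
# with "White" as the reference level, mirroring Table 2.1.
LEVELS = ["White", "Black", "Other"]

def design_variables(race):
    """Return (D1, D2) for one RACE value; the reference level gets all zeros."""
    d = [0] * (len(LEVELS) - 1)   # k - 1 design variables for k levels
    idx = LEVELS.index(race)
    if idx > 0:
        d[idx - 1] = 1
    return tuple(d)

rows = [design_variables(r) for r in ["White", "Black", "Other"]]
# rows == [(0, 0), (1, 0), (0, 1)], reproducing Table 2.1
```

The same reference-cell result is what, for example, a dummy-variable routine that drops the first category level would produce.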
2.3 FITTING THE MULTIPLE LOGISTIC REGRESSION MODEL
Assume that we have a sample of n independent observations (xi, yi), i =
1, 2, . . . , n. As in the univariable case, fitting the model requires that we obtain
estimates of the vector β′ = (β0, β1, . . . , βp). The method of estimation used in
the multivariable case is the same as in the univariable situation – maximum
likelihood. The likelihood function is nearly identical to that given in equation
(1.3) with the only change being that π(x) is now defined as in equation (2.1).
There will be p + 1 likelihood equations that are obtained by differentiating the
log-likelihood function with respect to the p + 1 coefficients. The likelihood
equations that result may be expressed as follows:
Σ_{i=1}^{n} [yi − π(xi)] = 0

and

Σ_{i=1}^{n} xij[yi − π(xi)] = 0

for j = 1, 2, . . . , p.
As in the univariable model, the solution of the likelihood equations requires
software that is available in virtually every statistical software package. Let β̂
denote the solution to these equations. Thus, the fitted values for the multiple
logistic regression model are π̂(xi), the value of the expression in equation (2.2)
computed using β̂ and xi.
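As an illustration of what such software does internally, the sketch below (Python with NumPy; the simulated data, names, and the choice of Newton–Raphson iteration are ours, presented only as one standard way of solving these equations) fits a model with two covariates and then verifies that the p + 1 likelihood equations hold at the solution:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Solve the p + 1 likelihood equations by Newton-Raphson iteration.
    X is n x (p + 1) with a leading column of ones; y is a 0/1 vector."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))   # pi(x_i), equation (2.2)
        score = X.T @ (y - pi)                 # left-hand sides of the likelihood equations
        W = pi * (1.0 - pi)
        info = X.T @ (X * W[:, None])          # negative Hessian of the log-likelihood
        beta = beta + np.linalg.solve(info, score)
    return beta

# Simulated data: intercept plus one continuous and one dichotomous covariate
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.binomial(1, 0.5, size=n)])
true_beta = np.array([-0.5, 1.0, 0.8])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

beta_hat = fit_logistic(X, y)
pi_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))
# At the solution, each likelihood equation is numerically zero
assert np.max(np.abs(X.T @ (y - pi_hat))) < 1e-6
```

Note that the first likelihood equation forces the fitted values π̂(xi) to sum to the number of observed events, just as in the univariable case.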
In the previous chapter only a brief mention was made of the method for estimat-
ing the standard errors of the estimated coefficients. Now that the logistic regression
model has been generalized, both in concept and notation to the multivariable case,
we consider estimation of standard errors in more detail.
The method of estimating the variances and covariances of the estimated coeffi-
cients follows from well-developed theory of maximum likelihood estimation [see,
e.g., Rao, (1973)]. This theory states that the estimators are obtained from the
matrix of second partial derivatives of the log-likelihood function. These partial
derivatives have the following general form
∂2
L(β)
∂β2
j
= −
n
i=1
x2
ij πi(1 − πi) (2.3)
and
∂2
L(β)
∂βj ∂βl
= −
n
i=1
xij xil πi(1 − πi) (2.4)
for j, l = 0, 1, 2, . . . , p, where πi denotes π(xi). Let the (p + 1) × (p + 1) matrix
containing the negative of the terms given in equations (2.3) and (2.4) be denoted
as I(β). This matrix is called the observed information matrix. The variances and
covariances of the estimated coefficients are obtained from the inverse of this matrix.
56. of your voice, and if in the country take him out on summer
evenings with a leading-rein and a pocketful of carrots. You will not
have much difficulty after a while, and it is quite worth the trouble,
even if you are disposed to think it such, which I never did.
There used long ago to be certain counties celebrated for good
hunters. Ireland was, and is, justly famous, both for breeding and
training youngsters of a style fit for any hunting-field: but posts,
telegraphs, and telephones have placed us far more on a level than
we used to be, and I don’t believe that there is now anything like
the advantage enjoyed by our fathers and grandfathers in
purchasing direct from a breeder.
I may wind up by saying that no horse is worthy of being called a
hunter that cannot be turned in a very small circle, that jumps with
his hind-legs stretched out behind him, or that won’t at all events
attempt any fence at which his owner may wish to turn him.
“COME ALONG, OLD MAN!”
page 216.
58. CHAPTER XIX.
IN THE HUNTING-FIELD.
A very tempting title truly, but before we can get there we must say
a word about the preparation for it, and also about the journey to be
taken to reach the desired goal.
To prepare, therefore, you should look first to your horse; you must
get him into good hard-fed condition some time before the opening
of the season, and either exercise him regularly yourself, or get
somebody to do it for you. The subject of feeding I hope to discuss
in a forthcoming chapter—as also that of shoeing, which is extremely
important. I may here say, however, that my system of feeding
hunters is in many points so widely different to that of others that I
shall not undertake to advocate it openly, but shall merely state that
I have found it answer most admirably in my own stable, and that
many private friends to whom I have recommended it have
endorsed my opinion of its excellence. The only portion of it to which
I shall in this chapter refer, is concerning the times at which I think
the meals ought to be given. I advise that hunters be accustomed all
the year round to do without a heavy midday meal: this practice to
be adhered to during the summer months, as well as in winter; in
fact, whether the animal is doing work or not. A good substantial
feed at 7 A.M., and another twelve hours later, with one of hay only
(but plenty of it, and of the best) at noon, is all the food that need
be given. Horses fed thus do not, on even very long days, miss
anything except their midday repast, whereas, if accustomed to a
solid feed of corn in the middle of the day, the vacuum created by
the want of it must certainly tell upon the animals, and render them
in a great degree unfitted for their tasks.
Let your horse, then, be fed as I have directed, and you will
(confidently speaking) find him quite able and ready for the long
days which are so trying to horses that are not in condition, as well
as to many that are.
Look to his shoeing a day or two before you want to use him, and
when I say “look,” I mean for you to do it yourself, and not merely
inquire of the servant whether it has been done, unless, indeed, he
is one of those treasures who are as rarely to be met with as the
proverbial four-leaved shamrock, or the horse that is a day over six
years old. Grooms will not, as a rule, trouble themselves much about
the shoeing department, except at the most inconvenient times;
when they don’t want you to go out, for instance, it is quite
surprising how quickly they contrive to discover that the horse must
go to the forge. I know all their little tricks perfectly well, and the
length of time, too, that they generally find it necessary to be absent
when that forge business is declared to be a necessity that cannot
be done without; therefore, it will be well to look to it always
yourself, a good bit in advance, in order that you may not in any
wise be taken unprepared.
Give a glance over your hunting-gear also, lest anything should be
astray. It is not at the last moment that such things ought ever to be
looked to. See that your gloves are in good order, and your riding-
breeches perfectly whole—for, remember, there is a great and
constant strain on this particular garment, and it will in consequence
stand in frequent need of repairs. Make certain also that your skirt is
neatly brushed, your hat in perfection, and your whip and spur in
perfect readiness for use.
Having made these preparations, you must turn your thoughts from
necessaries to possible contingencies, and hold yourself in readiness
for such. Procure a small, tidy valise, and in it place a complete
change of warm clothing. You can dispense with fashionable and
costly articles, and put in merely such things as will prove
convenient in the possible event of your being either dyked, or
subjected to such a wetting from above as would render it unsafe
for you to proceed homewards in your riding-habit. Of course, I am
now surmising that you either drive or rail to the hunt, and return
the same way.
If you ride a hack to covert, or jog your hunter at an easy pace, you
will not only find it impossible to carry a change of clothing, but you
will not have any need of such, because nobody ever catches cold,
even from wet clothes, so long as motion and circulation are kept
up; but if you have a long drive homewards after a hard and exciting
day, or a journey (even a short one) to perform by rail, I strongly
advocate the carrying of the valise. It will not prove a source of the
least trouble to you. You can leave it either in your vehicle or at the
railway station, and it is an inconceivable comfort to be able to get
into a dry suit when every stitch that you have on is clinging to your
body, heavy with wet and mud. I advise the labelling of the valise in
plain letters, if it is to be left in any waiting-room. To attend to this
may prevent a good deal of possible confusion. Many ladies think it a
trouble, I know, to carry such things about with them—just as men,
when they go out walking, consider it “a nuisance” to carry an
umbrella or an overcoat, even on the most uncertain and showery
days—paying the penalty, of course, in drenched garments,
rheumatism, and catarrh. The “trouble” in the first instance is very
small; in the second it may be serious.
Having then made all square and ready, we have next to consider in
what way you intend to proceed to covert. If by rail or vehicle, and
that you happen to have friends of an obliging sort living close to the
proposed meet, you may perhaps find them willing to give
accommodation to your mount for the preceding night. If so it will
be very pleasant, both for you and your horse, as the animal will be
as fresh as a daisy to carry you—a cheery thing for both parties. You
must, however, remember that you will be under a very decided
compliment—one which many may not desire to incur—to the friend
who shows you this favour, inasmuch as putting up a horse signifies
either putting up a servant also, or sending a groom to meet the
animal at the station; at all events it entails extra stable duties, and
these must be considered and paid for.
Supposing that you do not send your horse anywhere the night
before, see to it that he gets off betimes in the morning, and, if
going by road, give your servant directions to take him to some
quiet corner or laneway close to the meet, and to wait for you there
until you come. I regard this as a very much better plan than having
him led direct to the meet, and mounting him there in presence of
an assembled crowd. Ladies who like a little bit of show generally
prefer the latter way—but for true comfort, opportunity for
overlooking the general turning-out of your horse, lengthening or
shortening of stirrup-leathers, folding your muffling tidily away
(instead of flinging it anywhere or anyhow into the vehicle), giving
your groom directions where to meet you at the close of the day,
and so forth, commend me to the former.
If a hunter is to be railed, let him go to the station well clothed, and
send extra things along with him for coming home. Winter evenings
are usually chilly, if not downright cold, and are very frequently
damp as well; if, then, a heated animal, with every pore open from
exercise and excitement, is called upon, unprepared, to encounter
these combined atmospheric influences, coughs, catarrhs, rheumatic
affections, and sometimes the more serious evils of inflamed or
congested lungs, are certain to be the results.
Rise early yourself on a hunting morning; have a cold bath, if of a
robust temperament—if not, tepid. Eat a moderate breakfast of
white fish, cutlet, or steak, accompanied by dry toast or biscuit, and
partake of very little liquid. Fill your flask with cold tea: it is more
invigorating than either brandy or wine; and provide a small
sandwich, or a biscuit or two, to put in the pocket of your saddle.
This will be provision enough for the commissariat department.
If you have the luxury of riding a good hack to covert, and that the
distance is not very far—say, from five to eight miles—you will be
certain to enjoy it, and it will put you in fettle for the more serious
business of the day. This again, like the bath, means if you are
strong and hardy: in short, inured to long rides, and not by any
means easily fatigued. If it be not so with you, it will be better to
make arrangements to go by rail, or drive.
Some ladies ride their hunters quite long distances to meets, but as
a rule they are not among the straight-going sort, being satisfied
with seeing the first draw and the burst away over a good line of
country, where the two or three preliminary fences are not such as
to occasion many serious mishaps. I do not think that any lady who
rides even moderately straight ought to hack her hunter for a longer
distance than five or six miles of a good fair road, and the best way
to take him will be at a brisk walk, alternated pretty frequently with
a steady jog-trot, or a hand-gallop on the grass at the side. I do not
at all object to a hunter being allowed to drink a little water before
starting on his journey, although I know that very many disagree
with me on the point; nor do I object to his having a few mouthfuls
in the intervals of hunting; it will refresh him excessively, just as a
small goblet of water would refresh you, although a large one might
overload your stomach, or give you a chill.
On arriving at the meet, keep as quiet and as much in the
background as you possibly can. It is better taste by far than to push
forward in ever so small a degree. Do not trouble yourself with
thinking about your own appearance, be it what it may; in all
probability nobody will be minding you at all. If you are perfectly
well turned-out, feel happy in the consciousness that you are so, but
shun display; if indifferently, console yourself with the reflection that
each man and woman present is occupied in admiring him or herself,
and has neither time nor desire to admire you, or the reverse.
Do not expect that august personage, “the master,” to shake hands
with you, even if acquainted, or to stop and talk. Salute him as he
goes by, but nothing further.
Do not worry the huntsman with questions about the proposed
draws, or anything else. If you know him, salute him, and say a
word or two, if you like, about his hounds, but never expect him to
answer you; his mind is on other matters bent.
Do not indulge in loud talking, or conspicuous laughter, which will be
certain to render you remarkable and bring many eyes upon you. A
quiet, ladylike demeanour will always ensure admirers.
When the order is given to go, and the huntsman moves off in front
with his hounds, contrive to keep as close to him as you can,
without an appearance of “push.” This for the obvious reason that a
fox is very often found the moment (or nearly so) that hounds are
thrown into covert, and if you are on the spot, you may get well
away with the pack; whereas, at the end of a long cavalcade, on a
narrow and difficult roadway, it will be ten to one against your doing
anything better than hunting a stern chase for the remainder of the
run.
It has for long been a vexed question whether or not the hunting-
field is a suitable place for ladies, and I am certainly not going to
discuss it in extenso, especially in a necessarily limited space. One or
two things concerning it I may, however, be permitted to say.
Firstly, that timid ladies, those mounted on badly broken horses, and
others (a large community) who push for first place while in reality
only fitted to take third (in company with wheezy old gentlemen on
fat cobs, farmers on green colts, and the numerous company of
confessed road-riders), are a very decided nuisance in the field; and,
secondly, that ladies who possess courage (by which I do not mean
the effrontery of ignorance and vanity combined), who are
thoroughly well mounted, and who never get in anybody’s way, are,
in my opinion, a charming addition to the delightful pleasures of the
chase. If, then, you want to be considered an acquisition, be
contented—especially if a beginner—to take second place: that is,
not to force a way among the hard-riding lot, or expose yourself to
the numerous perils which really first-flight men and women go out
prepared to encounter—ay, and usually manage to get through
safely, too, if not interfered with or endangered by second and third-
class riders. By-and-by, when you have gained the knowledge and
experience which getting up from the ranks will assuredly bring you,
there will be an extra pleasure in finding yourself not only holding
first place in the most difficult runs, but in knowing that you are
qualified to hold it, and are justified in declining to yield it up to
others who may not have won their spurs.
If, however, you desire to render yourself thoroughly obnoxious to
everybody, you can set about it in this way. Select for your mount
something that is both fidgety and showy, yet utterly “incapable.”
Whenever you attempt a fence keep your horse at it, whether you
have any chance of getting over or not, to the exclusion of half the
field. When you get on fair ground, gallop madly forward and
override the hounds, if you chance by a “fluke” to get near enough
to them to do so. When there is a check, and the pack fails in hitting
off the scent at once, slash at the nearest of them with your
hunting-whip, and tell the animal playfully that it is “a naughty dog
not to hunt better.” Always make a point of crowding at gaps and
gateways, when hounds and field are struggling to get through.
Never fail to effect an intimate acquaintance with the master, and be
sure to call the huntsman, when speaking of him, “Bill Simmonds” or
“Jim Brown,” although “Simmonds” or “Brown” may be quite enough
for other people. Always follow this last-mentioned functionary into
covert, and speak to him all the time that he is anxiously watching
his hounds. Should you happen to view the fox away, swell out your
lungs for a good bellow of Tally ho-oooo! and gallop full tilt at him
before ever a single hound has left covert, which wise proceeding
will be certain to turn him back, and gain for you the blessings of all
genuine lovers of sport. Finally, when the game at length breaks fair,
rush away in advance of everybody else, with your chin to the sky,
and your elbows flapping like the sails of a windmill; and when you
have half-killed your ill-conditioned steed, and frightened the wits
out of a score or two of old squires who have long ago lost their
nerves (together with their appreciation of such “hard riding” as
yours), then pull off, and dose everybody with whom you are
acquainted, for the next week or two, with glowing accounts of the
wonders that you performed on the opening day with the Dashshire
hounds, and the merits, beauties, and achievements of the exquisite
animal that carried you so brilliantly through the first run of the
season. By adopting this mode of proceeding you will be certain to
gain a host of admirers in the field, and will do much toward
disabusing the public mind of the idea (very deeply implanted in it)
that the hunting-field is not a place in which ladies ought to seek for
sport.
Now, in conclusion, allow me in all seriousness to lay down a few
maxims for your instruction. Never go to hunt without a good pilot.
Young lovers are very nice for this purpose, although not always the
safest. I recommend sharp old foxhunters, who know the country,
and who will give you a judicious lead. If you cannot secure a
trustworthy leader, dispense altogether with the services of one, and
cut out a line for yourself, provided that you are mounted on a really
first-class animal, one well up to your weight, and endowed with an
infallible knowledge of where to put his feet. Keep the hounds in
sight if you can, or, at any rate, within hearing, and ride rather wide
of them, to right or left; never in their actual wake. Keep your horse
well in hand all through, that he may not sprawl. Be quick at turning.
Avoid, so far as is possible, deep heavy lands; and if traversing
plough, keep along the headlands rather than pump your steed by
galloping over ridge and furrow, as others frequently do. When
obliged to get through gaps and gates put extra steam on when
coming up to them, in order to be first; and if there is a crowd, hold
your horse hard, and touch him lightly with your spur, that he may
keep up his mettle and be ready to bound into full speed the instant
you get clear of the ruck. If riding a young hot-blooded hunter, it will
as a rule be safer for you to put him at a very big jump than to trust
him in a crowded gangway. While riding hard, never so much as
glance at the remainder of the field. Keep your eyes for your horse
and for the leading hounds, so as to keep exactly with them, and
check the very instant that they do.
Never distress your mount by taking unnecessary jumps. Don’t be a
bit ashamed to make use of a convenient gate if you can get along
with equal quickness that way; it will save your horse, and will
enable you to hold your place much longer in the run; but, at the
same time, never shirk a practicable jump when you want to go
straight.
If riding a kicker, give warning to those in the rear of you to keep
out of the way. Never jump over a fallen horseman; select another
part of the fence to effect your leap. Do not continue to ride a
beaten animal; pull off the moment that he hangs out signals of
distress. When called upon to cross a ford, do so very cautiously,
and if your horse makes a kind of forward plunge, and an attempt at
swimming, throw him the reins at once or he will roll over. Do not on
any account interfere with his mouth at such a time. Keep your left
foot stuck well forward, or, better still, lift the leg right over the
leaping-head, that it may not be struck by the horse’s hind foot—and
at the same time take a firm grip of the up-pommel and the off-side
of the saddle, to avoid being unseated when he makes his second
plunge, which he will do the moment that he recovers his footing.
If the first run of the day be a good one, rest satisfied with it, and
do not attempt another, unless you have a second horse out. If your
mount should chance to lose a shoe, especially a fore one, make at
once for the nearest forge. If one of the hind feet has sustained the
loss, you may continue the run, provided the going is over soft
ground—but when a fore shoe happens to go, pull off without a
moment’s delay. I have always thought it an admirable plan to carry
a shoe, or slipper, slung on (in its neat leather case) cavalry-wise, to
the saddle. This, in case of accident, obviates the necessity of
waiting at the forge while the smith manufactures one—and of
course on arrival at home it can readily be changed for a more
durable foot-protector. I have even known some sage old sportsmen
carry in their pockets a little American hammer and nail-box in one,
and do their own shoeing when they found themselves in difficulties
and at a distance from professional aid.
Dismount when there is a check, if only for an instant; and, when
there is time, shift your saddle an inch back or forward, the first for
preference. This will prove a great refreshment to your mount.
Be uniformly kind and courteous to everybody. If you chance to
distinguish yourself by good riding, or good fortune, make no fuss
about it, or look for adulation. Always carry a yard or two of twine, a
pick, and a few shillings along with you; there may be uses for all.
When riding home, if you do ride, grasp the first opportunity of
getting your horse some warm gruel, and take him through a
shallow ford or pond to wash the mud from his legs and belly. When
you get him to his stable do not allow him to be tormented by
elaborate grooming; see that he is given an abundance of straw to
roll in, and a good bucket of linseed tea to drink; have his ears dried
by pulling them, bandage his legs with flannel, and give him an
abundance of fresh air, which is of far more consequence to an
exhausted hunter than either food or water. I greatly disapprove of
admitting draughts, especially thorough ones—but it is a dire
mistake to cram a horse into a close stable, with every chink stopped
up, and then put a huge quantity of hay and oats before him. Bad
air and improper feeding soon do their work. Some valuable animal
is taken ill, a farrier is sent for, he tries bleeding to stop the terrific
action of the heart, and before morning the horse is dead.
I shall have something useful to say on this and kindred subjects in
my chapter on “Doctoring,” later on.
THOROUGHLY OBNOXIOUS.
CHAPTER XX.
SHOEING.
There are three points concerning this important subject on which I
should like to thoroughly convince my readers. Firstly, that the
theory, sometimes put forward, of dispensing with shoes for horses
that are intended to work in paved cities and over rough roads, is a
fallacious one; secondly, that the shoeing done at ordinary forges is
practically all wrong; and, thirdly, that there is nothing at all
derogatory in going down one’s-self to the blacksmith’s, in company
with the animal to be shod, and not only giving directions about the
way in which it will be most advisable to do it, but standing by to
make certain that it is actually done. Common errors among smiths
are these: cutting down the frog until it cannot possibly come in
contact with the ground; paring the sole, until it is either bedewed
with blood, or so thin that the effort to walk on it causes the horse
to wince; opening the “bars” which join the frog to the outer wall of
the foot; putting on unnecessarily heavy shoes; having a strong
predilection in favour of calkins; rasping down the wall of the foot to
fit the shoe, instead of making the shoe to fit the foot; and removing
too much of the heel horn. These faults proceed, as a rule, more
from ignorance than obstinacy, and it would therefore be a good and
wise thing if every farrier were to be made thoroughly acquainted
with the anatomy of the horse’s foot and leg: he would then
perceive what dire mischief he was in reality doing while pursuing
the ordinary stereotyped course which his father and grandfather
probably followed before him.
To look at this list of errors in review. First of all, the frog should
never be interfered with; to pare it with a knife is ruinous; it ought
to touch the ground instead of being prevented from doing so:
nature intended that it should. It retains the hoof in proper shape at
the heels, prevents the tendency to slip, and in fact acts as the
natural buffer of the foot, giving it strength, security, and elasticity,
while its toughness enables it to travel over the roughest country
without shrinking or pain. It wards off concussion, being surrounded
by lateral cartilages which may be described as yielding sidewalls,
and is the contrivance supplied by nature for preserving the
superimposed structures from injury or passing hurt. Cutting into the
frog is, I am most firmly convinced, one of the chief causes of
thrush, and nothing can more clearly prove this than the fact that
diseased and wasted frogs, and thrushes of long and obstinate
standing, have been known to become completely cured by the
adoption of a proper system of shoeing—one that brought the frog
not only near the ground, but actually on it.
Navicular disease, that terror of every horse-owner, is without doubt
largely induced by improper shoeing, coupled with the pernicious
practice which I am now condemning, of cutting away the frog. This
valuable india-rubber-like substance should be jealously guarded,
and most carefully preserved from injury or waste; a knife ought no
more be allowed to touch it than permitted to penetrate the horse’s
eye; perhaps even with greater care ought it to be preserved, for
whereas some excellent goers have but one eye to see with, an
animal on three legs is of but little use to anybody, except the
knacker, into whose hands he is pretty certain soon to fall.
Second on the list of evils comes the paring away of the sole of the
foot, and so general is this most unwise operation, that grooms
absolutely prepare for it the night before their charges go to the
forge, by stopping their feet with cow-dung, or some other horrible
dirt. The practice is a cruel, useless, and highly deleterious one,
which owners of horses ought not in anywise to encourage or
permit.
Third, is the hideous habit of opening up the heels: which means
making a deep incision into the wall of the foot at the heel, just
where it is bent inward to form the bars. Nothing could possibly be
more injurious or injudicious than this detestable operation. It
weakens the wall of the foot, and occasions what all horse-fanciers
strive to guard against, the evil of contracted heels. The frog, sole,
and bars have each a separate and most important duty to perform,
and are, if unwisely interfered with, rendered absolutely incapable of
contributing to the carrying out of Nature’s exemplary plan.
Fourth, is the custom of putting on shoes that are too clumsy,
weighty, and thick. A thin, light shoe is in every respect preferable,
the lightness of the metal ensuring a firm foothold, while it likewise
brings the foot-proper in closer proximity to the ground.
Fifth, is the strong fancy for calkins,—things which I as strongly
decry, except for heavy draught horses, and for those accustomed to
trust to their assistance for backing weighty loads. Even where such
appendages are acknowledged to be necessary, a toe-piece should
be likewise added to the shoe and the forepart slightly thickened, in
order to ensure an evenness and steadiness of footing, together with
the keeping of the foot in its own natural position. A horse mounted
upon calkins without the addition of the toe-piece must feel quite as
uncomfortable as a vain belle when mounted upon a pair of tapering
high heels.
Another way of preventing injury in the form of contractions from
calkins is, to have the shoes forged of even thickness from heel to
toe, and then to remove a portion of metal from underneath the
quarters. A horse’s real weight is on his toes and heels: nature
shows this by weakening the hoofs at the quarters, and the law of
mechanics illustrates that if the extremities of any powerful
substance are equally and adequately sustained, the absolute body
which forms as it were a bridge over the space, may be trusted
without support.
Sixth, is a terrible evil: namely, employing the rasp to the outer wall
of the foot, in order to bring it down to the size of a shoe that is too
small for it. This cruelty is generally perpetrated by farriers who
consider themselves too hurried, but are in reality too lazy, to
undertake the forging of a properly fitting shoe, and so they lay
hands on one that happens to be lying by them, and having affixed
it, proceed to cut down the foot to its level. The wretchedness of the
animal, when set to walk upon this torturing protector, is precisely
like that which we should suffer were our feet to be crushed into
boots or shoes that were ever so many sizes too small for them. By
this cruel practice the horn of the foot is seriously injured, and
months elapse before it resumes its normal shape and condition.
FOOT MADE TO FIT SHOE.
SHOE MADE TO FIT FOOT.
Removing too much of the heel-horn is the seventh evil with which
we have to deal. This is a very usual practice, and is strongly
calculated to make a temporary cripple of the horse so operated
upon. It ought to be remembered that the ground face of the hoof
should be even, and justly proportioned from toe to heel, and that
the sides of it ought to be of equal depth. There is at times, indeed
very often, an excess of horny growth about the toe, but it is
impossible to lay down any precise rule with reference to the angle
to which the hoof ought to be brought: a competent eye will,
however, judge of it, and will be able to decide whether it is in
conformity with the natural formation and bearing of the limb.
LOW HEEL.
HIGH HEEL.
I have a great fancy for tips—otherwise half-shoes, nailed to the
toes only, and leaving both quarters free. I have known one or two
young horses shod in this way who have travelled quite safely, and
shown wonderfully healthy feet. For racers I particularly approve of
them, and for young light-weight hunters, especially when running in
a grass country. I am aware that there is a prejudice against them,
except for animals that are for awhile thrown up, but it is an entirely
ignorant one, and ought to be discarded. The late Duke of
Wellington was especially fond of tips, and for a long while rode his
horses with no other kind of foot-covering. He was at length,
however, induced to give it up, as he suffered torment from persons
perpetually informing him that his hack had cast a shoe. I have from
time to time been shown an immense variety of india-rubber shoes,
together with other novel kinds too numerous to mention, and to all
of them have found some grave faults. Lyons has, however, lately
produced a new specimen, which has been experimented with upon
French horses in a manner somewhat successful. It is made entirely
of sheep’s horn, and is said to be particularly adapted to such
animals as are known not to be steady-footed when going over
pavements. It is, moreover, excessively light and very durable—two
excellent qualities—and although at present somewhat more
expensive than the ordinary shoe, it will no doubt come down in
price when the novelty wears off, and will in all probability replace
the present style before the world is many years older. For horses
employed in towns it must be peculiarly valuable, as it is said to be
an effectual check against slipping.
FOOT WITH TIP.
Before closing the present chapter, I should like to warn horse-
owners still further against the ordinary uses of the smith’s drawing-
knife, rasp, and heated iron, all of which are, as I have said, most
lamentably abused. By the first, especially, numerous “accidents” are
made to occur. The sole of the foot being all pared away and
exposed close to the earth, induces it to assume a harshness of
texture totally opposed to its natural qualities, which are soft and
yielding—and this change of structure is a fruitful source of corns.
The outer portion of the sole rests upon the web of the shoe; the
coffin-bone descends, and not meeting with any yielding substance
to play upon, the flesh is pressed between the inferior surface of the
bone and the upper surface of the shoe, causing malignant corns.
Again, the educated smith, in order to give what he terms “a better
hold,” drives the fastening nails into the black or outer substance of
the wall of the foot; whereas the untutored Arab preserves his
horses’ feet by permitting the walls to descend about half-an-inch
below the sole, and then driving the nails through this portion of the
hoof. By so doing, he averts the evil consequences of inserting iron
into the brittle substance, and secures at the same time the
resistance and tough qualities of the complex covering of the foot.
While the English smith is labouring to give a tight hold, he is in
reality involving three distinct perils—firstly, pricking the sensitive
foot, should the nail chance to turn a little bit on one side—a thing
which very often happens; secondly, driving a nail too fine, or, in
other words, too near the white horn—the consequence of which is
that it, the nail, turns inward when the horse is worked, causing
lameness to ensue; and, thirdly, to avoid these evils, he points his
nails so far outward that the outer crust cracks, splits, and chips
away, in time occasioning a difficulty about finding any place at all
capable of affording holding properties for the necessary nails.
FOOT WITH FRACTURED HORN. SHOE HANGING PENDULOUS.
It is owing to this evil that riders are so frequently inconvenienced
by their horses’ shoes becoming partially detached from their feet.
The weakest portion of the chipped hoof yields first, the remaining
fastenings follow, the shoe wags, the nails lose their hold—with,
perhaps, the exception of one or two,—when the foot is raised its
covering hangs pendulous from it, and when again put down some
nail still remaining in the shoe pierces the plantar surface of the foot,
or, perhaps, even penetrates the coffin-bone, and prolonged
lameness follows. “This may be, and no doubt is, all very true,” I
fancy I can hear some reader say; “but what on earth am I to do? I
cannot shoe my horses myself, and smiths are so intolerably
conceited.” Just so; they certainly are, and I can entirely sympathise
with you; horse-owners are terribly dependent upon them, ladies in
particular. But I should advise you to do what I myself have found
effectual, namely, take your horses either to a thoroughly competent
farrier (there are, happily, such to be found), or, what I think better
still, to a complete duffer!—one who knows very little about his
trade, and who, being aware of his deficiencies, will be humble
enough to accept your directions, and also willing to act upon them
and thankful for being afforded an opportunity of doing so. I have
heard that railway companies seek for fools to act as pointsmen; by
all means, then, look out for an idiotic smith!
CHAPTER XXI.
FEEDING.
As already stated, I give my own ideas and opinions on this subject,
without any desire to thrust them forward, or the least expectation
of seeing them generally adopted. Old prejudices are hard to get rid
of; grooms are self-willed, obstinate, and ignorant to a degree, and
masters are too yielding, or too indolent to interfere. I therefore
regard it as probable that on many persons the advice contained in
this chapter will be thrown away, while on others—those who are
willing to break new ground—it will, I venture to say, have the
salutary effect of producing improvements in the stable, and
increasing the weight of the purse. By good management, which is
the true secret of all economy, a man, or woman, may keep a pair of
horses for the same yearly outlay that his or her less provident
neighbour will expend on keeping one—while the credit of the stable
will be quite as well, if not better, maintained.
I am most strongly in favour of cooked food, and opposed to the
giving of raw oats in any shape or form. The absurd theory that this
system of feeding is calculated to make horses “soft,” is about as
sensible as that which avers (or would do so) that a man fed upon
cooked rice and well-boiled potatoes, would be less capable of doing
a good day’s work than if compelled to eat the same materials raw.
Animals possessed of even the very best digestions lose a great
portion of the nutriment of their food when given in the ordinary
way—a large quantity of the oats passing through their bodies quite
as whole and unbroken as when swallowed; whereas every grain of
the cooked food is assimilated with the blood, and goes to nourish
the system,—consequently, nothing is lost.
A chief reason for the prejudice against cooked food is that it gives
trouble, and is a “bother” to prepare. This is always the groom’s
excuse; everything is a trouble to him, except thrusting a measure of
hard dry corn, accompanied by a bucket of water, at stated intervals
before his charge, and receiving his wages—at stated intervals also
—for so doing. Were he to understand, when being hired, that to
cook the food would form as much a portion of his business as to
groom and bed the horses, there would probably be very little
grumbling—especially when every convenient appliance would be
found ready to his hand; but the difficulty always lies with the old
and knowing ones—men who have been accustomed all their lives to
do things their own way, and have things just as they pleased.
These, as a rule, resent every innovation, and are only to be dealt
with by persons as knowing and determined as themselves.
Another source of objection is the idea that it will require some
special apparatus—some costly, difficult, complicated contrivance for
carrying out the proposed plan. There never was a greater mistake
made. In my next chapter, which will be entirely devoted to the
subject of stabling, I shall endeavour to show that the only
apparatus necessary is an exceedingly simple one,—certainly not by
any means of either a costly or extravagant nature.
To feed a horse four times a day, on any kind of food, is in my
opinion unnecessary; unless, indeed, he be an extremely delicate
feeder, in which case “little and often” should be the rule; but I
maintain that if fed but thrice he ought to be given as good a
proportion as is ordinarily divided into four. I like to see a hard-
working horse able to eat his five quarterns of mixed oats and
beans, varied with a good mash once or twice a week, and always
on a Saturday night. At the same time I am entirely against placing
an excess of food in the manger at one time; it is much better to
give an animal just what he will finish, than that he should not leave
his manger perfectly clean.
Corn ought to be boiled until every grain is swollen to nearly double
its normal size, and is capable of being bruised between the fingers;
it should then be turned out on big trays and left to cool. To suffer it
to grow quite cold is not only unnecessary, but is scarcely even
advisable; tepid food is much easier of digestion, both in the human
stomach and in that of the horse, than food that has become chilled.
Cold substances, when swallowed, must rise to a temperature of
nearly 100° before the process of digestion can go healthily forward,
and that the food should be a step or two on the road to this degree
of warmth will materially assist the sanitary laws of animal nature.
There is not, at the same time, the very smallest necessity for
administering warm food at all periods when nourishment is given;
on the contrary, a change of diet will be found very beneficial, and
summer feeding ought to differ from that of winter, both in quantity
and temperature. In saying this, however, I do not for a moment
mean to convey that hunters, even when not in use, should ever be
allowed to drop out of condition. I don’t believe they should, unless
completely invalided and not likely to be able to do any work during
the ensuing season. I think they ought to be fed with a proportion of
oats, though somewhat less than in winter time, and be kept in
regular exercise every day. I have already said that I approve of
driving hunters in harness during the off season, and having seen it
tested, I can speak for the efficacy of it.
I have often been asked whether a horse ought to be given the
same quantity of boiled food as of unboiled; in other words, if the
process of cooking occasions the food to swell to twice its natural
size, and so to fill, say, two measures in place of one, ought the two
measures to be given to the horse? My answer is, certainly, if the
animal is a voracious feeder, and is able to make a complete
clearance of all that is in his manger, even after getting the two
measures,—but I do not believe that one horse out of a hundred will
be capable of doing so, or will show the least inclination to make use