Measurement Error
in Nonlinear Models
A Modern Perspective
Second Edition
Monographs on Statistics and Applied Probability 105
MONOGRAPHS ON STATISTICS AND APPLIED PROBABILITY
General Editors
V. Isham, N. Keiding, T. Louis, S. Murphy, R. L. Smith, and H. Tong
1 Stochastic Population Models in Ecology and Epidemiology M.S. Bartlett (1960)
2 Queues D.R. Cox and W.L. Smith (1961)
3 Monte Carlo Methods J.M. Hammersley and D.C. Handscomb (1964)
4 The Statistical Analysis of Series of Events D.R. Cox and P.A.W. Lewis (1966)
5 Population Genetics W.J. Ewens (1969)
6 Probability, Statistics and Time M.S. Bartlett (1975)
7 Statistical Inference S.D. Silvey (1975)
8 The Analysis of Contingency Tables B.S. Everitt (1977)
9 Multivariate Analysis in Behavioural Research A.E. Maxwell (1977)
10 Stochastic Abundance Models S. Engen (1978)
11 Some Basic Theory for Statistical Inference E.J.G. Pitman (1979)
12 Point Processes D.R. Cox and V. Isham (1980)
13 Identification of Outliers D.M. Hawkins (1980)
14 Optimal Design S.D. Silvey (1980)
15 Finite Mixture Distributions B.S. Everitt and D.J. Hand (1981)
16 Classification A.D. Gordon (1981)
17 Distribution-Free Statistical Methods, 2nd edition J.S. Maritz (1995)
18 Residuals and Influence in Regression R.D. Cook and S. Weisberg (1982)
19 Applications of Queueing Theory, 2nd edition G.F. Newell (1982)
20 Risk Theory, 3rd edition R.E. Beard, T. Pentikäinen and E. Pesonen (1984)
21 Analysis of Survival Data D.R. Cox and D. Oakes (1984)
22 An Introduction to Latent Variable Models B.S. Everitt (1984)
23 Bandit Problems D.A. Berry and B. Fristedt (1985)
24 Stochastic Modelling and Control M.H.A. Davis and R. Vinter (1985)
25 The Statistical Analysis of Composition Data J. Aitchison (1986)
26 Density Estimation for Statistics and Data Analysis B.W. Silverman (1986)
27 Regression Analysis with Applications G.B. Wetherill (1986)
28 Sequential Methods in Statistics, 3rd edition
G.B. Wetherill and K.D. Glazebrook (1986)
29 Tensor Methods in Statistics P. McCullagh (1987)
30 Transformation and Weighting in Regression
R.J. Carroll and D. Ruppert (1988)
31 Asymptotic Techniques for Use in Statistics
O.E. Barndorff-Nielsen and D.R. Cox (1989)
32 Analysis of Binary Data, 2nd edition D.R. Cox and E.J. Snell (1989)
33 Analysis of Infectious Disease Data N.G. Becker (1989)
34 Design and Analysis of Cross-Over Trials B. Jones and M.G. Kenward (1989)
35 Empirical Bayes Methods, 2nd edition J.S. Maritz and T. Lwin (1989)
36 Symmetric Multivariate and Related Distributions
K.T. Fang, S. Kotz and K.W. Ng (1990)
37 Generalized Linear Models, 2nd edition P. McCullagh and J.A. Nelder (1989)
38 Cyclic and Computer Generated Designs, 2nd edition
J.A. John and E.R. Williams (1995)
39 Analog Estimation Methods in Econometrics C.F. Manski (1988)
40 Subset Selection in Regression A.J. Miller (1990)
41 Analysis of Repeated Measures M.J. Crowder and D.J. Hand (1990)
42 Statistical Reasoning with Imprecise Probabilities P. Walley (1991)
43 Generalized Additive Models T.J. Hastie and R.J. Tibshirani (1990)
44 Inspection Errors for Attributes in Quality Control
N.L. Johnson, S. Kotz and X. Wu (1991)
45 The Analysis of Contingency Tables, 2nd edition B.S. Everitt (1992)
46 The Analysis of Quantal Response Data B.J.T. Morgan (1992)
47 Longitudinal Data with Serial Correlation—A State-Space Approach
R.H. Jones (1993)
48 Differential Geometry and Statistics M.K. Murray and J.W. Rice (1993)
49 Markov Models and Optimization M.H.A. Davis (1993)
50 Networks and Chaos—Statistical and Probabilistic Aspects
O.E. Barndorff-Nielsen, J.L. Jensen and W.S. Kendall (1993)
51 Number-Theoretic Methods in Statistics K.-T. Fang and Y. Wang (1994)
52 Inference and Asymptotics O.E. Barndorff-Nielsen and D.R. Cox (1994)
53 Practical Risk Theory for Actuaries
C.D. Daykin, T. Pentikäinen and M. Pesonen (1994)
54 Biplots J.C. Gower and D.J. Hand (1996)
55 Predictive Inference—An Introduction S. Geisser (1993)
56 Model-Free Curve Estimation M.E. Tarter and M.D. Lock (1993)
57 An Introduction to the Bootstrap B. Efron and R.J. Tibshirani (1993)
58 Nonparametric Regression and Generalized Linear Models
P.J. Green and B.W. Silverman (1994)
59 Multidimensional Scaling T.F. Cox and M.A.A. Cox (1994)
60 Kernel Smoothing M.P. Wand and M.C. Jones (1995)
61 Statistics for Long Memory Processes J. Beran (1995)
62 Nonlinear Models for Repeated Measurement Data
M. Davidian and D.M. Giltinan (1995)
63 Measurement Error in Nonlinear Models
R.J. Carroll, D. Ruppert and L.A. Stefanski (1995)
64 Analyzing and Modeling Rank Data J.J. Marden (1995)
65 Time Series Models—In Econometrics, Finance and Other Fields
D.R. Cox, D.V. Hinkley and O.E. Barndorff-Nielsen (1996)
66 Local Polynomial Modeling and its Applications J. Fan and I. Gijbels (1996)
67 Multivariate Dependencies—Models, Analysis and Interpretation
D.R. Cox and N. Wermuth (1996)
68 Statistical Inference—Based on the Likelihood A. Azzalini (1996)
69 Bayes and Empirical Bayes Methods for Data Analysis
B.P. Carlin and T.A. Louis (1996)
70 Hidden Markov and Other Models for Discrete-Valued Time Series
I.L. Macdonald and W. Zucchini (1997)
71 Statistical Evidence—A Likelihood Paradigm R. Royall (1997)
72 Analysis of Incomplete Multivariate Data J.L. Schafer (1997)
73 Multivariate Models and Dependence Concepts H. Joe (1997)
74 Theory of Sample Surveys M.E. Thompson (1997)
75 Retrial Queues G. Falin and J.G.C. Templeton (1997)
76 Theory of Dispersion Models B. Jørgensen (1997)
77 Mixed Poisson Processes J. Grandell (1997)
78 Variance Components Estimation—Mixed Models, Methodologies and
Applications P.S.R.S. Rao (1997)
79 Bayesian Methods for Finite Population Sampling
G. Meeden and M. Ghosh (1997)
80 Stochastic Geometry—Likelihood and computation
O.E. Barndorff-Nielsen, W.S. Kendall and M.N.M. van Lieshout (1998)
81 Computer-Assisted Analysis of Mixtures and Applications—
Meta-analysis, Disease Mapping and Others D. Böhning (1999)
82 Classification, 2nd edition A.D. Gordon (1999)
83 Semimartingales and their Statistical Inference B.L.S. Prakasa Rao (1999)
84 Statistical Aspects of BSE and vCJD—Models for Epidemics
C.A. Donnelly and N.M. Ferguson (1999)
85 Set-Indexed Martingales G. Ivanoff and E. Merzbach (2000)
86 The Theory of the Design of Experiments D.R. Cox and N. Reid (2000)
87 Complex Stochastic Systems
O.E. Barndorff-Nielsen, D.R. Cox and C. Klüppelberg (2001)
88 Multidimensional Scaling, 2nd edition T.F. Cox and M.A.A. Cox (2001)
89 Algebraic Statistics—Computational Commutative Algebra in Statistics
G. Pistone, E. Riccomagno and H.P. Wynn (2001)
90 Analysis of Time Series Structure—SSA and Related Techniques
N. Golyandina, V. Nekrutkin and A.A. Zhigljavsky (2001)
91 Subjective Probability Models for Lifetimes
Fabio Spizzichino (2001)
92 Empirical Likelihood Art B. Owen (2001)
93 Statistics in the 21st Century
Adrian E. Raftery, Martin A. Tanner, and Martin T. Wells (2001)
94 Accelerated Life Models: Modeling and Statistical Analysis
Vilijandas Bagdonavicius and Mikhail Nikulin (2001)
95 Subset Selection in Regression, Second Edition Alan Miller (2002)
96 Topics in Modelling of Clustered Data
Marc Aerts, Helena Geys, Geert Molenberghs, and Louise M. Ryan (2002)
97 Components of Variance D.R. Cox and P.J. Solomon (2002)
98 Design and Analysis of Cross-Over Trials, 2nd Edition
Byron Jones and Michael G. Kenward (2003)
99 Extreme Values in Finance, Telecommunications, and the Environment
Bärbel Finkenstädt and Holger Rootzén (2003)
100 Statistical Inference and Simulation for Spatial Point Processes
Jesper Møller and Rasmus Plenge Waagepetersen (2004)
101 Hierarchical Modeling and Analysis for Spatial Data
Sudipto Banerjee, Bradley P. Carlin, and Alan E. Gelfand (2004)
102 Diagnostic Checks in Time Series Wai Keung Li (2004)
103 Stereology for Statisticians Adrian Baddeley and Eva B. Vedel Jensen (2004)
104 Gaussian Markov Random Fields: Theory and Applications
Håvard Rue and Leonhard Held (2005)
105 Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition
Raymond J. Carroll, David Ruppert, Leonard A. Stefanski,
and Ciprian M. Crainiceanu (2006)
Measurement Error
in Nonlinear Models
Raymond J. Carroll
David Ruppert
Leonard A. Stefanski
Ciprian M. Crainiceanu
A Modern Perspective
Second Edition
Monographs on Statistics and Applied Probability 105
Boca Raton London New York
Chapman & Hall/CRC is an imprint of the
Taylor & Francis Group, an informa business
Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2006 by Taylor and Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number-10: 1-58488-633-1 (Hardcover)
International Standard Book Number-13: 978-1-58488-633-4 (Hardcover)
This book contains information obtained from authentic and highly regarded sources. Reprinted mate-
rial is quoted with permission, and sources are indicated. A wide variety of references are listed. Reason-
able efforts have been made to publish reliable data and information, but the author and the publisher
cannot assume responsibility for the validity of all materials or for the consequences of their use.
No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any elec-
tronic, mechanical, or other means, now known or hereafter invented, including photocopying, micro-
filming, and recording, or in any information storage or retrieval system, without written permission
from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides
licenses and registration for a variety of users. For organizations that have been granted a photocopy
license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To our families and friends
Preface to the First Edition
This monograph is about analysis strategies for regression problems in
which predictors are measured with error. These problems are commonly
known as measurement error modeling or errors-in-variables. There is
an enormous literature on this topic in linear regression, as summarized
by Fuller (1987). Our interest lies almost exclusively in the analysis of
nonlinear regression models, defined generally enough to include gener-
alized linear models, transform-both-sides models, and quasilikelihood
and variance function problems.
The effects of measurement error are well known, and we basically
assume that the reader understands that measurement error in predic-
tors causes biases in estimated regression coefficients, and hence that the
field is largely about correcting for such effects. Chapter 3∗
summarizes
much of what is known about the consequences of measurement error
for estimating linear regression parameters, although the material is not
exhaustive.
Nonlinear errors-in-variables modeling began in earnest in the early
1980s with the publication of a series of papers on diverse topics: Prentice
(1982) on survival analysis; Carroll, Spiegelman, Lan, Bailey, and Abbott
(1984) and Stefanski and Carroll (1985) on binary regression; Armstrong
(1985) on generalized linear models; Amemiya (1985) on instrumental
variables; and Stefanski (1985) on estimating equations. David Byar and
Mitchell Gail organized a workshop on the topic in 1987 at the National
Institutes of Health, which in 1989 was published as a special issue of
Statistics in Medicine. Since these early papers, the field has grown dra-
matically, as evidenced by the bibliography at the end of this book.
Unlike the early 1980s, the literature is now so large that it is difficult to
understand the main ideas from individual papers. Indeed, a first draft
of this book, completed in late 1990, consisted only of the material in
four of the first five chapters. Essentially all the rest of the material has
been developed since 1990. In a field as rapidly evolving as this one, and
with the entrance of many new researchers into the area, we can present
but a snapshot of the current state of knowledge.
This book can be divided broadly into four main parts: Chapters 1–2,
3–6, 7–8, and 9–14. In addition, there is Appendix A, a review of relevant
fitting methods and statistical models.
∗ Chapter numbers in this preface refer to the first edition, not the present edition.
The first part is introductory. Chapter 1 gives a number of applications
where measurement error is of concern, and defines basic terminology of
error structure, data sources and the distinction between functional and
structural models. Chapter 2 gives an overview of the important ideas
from linear regression, particularly the biases caused by measurement
error and some estimation techniques.
The second part gives the basic ideas and techniques of what we
call functional modeling, where the distribution of the true predictor
is not modeled parametrically. In addition, in these chapters it is as-
sumed that the true predictor is never observable. The focus is on the
additive measurement error model, although periodically we describe
modifications for the multiplicative error model. Chapters 3 and 4 dis-
cuss two broadly applicable functional methods, regression calibration
and simulation-extrapolation (SIMEX), which can be thought of as the
default approaches. Chapter 5 discusses a broadly based approach to
the use of instrumental variables. All three of these chapters focus on
estimators which are easily computed but yield only approximately con-
sistent estimates. Chapter 6 is still based on the assumption that the true
predictor is never observable, but here we provide functional techniques
which are fully and not just approximately consistent. This material is
somewhat more daunting in (algebraic) appearance than the approxi-
mate techniques, but even so the methods themselves are often easily
programmed. Throughout this part of the book, we use examples of
binary regression modeling.
The third part of the book concerns structural modeling, meaning that
the distribution of the true predictor is parametrically modeled. Chapter
7 describes the likelihood approach to estimation and inference in mea-
surement error models, while Chapter 8 briefly covers Bayesian model-
ing. Here we become more focused on the distinction between functional
and structural modeling, and also describe the measurement error prob-
lem as a missing data problem. We also allow for the possibility that
the true predictor can be measured in a subset of the study population.
The discussion is fully general and applies to categorical data as well as
to the additive and multiplicative measurement error models. While at
this point the use of structural modeling in measurement error models
is not very popular, we believe it will become more so in the very near
future.
The fourth part of the book is devoted to more specialized topics.
Chapter 9 takes up the study of functional techniques which are ap-
plicable when the predictor can be observed in a subset of the study.
Chapter 10 discusses functional estimation in models with generalized
linear structure and an unknown link function. Chapter 11 describes the
effects that measurement error has on hypothesis testing. Nonparamet-
ric regression and density function estimation are addressed in Chapter
12. Errors in the response rather than in predictors are described in
Chapter 13. In Chapter 14, a variety of topics are addressed briefly:
case-control studies, differential measurement error, functional mixture
methods, design of two-stage studies and survival analysis.
We have tried to design the text so that it can be read at two levels.
Many readers will be interested only in the background material and
in the definition of the specific methods that can be employed. These
readers will find that the chapters in the middle two parts of the text
(functional and structural modeling) begin with preliminary discussion,
move into the definition of the methods, and are then followed by a
worked numerical example. The end of the example serves as a flag that
the material is about to become more detailed, with justifications of the
methods, derivations of estimated standard errors, etc. Those readers
who are not interested in such details should skip the material following
the examples at first (and perhaps last) reading.
It is our intention that the part of the book on functional models
(Chapters 3–6) can be understood at an overview level without an ex-
tensive background in theoretical statistics, at least through the numeri-
cal examples. The structural modeling approach requires that one knows
about likelihood and Bayesian methods, but with this exception the ma-
terial is not particularly specialized. The fourth part of the book (Chap-
ters 9–14) is more technical, and we suggest that those interested mainly
in an overview simply read the first section of each of those chapters.
A full appreciation of the text, especially its details, requires a strong
background in likelihood methods, estimating equations and quasilikeli-
hood and variance function models. For inference, we typically provide
estimated standard errors, as well as suggest use of “the” bootstrap.
These topics are all covered in Appendix A, albeit briefly. For more
background on the models used in this monograph, we highly recom-
mend reading Chapter 1 of Fuller (1987) for an introduction to linear
measurement error models and the first four chapters of McCullagh and
Nelder (1989) for further discussion of generalized linear models, includ-
ing logistic regression.
This is a book about general ideas and strategies of estimation and
inference, not a book about a specific problem. Our interest in the field
started with logistic regression, and many of our examples are based
upon this problem. However, our philosophy is that measurement error
occurs in many fields and in a variety of guises, and what is needed
is an outline of strategies for handling progressively more difficult prob-
lems. While logistic regression may well be the most important nonlinear
measurement error model, the strategies here are applied to a hard-core
nonlinear regression bioassay problem (Chapter 3), a changepoint prob-
lem (Chapter 7), and a 2 × 2 table with misclassification (Chapter 8).
Our hope is that the strategies will be sufficiently clear that they can be
applied to new problems as they arise.
We have tried to represent the main themes in the field, and to ref-
erence as many research papers as possible. Obviously, as in any mono-
graph, the selection of topics and material to be emphasized reflects our
own interests. We apologize in advance to those workers whose work we
have neglected to cite, or whose work should have been better advertised.
Carroll’s research and the writing of this book were supported by
grants from the National Cancer Institute (CA–57030 and CA–61067).
After January 1, 1996, Splus and SAS computer programs (on SPARC
architecture SunOS versions 4 and 5 and for Windows on PCs), which
implement (for major generalized linear models) many of the functional
methods described in this book, can be obtained by sending a message
to qvf@stat.tamu.edu. The body of the text should contain only a valid
return email address. This will generate an automatic reply with instruc-
tions on how to get the software.
Much of Stefanski’s research on measurement error problems has been
supported by grants from the National Science Foundation (DMS–8613681
and DMS–9200915) and by funding from the Environmental Monitor-
ing and Assessment Program, U.S. Environmental Protection Agency.
We want to thank Jim Calvin, Bobby Gutierrez, Stephen Eckert, Joey
Lin, C. Y. Wang, and Naisyin Wang for helpful general comments; Donna
Spiegelman for a detailed reading of the manuscript; Jeff Buzas, John
Cook, Tony Olsen, and Scott Overton for ideas and comments related
to our research; and Viswanath Devanarayan for computing assistance
and comments. Rob Abbott stimulated our initial interest in the field
in 1981 with a question concerning the effects of measurement error
in the Framingham Heart Study; this example appears throughout our
discussion. Larry Freedman and Mitch Gail have commented on much
of our work and have been instrumental in guiding us to interesting
problems. Nancy Potischman introduced us to the world of nutritional
epidemiology, where measurement error is of fundamental concern. Our
friend Leon Gleser has been a source of support and inspiration for many
years and has been a great influence on our thinking.
This book uses data supplied by the National Heart, Lung, and Blood
Institute, NIH, DHHS from the Framingham Heart Study. The views
expressed in this paper are those of the authors and do not necessarily
reflect the views of the National Heart, Lung, and Blood Institute or of
the Framingham Study.
Preface to the Second Edition
Since the first edition of Measurement Error in Nonlinear Models ap-
peared in 1995, the field of measurement error and exposure uncertainty
has undergone an explosion in research. Some of these areas are the
following:
• Bayesian computation via Markov Chain Monte Carlo techniques is
now widely used in practice. The first edition had a short and not
particularly satisfactory Chapter 9 on this topic. In this edition, we
have greatly expanded the material and also the applications. Even if
one is not a card-carrying Bayesian, Bayesian computation is a natural
way to handle what we call the structural approach to measurement
error modeling.
• A new chapter has been added on longitudinal data and mixed models,
areas that have seen tremendous growth since the first edition.
• Semiparametric and nonparametric methods are enjoying increased
application. The field of semiparametric and nonparametric regression
(Ruppert, Wand, and Carroll, 2003) has become extremely important
in the past 11 years, and in measurement error problems techniques
are now much better established. We have revamped the old chap-
ter on nonparametric regression and density estimation (Chapter 12)
and added a new chapter (Chapter 13) to reflect the changes in the
literature.
• Methods for handling covariate measurement error in survival anal-
ysis have been developing rapidly. The first edition had a section on
survival analysis in the final chapter, “Other Topics.” This section
has been greatly expanded and made into a separate Chapter 14.
• The area of missing data has also expanded vigorously over the last
11 years, especially due to the work of Robins and his colleagues.
This work and its connections with measurement error now need a
book-length treatment of its own. Therefore, with some reluctance,
we decided to delete much of the old material on validation data as a
missing data problem.
• We have completely rewritten the score function chapter, both to keep
up with advances in this area and to make the exposition more
transparent.
The background material in Appendix A has been expanded to make
the book somewhat more self-contained. Technical material that ap-
peared as appendices to individual chapters in the first edition has now
been collected into a new Appendix B.
Carroll’s research has been supported since 1990 by a grant from the
National Cancer Institute (CA57030). The work of Raymond Carroll
partially occurred during multiple visits to Peter Hall at the Centre
of Excellence for Mathematics and Statistics of Complex Systems at
the Australian National University, whose support is gratefully acknowl-
edged, along with the opportunity to take thousands of photos of kanga-
roos (http://www.stat.tamu.edu/∼carroll/compressed kangaroo.jpg).
David Ruppert was supported by the National Science Foundation (DMS
04-538) and the National Institutes of Health (CA57030). Leonard Ste-
fanski also received support from the National Science Foundation and
the National Institutes of Health.
In this second edition, we especially acknowledge our colleagues with
whom we have discussed measurement error problems and worked since
1995, including Scott Berry, Dennis Boos, John Buonaccorsi, Jeff Buzas,
Josef Coresh, Marie Davidian, Eugene Demidenko, Laurence Freedman,
Wayne Fuller, Mitchell Gail, Bobby Gutierrez, Peter Hall, Victor Kipnis,
Liang Li, Xihong Lin, Jay Lubin, Yanyuan Ma, Doug Midthune, Sastry
Pantula, Dan Schafer, John Staudenmayer, Sally Thurston, Tor Toste-
son, Naisyin Wang, and Alan Welsh. Owen Hoffman introduced us to
the problem of radiation dosimetry and the ideas of shared Berkson and
classical uncertainties.
We once again acknowledge Robert Abbott for introducing us to the
problem in 1980, when he brought to Raymond Carroll a referee report
demanding that he explain the impact of measurement error on the
(logistic regression) Framingham data. We would love to acknowledge
that anonymous referee for starting us along the path of measurement
error in nonlinear models.
We also thank Mitchell Gail, one of the world’s great biostatisticians,
for his advice and friendship over the last 25 years.
We are extremely grateful to Rick Rossi for a detailed reading of the
manuscript, a reading that led to many changes in substance and ex-
position. Rick is the only head of a Department of Mathematics and
Statistics who is also a licensed trout-fishing guide.
Finally, and with gratitude, we acknowledge our good friend Leon
Gleser, who, to quote the first edition, has been a source of support
and inspiration for many years and has been a great influence on our
thinking.
Our book Web site is
http://www.stat.tamu.edu/∼carroll/eiv.SecondEdition.
Guide to Notation
In this section we give brief explanations and representative examples
of the notation used in this monograph. For precise definitions, see the
text.
Ân, B̂n   components of the sandwich formula
α0   intercept in model for E(X|Z, W)
αw   coefficient of W in model for E(X|Z, W)
αz   coefficient of Z in model for E(X|Z, W)
β0   intercept in a model for E(Y|X, Z)
βx   coefficient of X in model for E(Y|X, Z)
βz   coefficient of Z in model for E(Y|X, Z)
β1ZX   coefficient of 1 in generalized linear regression
∆   indicator of validation data, for example, where X is observed
dim(β)   dimension of the vector β
fX   density of X
fY,W,T|Z   density of (Y, W, T) given Z
F(·)   unknown link function
σ²g(Z, X, B, θ)   var(Y|Z, X) in QVF model
G   extrapolant function in SIMEX
GQ   quadratic extrapolant function
GRL   rational linear extrapolant function
γ0,cm   intercept in a regression calibration model
γᵗz,cm   coefficient of Z in a regression calibration model
γᵗw,cm   coefficient of W in a regression calibration model
γ0,em   intercept in an error model
γᵗx,em   coefficient of X in an error model
γᵗw,em   coefficient of W in an error model
H(v)   (1 + exp(−v))⁻¹, for example, the logistic function
h   bandwidth in nonparametric regression or density estimation
In(Θ)   Fisher information
k   With equal replication, number of replicates for all subjects
ki   Number of replicates of the ith subject
K(·)   kernel used in nonparametric regression or density estimation
κcm   σ²cm/σ²
Λ(·)   likelihood ratio
L(·)   generalized score function
mX(Z, W, γcm)   E(X|Z, W)
mY(Z, X, β)   E(Y|Z, X) in QVF (quasilikelihood variance function) model
mY,x(z, x, β)   (∂/∂x)mY(z, x, β)
mY,xx(z, x, β)   (∂²/∂x²)mY(z, x, β)
π(Y, Z, W, α)   probability of selection into a validation study
Ψ, ψ   estimating functions
S   Y measured with error (S = Y + V)
si(y|Θ)   score function
σ²u   variance of U
σ²X|Z   conditional variance of X given Z
σxy   the covariance between random variables X and Y
ρxy   the correlation between X and Y, which is defined as σxy/(σxσy)
ΣZX   covariance matrix between the random vectors Z and X
T   observation related to X
Θb(λ)   simulated estimator used in SIMEX
Θ(λ)   average of the Θb(λ)s
U   observation error in an error model
Ub,k   pseudo-error in SIMEX
V   measurement error in the response
W   observation related to X
X   covariates measured with error
Y   response
Yi·   average of Yij over j
[Ỹ|Z̃, X̃, B]   density of Ỹ given (Z̃, X̃, B) (Bayesian notation)
Z   covariates measured without error
ζ   parameter controlling the amount of simulated extra measurement error in SIMEX
If m(x) is any function, then m′(x) and m′′(x) are its first and second
derivatives, and m^(m)(x) is its mth derivative for m > 2.
For a vector or matrix A, Aᵗ is its transpose and, if A is an invertible
matrix, A⁻¹ is its inverse.
If a = (a1, . . . , an) is a vector, then ‖a‖ is its Euclidean norm, that is,
‖a‖ = (a1² + · · · + an²)^(1/2).
If X and Y are random variables, then [X] is the distribution of X
and [X|Y] is the conditional distribution of X given Y. This notation is
becoming standard in the Bayesian literature.
Contents
1 INTRODUCTION 1
1.1 The Double/Triple Whammy of Measurement Error 1
1.2 Classical Measurement Error: A Nutrition Example 2
1.3 Measurement Error Examples 3
1.4 Radiation Epidemiology and Berkson Errors 4
1.4.1 The Difference Between Berkson and Classical
Errors: How to Gain More Power Without Really
Trying 5
1.5 Classical Measurement Error Model Extensions 7
1.6 Other Examples of Measurement Error Models 9
1.6.1 NHANES 9
1.6.2 Nurses’ Health Study 10
1.6.3 The Atherosclerosis Risk in Communities Study 11
1.6.4 Bioassay in a Herbicide Study 11
1.6.5 Lung Function in Children 12
1.6.6 Coronary Heart Disease and Blood Pressure 12
1.6.7 A-Bomb Survivors Data 13
1.6.8 Blood Pressure and Urinary Sodium Chloride 13
1.6.9 Multiplicative Error for Confidentiality 14
1.6.10 Cervical Cancer and Herpes Simplex Virus 14
1.7 Checking the Classical Error Model 14
1.8 Loss of Power 18
1.8.1 Linear Regression Example 18
1.8.2 Radiation Epidemiology Example 20
1.9 A Brief Tour 23
Bibliographic Notes 23
2 IMPORTANT CONCEPTS 25
2.1 Functional and Structural Models 25
2.2 Models for Measurement Error 26
2.2.1 General Approaches: Berkson and Classical Models 26
2.2.2 Is It Berkson or Classical? 27
2.2.3 Berkson Models from Classical 28
2.2.4 Transportability of Models 29
2.2.5 Potential Dangers of Transporting Models 30
2.2.6 Semicontinuous Variables 32
2.2.7 Misclassification of a Discrete Covariate 32
2.3 Sources of Data 32
2.4 Is There an “Exact” Predictor? What Is Truth? 33
2.5 Differential and Nondifferential Error 36
2.6 Prediction 38
Bibliographic Notes 39
3 LINEAR REGRESSION AND ATTENUATION 41
3.1 Introduction 41
3.2 Bias Caused by Measurement Error 41
3.2.1 Simple Linear Regression with Additive Error 42
3.2.2 Regression Calibration: Classical Error as Berkson
Error 44
3.2.3 Simple Linear Regression with Berkson Error 45
3.2.4 Simple Linear Regression, More Complex Error
Structure 46
3.2.5 Summary of Simple Linear Regression 49
3.3 Multiple and Orthogonal Regression 52
3.3.1 Multiple Regression: Single Covariate Measured
with Error 52
3.3.2 Multiple Covariates Measured with Error 53
3.4 Correcting for Bias 55
3.4.1 Method of Moments 55
3.4.2 Orthogonal Regression 57
3.5 Bias Versus Variance 60
3.5.1 Theoretical Bias–Variance Tradeoff Calculations 61
3.6 Attenuation in General Problems 63
Bibliographic Notes 64
4 REGRESSION CALIBRATION 65
4.1 Overview 65
4.2 The Regression Calibration Algorithm 66
4.3 NHANES Example 66
4.4 Estimating the Calibration Function Parameters 70
4.4.1 Overview and First Methods 70
4.4.2 Best Linear Approximations Using Replicate Data 70
4.4.3 Alternatives When Using Partial Replicates 72
4.4.4 James–Stein Calibration 72
4.5 Multiplicative Measurement Error 72
4.5.1 Should Predictors Be Transformed? 73
4.5.2 Lognormal X and U 74
4.5.3 Linear Regression 77
4.5.4 Additive and Multiplicative Error 78
4.6 Standard Errors 79
4.7 Expanded Regression Calibration Models 79
4.7.1 The Expanded Approximation Defined 81
4.7.2 Implementation 83
4.7.3 Bioassay Data 85
4.8 Examples of the Approximations 90
4.8.1 Linear Regression 90
4.8.2 Logistic Regression 90
4.8.3 Loglinear Mean Models 93
4.9 Theoretical Examples 94
4.9.1 Homoscedastic Regression 94
4.9.2 Quadratic Regression with Homoscedastic Regres-
sion Calibration 94
4.9.3 Loglinear Mean Model 95
Bibliographic Notes and Software 95
5 SIMULATION EXTRAPOLATION 97
5.1 Overview 97
5.2 Simulation Extrapolation Heuristics 98
5.2.1 SIMEX in Simple Linear Regression 98
5.3 The SIMEX Algorithm 100
5.3.1 Simulation and Extrapolation Steps 100
5.3.2 Extrapolant Function Considerations 108
5.3.3 SIMEX Standard Errors 110
5.3.4 Extensions and Refinements 111
5.3.5 Multiple Covariates with Measurement Error 112
5.4 Applications 112
5.4.1 Framingham Heart Study 112
5.4.2 Single Covariate Measured with Error 113
5.4.3 Multiple Covariates Measured with Error 118
5.5 SIMEX in Some Important Special Cases 120
5.5.1 Multiple Linear Regression 120
5.5.2 Loglinear Mean Models 122
5.5.3 Quadratic Mean Models 122
5.6 Extensions and Related Methods 123
5.6.1 Mixture of Berkson and Classical Error 123
5.6.2 Misclassification SIMEX 125
5.6.3 Checking Structural Model Robustness via Re-
measurement 126
Bibliographic Notes 128
6 INSTRUMENTAL VARIABLES 129
6.1 Overview 129
6.1.1 A Note on Notation 130
6.2 Instrumental Variables in Linear Models 131
6.2.1 Instrumental Variables via Differentiation 131
6.2.2 Simple Linear Regression with One Instrument 132
6.2.3 Linear Regression with Multiple Instruments 134
6.3 Approximate Instrumental Variable Estimation 137
6.3.1 IV Assumptions 137
6.3.2 Mean and Variance Function Models 138
6.3.3 First Regression Calibration IV Algorithm 139
6.3.4 Second Regression Calibration IV Algorithm 140
6.4 Adjusted Score Method 140
6.5 Examples 143
6.5.1 Framingham Data 143
6.5.2 Simulated Data 145
6.6 Other Methodologies 145
6.6.1 Hybrid Classical and Regression Calibration 145
6.6.2 Error Model Approaches 147
Bibliographic Notes 148
7 SCORE FUNCTION METHODS 151
7.1 Overview 151
7.2 Linear and Logistic Regression 152
7.2.1 Linear Regression Corrected and Conditional
Scores 152
7.2.2 Logistic Regression Corrected and Conditional
Scores 157
7.2.3 Framingham Data Example 159
7.3 Conditional Score Functions 162
7.3.1 Conditional Score Basic Theory 162
7.3.2 Conditional Scores for Basic Models 164
7.3.3 Conditional Scores for More Complicated Models 166
7.4 Corrected Score Functions 169
7.4.1 Corrected Score Basic Theory 170
7.4.2 Monte Carlo Corrected Scores 170
7.4.3 Some Exact Corrected Scores 172
7.4.4 SIMEX Connection 173
7.4.5 Corrected Scores with Replicate Measurements 173
7.5 Computation and Asymptotic Approximations 174
7.5.1 Known Measurement Error Variance 175
7.5.2 Estimated Measurement Error Variance 176
7.6 Comparison of Conditional and Corrected Scores 177
Bibliographic Notes 178
8 LIKELIHOOD AND QUASILIKELIHOOD 181
8.1 Introduction 181
8.1.1 Step 1: The Likelihood If X Were Observable 183
8.1.2 A General Concern: Identifiable Models 184
8.2 Steps 2 and 3: Constructing Likelihoods 184
8.2.1 The Discrete Case 185
8.2.2 Likelihood Construction for General Error Models 186
8.2.3 The Berkson Model 188
8.2.4 Error Model Choice 189
8.3 Step 4: Numerical Computation of Likelihoods 190
8.4 Cervical Cancer and Herpes 190
8.5 Framingham Data 192
8.6 Nevada Test Site Reanalysis 193
8.6.1 Regression Calibration Implementation 195
8.6.2 Maximum Likelihood Implementation 196
8.7 Bronchitis Example 197
8.7.1 Calculating the Likelihood 198
8.7.2 Effects of Measurement Error on Threshold Models 199
8.7.3 Simulation Study and Maximum Likelihood 199
8.7.4 Berkson Analysis of the Data 201
8.8 Quasilikelihood and Variance Function Models 201
8.8.1 Details of Step 3 for QVF Models 202
8.8.2 Details of Step 4 for QVF Models 203
Bibliographic Notes 203
9 BAYESIAN METHODS 205
9.1 Overview 205
9.1.1 Problem Formulation 205
9.1.2 Posterior Inference 207
9.1.3 Bayesian Functional and Structural Models 208
9.1.4 Modularity of Bayesian MCMC 209
9.2 The Gibbs Sampler 209
9.3 Metropolis–Hastings Algorithm 211
9.4 Linear Regression 213
9.4.1 Example 216
9.5 Nonlinear Models 219
9.5.1 A General Model 219
9.5.2 Polynomial Regression 220
9.5.3 Multiplicative Error 221
9.5.4 Segmented Regression 222
9.6 Logistic Regression 223
9.7 Berkson Errors 225
9.7.1 Nonlinear Regression with Berkson Errors 225
9.7.2 Logistic Regression with Berkson Errors 227
9.7.3 Bronchitis Data 228
9.8 Automatic Implementation 230
9.8.1 Implementation and Simulations in WinBUGS 231
9.8.2 More Complex Models 234
9.9 Cervical Cancer and Herpes 235
9.10 Framingham Data 237
9.11 OPEN Data: A Variance Components Model 238
Bibliographic Notes 240
10 HYPOTHESIS TESTING 243
10.1 Overview 243
10.1.1 Simple Linear Regression, Normally Distributed
X 243
10.1.2 Analysis of Covariance 246
10.1.3 General Considerations: What Is a Valid Test? 248
10.1.4 Summary of Major Results 248
10.2 The Regression Calibration Approximation 249
10.2.1 Testing H0 : βx = 0 250
10.2.2 Testing H0 : βz = 0 250
10.2.3 Testing H0 : (βᵗx, βᵗz)ᵗ = 0 250
10.3 Illustration: OPEN Data 251
10.4 Hypotheses about Subvectors of βx and βz 251
10.4.1 Illustration: Framingham Data 252
10.5 Efficient Score Tests of H0 : βx = 0 253
10.5.1 Generalized Score Tests 254
Bibliographic Notes 257
11 LONGITUDINAL DATA AND MIXED MODELS 259
11.1 Mixed Models for Longitudinal Data 259
11.1.1 Simple Linear Mixed Models 259
11.1.2 The General Linear Mixed Model 260
11.1.3 The Linear Logistic Mixed Model 261
11.1.4 The Generalized Linear Mixed Model 261
11.2 Mixed Measurement Error Models 262
11.2.1 The Variance Components Model Revisited 262
11.2.2 General Considerations 263
11.2.3 Some Simple Examples 263
11.2.4 Models for Within-Subject X-Correlation 265
11.3 A Bias-Corrected Estimator 265
11.4 SIMEX for GLMMEMs 267
11.5 Regression Calibration for GLMMs 267
11.6 Maximum Likelihood Estimation 268
11.7 Joint Modeling 268
11.8 Other Models and Applications 269
11.8.1 Models with Random Effects Multiplied by X 269
11.8.2 Models with Random Effects Depending Nonlin-
early on X 270
11.8.3 Inducing a True-Data Model from a Standard
Observed Data Model 270
11.8.4 Autoregressive Models in Longitudinal Data 271
11.9 Example: The CHOICE Study 272
11.9.1 Basic Model 273
11.9.2 Naive Replication and Sensitivity 273
11.9.3 Accounting for Biological Variability 274
Bibliographic Notes 276
12 NONPARAMETRIC ESTIMATION 279
12.1 Deconvolution 279
12.1.1 The Problem 279
12.1.2 Fourier Inversion 280
12.1.3 Methodology 280
12.1.4 Properties of Deconvolution Methods 281
12.1.5 Is It Possible to Estimate the Bandwidth? 282
12.1.6 Parametric Deconvolution 284
12.1.7 Estimating Distribution Functions 287
12.1.8 Optimal Score Tests 288
12.1.9 Framingham Data 289
12.1.10 NHANES Data 290
12.1.11 Bayesian Density Estimation by Normal Mixtures 291
12.2 Nonparametric Regression 293
12.2.1 Local-Polynomial, Kernel-Weighted Regression 293
12.2.2 Splines 294
12.2.3 QVF and Likelihood Models 295
12.2.4 SIMEX for Nonparametric Regression 296
12.2.5 Regression Calibration 297
12.2.6 Structural Splines 297
12.2.7 Taylex and Other Methods 298
12.3 Baseline Change Example 299
12.3.1 Discussion of the Baseline Change Controls Data 301
Bibliographic Notes 302
13 SEMIPARAMETRIC REGRESSION 303
13.1 Overview 303
13.2 Additive Models 303
13.3 MCMC for Additive Spline Models 304
13.4 Monte Carlo EM-Algorithm 305
13.4.1 Starting Values 306
13.4.2 Metropolis–Hastings Fact 306
13.4.3 The Algorithm 306
13.5 Simulation with Classical Errors 309
13.6 Simulation with Berkson Errors 311
13.7 Semiparametrics: X Modeled Parametrically 312
13.8 Parametric Models: No Assumptions on X 314
13.8.1 Deconvolution Methods 314
13.8.2 Models Linear in Functions of X 315
13.8.3 Linear Logistic Regression with Replicates 316
13.8.4 Doubly Robust Parametric Modeling 317
Bibliographic Notes 318
14 SURVIVAL DATA 319
14.1 Notation and Assumptions 319
14.2 Induced Hazard Function 320
14.3 Regression Calibration for Survival Analysis 321
14.3.1 Methodology and Asymptotic Properties 321
14.3.2 Risk Set Calibration 322
14.4 SIMEX for Survival Analysis 323
14.5 Chronic Kidney Disease Progression 324
14.5.1 Regression Calibration for CKD Progression 325
14.5.2 SIMEX for CKD Progression 326
14.6 Semi and Nonparametric Methods 329
14.6.1 Nonparametric Estimation with Validation Data 330
14.6.2 Nonparametric Estimation with Replicated Data 332
14.6.3 Likelihood Estimation 333
14.7 Likelihood Inference for Frailty Models 336
Bibliographic Notes 337
15 RESPONSE VARIABLE ERROR 339
15.1 Response Error and Linear Regression 339
15.2 Other Forms of Additive Response Error 343
15.2.1 Biased Responses 343
15.2.2 Response Error in Heteroscedastic Regression 344
15.3 Logistic Regression with Response Error 345
15.3.1 The Impact of Response Misclassification 345
15.3.2 Correcting for Response Misclassification 347
15.4 Likelihood Methods 353
15.4.1 General Likelihood Theory and Surrogates 353
15.4.2 Validation Data 354
15.5 Use of Complete Data Only 355
15.5.1 Likelihood of the Validation Data 355
15.5.2 Other Methods 356
15.6 Semiparametric Methods for Validation Data 356
15.6.1 Simple Random Sampling 356
15.6.2 Other Types of Sampling 357
Bibliographic Notes 358
A BACKGROUND MATERIAL 359
A.1 Overview 359
A.2 Normal and Lognormal Distributions 359
A.3 Gamma and Inverse-Gamma Distributions 360
A.4 Best and Best Linear Prediction and Regression 361
A.4.1 Linear Prediction 361
A.4.2 Best Linear Prediction without an Intercept 363
A.4.3 Nonlinear Prediction 363
A.5 Likelihood Methods 364
A.5.1 Notation 364
A.5.2 Maximum Likelihood Estimation 364
A.5.3 Likelihood Ratio Tests 365
A.5.4 Profile Likelihood and Likelihood Ratio Confidence
Intervals 365
A.5.5 Efficient Score Tests 366
A.6 Unbiased Estimating Equations 367
A.6.1 Introduction and Basic Large Sample Theory 367
A.6.2 Sandwich Formula Example: Linear Regression
without Measurement Error 369
A.6.3 Sandwich Method and Likelihood-Type Inference 370
A.6.4 Unbiased, but Conditionally Biased, Estimating
Equations 372
A.6.5 Biased Estimating Equations 372
A.6.6 Stacking Estimating Equations: Using Prior Esti-
mates of Some Parameters 372
A.7 Quasilikelihood and Variance Function Models (QVF) 374
A.7.1 General Ideas 374
A.7.2 Estimation and Inference for QVF Models 375
A.8 Generalized Linear Models 377
A.9 Bootstrap Methods 377
A.9.1 Introduction 377
A.9.2 Nonlinear Regression without Measurement Error 378
A.9.3 Bootstrapping Heteroscedastic Regression Models 380
A.9.4 Bootstrapping Logistic Regression Models 380
A.9.5 Bootstrapping Measurement Error Models 381
A.9.6 Bootstrap Confidence Intervals 382
B TECHNICAL DETAILS 385
B.1 Appendix to Chapter 1: Power in Berkson and Classical
Error Models 385
B.2 Appendix to Chapter 3: Linear Regression and Attenua-
tion 386
B.3 Appendix to Chapter 4: Regression Calibration 387
B.3.1 Standard Errors and Replication 387
B.3.2 Quadratic Regression: Details of the Expanded
Calibration Model 391
B.3.3 Heuristics and Accuracy of the Approximations 391
B.4 Appendix to Chapter 5: SIMEX 392
B.4.1 Simulation Extrapolation Variance Estimation 393
B.4.2 Estimating Equation Approach to Variance Esti-
mation 395
B.5 Appendix to Chapter 6: Instrumental Variables 399
B.5.1 Derivation of the Estimators 399
B.5.2 Asymptotic Distribution Approximations 401
B.6 Appendix to Chapter 7: Score Function Methods 406
B.6.1 Technical Complements to Conditional Score
Theory 406
B.6.2 Technical Complements to Distribution Theory
for Estimated Σuu 406
B.7 Appendix to Chapter 8: Likelihood and Quasilikelihood 407
B.7.1 Monte Carlo Computation of Integrals 407
B.7.2 Linear, Probit, and Logistic Regression 408
B.8 Appendix to Chapter 9: Bayesian Methods 409
B.8.1 Code for Section 9.8.1 409
B.8.2 Code for Section 9.11 410
References 413
CHAPTER 1
INTRODUCTION
1.1 The Double/Triple Whammy of Measurement Error
Measurement error in covariates has three effects:
• It causes bias in parameter estimation for statistical models.
• It leads to a loss of power, sometimes profound, for detecting interesting
relationships among variables.
• It masks the features of the data, making graphical model analysis
difficult.
We call the first two the double whammy of measurement error. Most of
the statistical methods described in this book are aimed at the first prob-
lem, namely, to correct for biases of estimation caused by measurement
error. Later in this chapter, we will describe an example from radiation
dosimetry and the profound loss of power for detecting risks that occurs
with uncertainties in individual doses. Here, we briefly describe the third
issue, the masking of features.
Consider a regression of a response Y on a predictor X, uniformly
distributed on the interval [−2, 2]. Suppose that the mean is sin(2X)
and the variance σ²ε = 0.10. In the top panel of Figure 1.1, we plot 200
simulated observations from such a model that indicate quite clearly
the sinusoidal aspect of the regression function. However, suppose that
instead of observing X, we observe W, normally distributed with mean
X but with variance 4/9. As we will later describe in Section 3.2.1, this
is an attenuation coefficient of 0.75. Thus, what we observe is not X, but
an unbiased estimate of it, W. In the bottom panel of Figure 1.1, we
plot the observed data Y versus W. Note that the sinusoid is no longer
evident and the main feature of the data has been hidden.
It is also worth noting that the variability about the sinusoid is far
smaller when X is observed than the variability about any curve one
could reasonably guess at when only W is observed. This is one sub-
stantial cause of the loss of power. Finally, if one only observes (Y, W)
and hence the bottom panel of Figure 1.1, it would be essentially impos-
sible to reconstruct the sinusoid, and something different would certainly
be used. This is the bias caused by measurement error.
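To make the masking effect concrete, here is a minimal simulation sketch of the setup just described. The code, the variable names, and the use of Python/NumPy are our own illustration, not the authors' software.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200

    # True covariate X ~ Uniform(-2, 2); mean function sin(2X); error variance 0.10.
    x = rng.uniform(-2.0, 2.0, n)
    y = np.sin(2.0 * x) + rng.normal(0.0, np.sqrt(0.10), n)

    # Observed covariate W = X + U with error variance 4/9, so the attenuation is
    # var(X) / {var(X) + var(U)} = (4/3) / (4/3 + 4/9) = 0.75, as in the text.
    w = x + rng.normal(0.0, np.sqrt(4.0 / 9.0), n)

    # The scatter of Y about the sinusoid is far larger against W than against X.
    print("SD of Y about sin(2X):", np.std(y - np.sin(2.0 * x)))
    print("SD of Y about sin(2W):", np.std(y - np.sin(2.0 * w)))

Plotting y against x and then y against w reproduces the contrast in Figure 1.1: the sinusoid is obvious in the first plot and hidden in the second.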
Figure 1.1 Illustration of the bias, loss of power, and masking of features
caused by measurement error in predictors. Top panel: regression on the true
covariate. Bottom panel: regression on the observed covariate.
1.2 Classical Measurement Error: A Nutrition Example
Much of the measurement error literature is based around what is called
classical measurement error, in which the truth is measured with ad-
ditive error, usually with constant variance. We introduce the classical
measurement error model via an example from nutrition.
In the National Cancer Institute’s OPEN study, see Subar, Thomp-
son, Kipnis, et al. (2001), one interest is in measuring the logarithm of
dietary protein intake. True, long-term log-intake is denoted by X, but
this cannot be observed in practice. Instead, the investigators measured
a biomarker of log-protein intake, namely urinary nitrogen, denoted by
W. In this study, 297 subjects had replicated urinary nitrogen mea-
surements. If there were no measurement error, then of course the two
biomarker measurements would be equal, but then, since this is a book
about measurement error, we would not be wasting space. Indeed, in
Figure 1.2 we see that when we plot the second biomarker versus the
first, the correlation is relatively high (0.695), but there clearly is some
variability in the measurements.
In this context, there is evidence from feeding studies that the pro-
tein biomarker captures true protein intake with added variability. Such
situations are often called classical measurement error. In symbols, let
Xi be the true log-protein intake for individual i, and let Wij be the
[Figure 1.2 here: scatterplot titled "OPEN data, Protein, Log Scale, Correlation = 0.695"; horizontal axis "First Protein Biomarker, Attenuation = 0.694," vertical axis "Second Protein Biomarker."]
Figure 1.2 OPEN Study data, scatterplot of the logarithm of the first and sec-
ond protein biomarker measurements. The fact that there is scatter means that
the biomarker has measurement error.
jth
biomarker log-protein measurement. Then the classical measurement
error model states that
Wij = Xi + Uij. (1.1)
In this model, Wij is an unbiased measure of Xi, so that Uij must have
mean zero, that is, in symbols, E(Uij|Xi) = 0. The error structure of Uij
could be homoscedastic (constant variance) or heteroscedastic. In this
particular example, we will show later, in Section 1.7, that the measure-
ment error structure is approximately normal with constant variance, so
we can reasonably think that Uij|Xi ∼ Normal(0, σ²u).
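Under model (1.1) with two replicates per subject, var(Wi1 − Wi2) = 2σ²u, so the error variance, and hence the attenuation reported in Figure 1.2, can be estimated by simple moment calculations. The sketch below is our own illustration: it uses simulated data as a stand-in for the OPEN biomarker measurements, and the variable names are hypothetical.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 297  # number of subjects with replicated biomarker measurements

    # Simulated stand-in for true log-protein intake X and replicates W_ij = X_i + U_ij.
    sigma_x, sigma_u = 0.5, 0.33
    x = rng.normal(5.7, sigma_x, n)
    w1 = x + rng.normal(0.0, sigma_u, n)
    w2 = x + rng.normal(0.0, sigma_u, n)

    # Under model (1.1), var(W_i1 - W_i2) = 2 * sigma_u^2.
    sigma2_u_hat = np.var(w1 - w2, ddof=1) / 2.0

    # var(W) = var(X) + var(U), so the attenuation is var(X) / var(W).
    sigma2_w_hat = np.var(np.concatenate([w1, w2]), ddof=1)
    attenuation = (sigma2_w_hat - sigma2_u_hat) / sigma2_w_hat

    print("estimated error variance:", sigma2_u_hat)
    print("estimated attenuation:   ", attenuation)

The same moment reasoning, applied to the actual OPEN replicates, underlies the attenuation value of about 0.69 quoted in the figure.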
1.3 Measurement Error Examples
Nonlinear measurement error models commonly begin with an underly-
ing nonlinear model for the response Y in terms of the predictors. We
distinguish between two kinds of predictors: Z represents those predic-
tors that, for all practical purposes, are measured without error, and X
those that cannot be observed exactly for all study subjects. The distin-
guishing feature of a measurement error problem is that we can observe
a variable W, which is related to an unobservable X. The parameters in
the model relating Y and (Z, X) cannot, of course, be estimated directly
by fitting Y to (Z, X), since X is not observed. The goal of measurement
error modeling is to obtain nearly unbiased estimates of these parame-
ters indirectly by fitting a model for Y in terms of (Z, W). Attainment
of this goal requires careful analysis. Substituting W for X, but making
no adjustments in the usual fitting methods for this substitution, leads
to estimates that are biased, sometimes seriously, see Figure 1.1. The
problem here is that the parameters of the regression of Y on (Z, W)
are different from those of Y on (Z, X).
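To see the bias concretely in the simplest case, consider simple linear regression with classical error: the naive least-squares slope from regressing Y on W estimates λβx, where λ = σ²x/(σ²x + σ²u) is the attenuation developed in Section 3.2.1. The following is a minimal sketch of our own, not code from the book.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 10_000
    beta_x, sigma_x, sigma_u = 1.0, 1.0, 1.0  # attenuation lambda = 1/(1+1) = 0.5

    x = rng.normal(0.0, sigma_x, n)
    y = beta_x * x + rng.normal(0.0, 0.3, n)
    w = x + rng.normal(0.0, sigma_u, n)       # classical measurement error, model (1.1)

    slope_true = np.polyfit(x, y, 1)[0]       # consistent for beta_x
    slope_naive = np.polyfit(w, y, 1)[0]      # consistent for lambda * beta_x

    print("slope using X:       ", slope_true)
    print("naive slope using W: ", slope_naive)

With σ²x = σ²u = 1, the naive slope settles near 0.5 rather than the true value 1.0, which is exactly the kind of bias the methods in this book are designed to remove.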
In assessing measurement error, careful attention must be given to the
type and nature of the error, and the sources of data that allow modeling
of this error. The following examples illustrate some of the different types
of problems considered in this book.
1.4 Radiation Epidemiology and Berkson Errors
There are many studies relating radiation exposure to disease, including
the Nevada Test Site (NTS) Thyroid Disease Study and the Hanford
Thyroid Disease Study (HTDS). Full disclosure: One of us (RJC) was
involved in litigation concerning HTDS, and his expert report is avail-
able at http://www.downwinders.com/files/htds expert report.pdf, the
plaintiffs’ Web site, at least as of May 2005.
Stevens, Till, Thomas, et al. (1992); Kerber, Till, Simon, et al. (1993);
and Simon, Till, Lloyd, et al. (1995) described the Nevada test site study,
where radiation exposure largely came as the result of above-ground
nuclear testing in the 1950s. Similar statistical issues arise in the Hanford
Thyroid Disease Study: see Davis, Kopecky, Stram, et al. (2002); Stram
and Kopecky (2003); and Kopecky, Davis, Hamilton, et al. (2004), where
radiation was released in the 1950s and 1960s. In the Nevada study,
over 2,000 individuals who were exposed to radiation as children were
examined for thyroid disease. The primary radiation exposure came from
milk and vegetables. The idea of the study was to relate various thyroid
disease outcomes to radiation exposure to the thyroid.
Of course, once again, since this is a book about measurement er-
ror, the main exposure of interest, radiation to the thyroid, cannot be
observed exactly. What is typical in these studies is to build a large
dosimetry model that attempts to convert the known data about the
above-ground nuclear tests to radiation actually absorbed into the thy-
roid. Dosimetry calculations in NTS were based on age at exposure,
gender, residence history, x-ray history, whether the individual was
breast-fed as a child, and a diet questionnaire filled out by the parent, focus-
ing on milk consumption and vegetables. The data were then input into
a complex model and, for each individual, the point estimate of thyroid
dose and an associated standard error for the measurement error were
reported. Roughly similar considerations led to the dose estimates and
uncertainties in HTDS.
In both NTS and HTDS, the authors consider analyses taking into
account the uncertainties (measurement error) in dose estimates. Indeed,
both consider the classical measurement error situation in (1.1). The
HTDS study, though, also considered a different type of measurement
error, and based most of its power calculations on it. We will go into
detail on the power and analysis issues; see Section 1.8.2 of this chapter
for power and Section 8.6 for the analysis.
What we see in the classical measurement error model (1.1) is that the
observed dose equals the true dose plus (classical) measurement error.
This, of course, means that the variability of the observed doses will
be greater than the variability of true doses. In HTDS, in contrast, the
authors not only consider this classical measurement error, but they also
turn the issue around; namely, they assumed that the true dose is equal
to the estimated dose plus measurement error. In symbols, this is
Xi = Wi + Ui, (1.2)
where E(Ui|Wi) = 0, so that the true dose has more variability than
the estimated dose; contrast with (1.1). Model (1.2) is called a Berkson
measurement error model, see Berkson (1950).
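To make the distinction concrete, the following small Python sketch (ours, not part of the original text; it assumes numpy and uses arbitrary simulation settings) generates data under the classical model (1.1) and the Berkson model (1.2) and compares the variances of the true and observed doses.

import numpy as np

rng = np.random.default_rng(1)
n = 100_000
sigma_u = 1.0                                     # measurement error standard deviation

# Classical error (1.1): W = X + U, so var(W) = var(X) + var(U).
x_classical = rng.normal(0.0, np.sqrt(2.0), n)    # unobserved true doses
w_classical = x_classical + rng.normal(0.0, sigma_u, n)

# Berkson error (1.2): X = W + U with E(U | W) = 0, so var(X) = var(W) + var(U).
w_berkson = rng.normal(0.0, np.sqrt(2.0), n)      # assigned or estimated doses
x_berkson = w_berkson + rng.normal(0.0, sigma_u, n)

print(np.var(w_classical), np.var(x_classical))   # observed more variable than truth
print(np.var(x_berkson), np.var(w_berkson))       # truth more variable than observed

Under the classical model the observed W is noisier than the truth; under the Berkson model the opposite holds, which is the point exploited in the next subsection.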
1.4.1 The Difference Between Berkson and Classical Errors: How to
Gain More Power Without Really Trying
Measurement error modeling requires considerable care. In this section,
we discuss why it is crucial that one understands the seemingly subtle
differences between Berkson and classical errors, and we illustrate some
possible pitfalls when choosing between the two error models. As far as
we are aware, one cannot be put in jail for using the wrong model, but
an incorrect measurement error model often causes erroneous inferences,
which to a statistician is worse than going to jail (okay, we have exag-
gerated). In Section 2.2.2 we provide additional guidance so that the
reader can be confident of choosing the correct error model in his/her
own work.
The difference between Berkson and classical measurement error is
major when one is planning a study a priori, especially when one is
attempting power calculations. There are some technical similarities be-
tween classical and Berkson errors, see Section 3.2.2, but different issues
arise in power calculations. What we will indicate here is that for a given
measurement error variance, if you want to convince yourself that you
have lots of statistical power despite measurement error, just pretend
that the measurement error is Berkson and not classical.
Suppose that the observed data have a normal distribution with mean
zero and variance σw² = 2.0. Suppose also that the measurement error
has variance σu² = 1.0. Then, if one assumes a Berkson model, the true
doses have mean zero and variance σx² = 3.0. This is so because the
variance of X in (1.2) is the sum of the variance of W (σw² = 2.0) and
the variance of the Berkson measurement error U (σu² = 1.0). Now, in
major contrast, if one assumes that the measurement error is classical
instead of Berkson, then the variance of X is, from (1.1), the difference
of the variance of W (2.0) and the variance of the classical measurement
error U (1.0), that is, 1.0. In other words, if we assume Berkson error, we
think that the true dose X has variance 3.0, while if we assume classical
measurement error, we think that the variance of the true dose equals
1.0, a feature reflected in Figure 1.3. Now, for a given set of parameter
values of risk, it is generally the case that the power increases when the
variance of true exposure X increases. Hence, assuming Berkson when
the error is classical leads to a grossly optimistic overstatement of power.
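The planning arithmetic of the last paragraph is summarized in the short Python sketch below (ours; the values 2.0 and 1.0 are simply the hypothetical variances used above).

# Hypothetical planning inputs: var(W) = 2.0, var(U) = 1.0.
var_w, var_u = 2.0, 1.0

# Berkson (1.2): X = W + U, so the implied variance of true dose is the sum.
var_x_berkson = var_w + var_u        # 3.0

# Classical (1.1): W = X + U, so the implied variance of true dose is the difference.
var_x_classical = var_w - var_u      # 1.0

print(var_x_berkson, var_x_classical)   # 3.0 versus 1.0: the Berkson assumption implies
                                        # three times the exposure variability, hence more power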
[Figure 1.3 here: two histograms of estimated true doses, top panel “Your Estimates Of True Doses If Error Is Berkson,” bottom panel “Your Estimates Of True Doses If Error Is Classical.”]
Figure 1.3 A hypothetical example where the observed doses W have mean zero
and variance 2.0, while the measurement errors have mean zero and variance
1.0. Displayed are the distributions of true dose that you think you have if you
think that the errors are Berkson (top) or if you think the errors are classical
(bottom). The much smaller variability of true dose under the classical model
indicates that the power for detecting effects will be much smaller than if the
errors are Berkson.
[Figure 1.4 here: scatterplot of the Dietary History Questionnaire versus the Energy Biomarker, “OPEN data, Energy, Log Scale, Correlation = 0.28.”]
Figure 1.4 OPEN Study data, scatterplot of the logarithm of energy (calories)
using a food frequency questionnaire and a biomarker.
Further discussion of differences and similarities between power in
classical and Berkson error models can be found in Section B.1.
1.5 Classical Measurement Error Model Extensions
It almost goes without saying, but we will say it, that measurement error
models can be more complex than the classical additive measurement
error model (1.1) or the classical Berkson error model (1.2). Here we
illustrate some of the complexities of measurement error modeling via
an important nutrition biomarker study.
The study of diet and disease has been a major motivation for nonlin-
ear measurement error modeling. In these studies, it is typical to mea-
sure diet via a self–report instrument, for example, a food frequency
questionnaire (FFQ), some sort of diary, or a 24-hour recall interview. It
has been appreciated for decades that these self-report instruments are
only imperfect measures of long-term dietary intakes, and hence that
measurement error is a major concern.
To understand the profound nature of measurement error in this con-
text, we consider the National Cancer Institute’s OPEN study, which
is one of the largest biomarker studies ever done; see Subar, Kipnis,
Troiano, et al. (2003) and Kipnis, Midthune, Freedman, et al. (2003).
We illustrate this measurement error with energy (caloric) intake mea-
[Figure 1.5 here: histograms of “Biomarker for Calories” (top) and “FFQ for Calories” (bottom).]
Figure 1.5 OPEN Study data, histograms of energy (calories) using a
biomarker (top panel) and a food frequency questionnaire (bottom panel). Note
how individuals report far fewer calories than they actually consume.
sures. In the OPEN Study, energy intake was measured by the dietary
history questionnaire, an FFQ described in Subar, Thompson, Kipnis, et
al. (2001). In keeping with our notation, since the FFQ is not the truth,
we will denote by W the log energy intake as measured by the FFQ. In
addition, the investigators obtained a near-perfect biomarker measure
of energy intake using a technique called doubly-labeled water (DLW),
which we call X. DLW is basically what it sounds like: Participants drink
water that is enriched with respect to two isotopes, and urine samples
allow the measurement of energy expenditure.
That true intake X and observed intake W can be very different is
seen in Figure 1.4, where we plot the FFQ versus the biomarker along
with the associated least squares line. The correlation between truth
and observed is only 0.28, indicating that the FFQ is not a very good
measure of energy intake. It is also interesting to note the histograms for
these two instruments; see Figure 1.5. One can see there that the FFQ is
also clearly badly biased downward in general for energy intake, that is,
people eat more calories than they are willing to report (no surprise!).
In this example, because of the biases seen in Figures 1.4 and 1.5, the
FFQ is not an unbiased measure of true energy intake, and hence the
classical measurement error model (1.1) clearly does not hold. A more
reasonable model, promoted in a series of papers by Kipnis et al. (1999,
2001, 2003), is to allow for bias as well as variance components
Wij = γ0 + γ1Xij + Uij, (1.3)
Uij = ri + εij,
where ri ∼ Normal(0, σr²) and εij ∼ Normal(0, σε²). In model (1.3), the
linear regression in true intake reflects the biases of the FFQ. The struc-
ture of the measurement error random variables Uij is that they have two
components: a shared component r and a random component ε. Kipnis
et al. (1999, 2001, 2003) call the shared component person-specific bias,
reflecting the idea that two people who eat exactly the same foods will
nonetheless systematically report intakes differently when given multiple
FFQs. Fuller (1987) calls the person-specific bias an equation error.
Of course, if γ0 = 0, γ1 = 1, and ri ≡ 0, then we have the standard
classical measurement error model (1.1).
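A rough Python sketch of model (1.3) is given below (our illustration with arbitrary parameter values, not estimates from the OPEN study); it generates two FFQ reports per person and shows both the attenuated correlation with truth and the within-person correlation induced by the person-specific bias ri.

import numpy as np

rng = np.random.default_rng(2)
n = 50_000
gamma0, gamma1 = 1.0, 0.6            # hypothetical intercept and slope bias of the FFQ
sigma_r, sigma_eps = 0.4, 0.5        # person-specific and within-person error s.d.

x = rng.normal(7.5, 0.5, n)          # "true" log intake
r = rng.normal(0.0, sigma_r, n)      # person-specific bias, shared across repeat FFQs

# Model (1.3): Wij = gamma0 + gamma1 * Xij + ri + eps_ij, for j = 1, 2.
w1 = gamma0 + gamma1 * x + r + rng.normal(0.0, sigma_eps, n)
w2 = gamma0 + gamma1 * x + r + rng.normal(0.0, sigma_eps, n)

print(np.corrcoef(x, w1)[0, 1])      # correlation of one FFQ with truth is well below 1
u1, u2 = w1 - gamma0 - gamma1 * x, w2 - gamma0 - gamma1 * x
print(np.corrcoef(u1, u2)[0, 1])     # errors on repeat FFQs are correlated through r

Setting gamma0 = 0, gamma1 = 1, and sigma_r = 0 in the sketch reduces it to the classical model (1.1).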
1.6 Other Examples of Measurement Error Models
1.6.1 NHANES
The NHANES-I Epidemiologic Study Cohort data set (Jones, Schatzkin,
Green, et al., 1987) is a cohort study originally consisting of 8,596 women
who were interviewed about their nutrition habits and later examined for
evidence of cancer. We restrict attention to a subcohort of 3,145 women
aged 25–50 who have no missing data on the variables of interest.
The response Y indicates the presence of breast cancer. The predictor
variables Z, assumed to be measured without significant error, include
the following: age, poverty index ratio, body mass index, alcohol (Yes,
No), family history of breast cancer, age at menarche, and menopausal
status. We are primarily interested in the effects of nutrition variables
X that are known to be imprecisely measured, for example, “long-term”
saturated fat intake.
If all these underlying variables were observable, then a standard lo-
gistic regression analysis would be performed. However, it is both diffi-
cult and expensive to measure long-term diet in a large cohort. In the
NHANES data, instead of observing X, the measured W was a 24-hour
recall, that is, each participant’s diet in the previous 24 hours was re-
called and nutrition variables computed. That the measurement error is
large in 24-hour recalls has been documented previously (Beaton, Mil-
nor, & Little, 1979; Wu, Whittemore, & Jung, 1986). Indeed, there is
evidence to support the conclusion that more than half of the variability
in the observed data is due to measurement error.
There are several sources of the measurement error. First, there is
the error in the ascertainment of food consumption in the previous 24
hours, especially amounts. Some of this type of error is purely random,
while another part is due to systematic bias, for example, some people
resist giving an accurate description of their consumption of snacks. The
size of potential systematic bias can be determined in some instances
(Freedman, Carroll, & Wax, 1991), but in the present study we have
available only the 24-hour recall information, and any systematic bias is
unidentifiable.
The major source of “error” is the fact that a single day’s diet does
not serve as an adequate measure of the previous year’s diet. There
are seasonal differences in diet, as well as day-to-day variations. This
points out the fact that measurement error is much more than simple
recording or instrument error and encompasses many different sources
of variability.
There is insufficient information in the NHANES data to model mea-
surement error directly. Instead, the measurement error structure was
modeled using an external data set, the CSFII (Continuing Survey of
Food Intakes by Individuals) data (Thompson, Sowers, Frongillo, et al.,
1992). The CSFII data contain the 24-hour recall measures W, as well
as three additional 24-hour recall phone interviews. Using external data,
rather than assessing measurement error on an internal subset of the
primary study, entails certain risks that we discuss later in this chapter.
The basic problem is that parameters in the external study may differ
from parameters in the primary study, leading to bias when external
estimates are transported to the primary study.
1.6.2 Nurses’ Health Study
While the OPEN Study focused on the properties of instruments for
measuring nutrient intakes, the real interest is in relating disease and
nutrient intakes. A famous and still ongoing study concerning nutrition
and breast cancer has been considered by Rosner, Willett, & Spiegelman
(1989) and Rosner, Spiegelman, & Willett (1990), namely, the Nurses’
Health Study. The study has over 80,000 participants and includes many
breast cancer cases. The variables are much the same as in the OPEN
study, with the exceptions that (1) alcohol is assessed differently and
(2) a food-frequency questionnaire was used instead of 24-hour recall
interviews. The size of the measurement error in the nutrition variables
is still quite large. Here, X = (long-term average alcohol intake, long-
term average nutrient intake) and W = (alcohol intake measured by
FFQs, nutrient intake measured by FFQs). It is known that W is both
highly variable and biased as an estimator of X.
The Nurses’ Health Study was designed so that a direct assessment of
measurement error is possible. Specifically, 173 nurses recorded alcohol
and nutrient intakes in diary form for four different weeks over the course
of a year. The average, T, of these diary entries is taken to be an unbiased
estimate of X. We will call T a second measure of X. Thus, in contrast
to NHANES, measurement error was assessed on data internal to the
primary study. Because T is unbiased for X, E(T|W) = E(X|W), so
we can estimate E(X|W) by regressing T on W. Estimating E(X|W)
is the crucial first step in regression calibration, a widely used method
of correcting for measurement error; see Chapter 4.
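The following Python sketch illustrates that first step in a simulated setting (ours, not the Nurses' Health Study data): because T is unbiased for X, the least squares regression of T on W estimates E(X|W).

import numpy as np

rng = np.random.default_rng(3)
n = 10_000
x = rng.normal(10.0, 2.0, n)          # unobserved true long-term intake
w = x + rng.normal(0.0, 2.0, n)       # error-prone FFQ-type measurement
t = x + rng.normal(0.0, 1.0, n)       # unbiased second measure (e.g., diary average)

# Regress T on W; the fitted line estimates the regression calibration function E(X | W).
slope, intercept = np.polyfit(w, t, 1)
x_calibrated = intercept + slope * w  # imputed values to use in place of W in the outcome model

print(intercept, slope)               # in this simulated setting the theoretical values are 5.0 and 0.5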
1.6.3 The Atherosclerosis Risk in Communities Study
The Atherosclerosis Risk in Communities (ARIC) study is a multipur-
pose prospective cohort study described in detail by The ARIC Investiga-
tors (1989). From 1987 through 1989, 15,792 male and female volunteers
were recruited from four U.S. communities (Forsyth County, NC; subur-
ban Minneapolis, MN; Washington County, MD; and Jackson, MS) for a
baseline visit including at-home interviews, clinic examination, and lab-
oratory measurements. Participants returned approximately every three
years for second (1990–1992), third (1993–1995), and fourth (1996–1998)
visits. Time to event data were obtained from annual participant in-
terviews and review of local hospital discharge lists and county death
certificates. The “event” was chronic kidney disease (CKD).
One purpose of the study was to explain the race effect on the progres-
sion of CKD. In particular, African-Americans have maintained approx-
imately four times the age- and sex-adjusted rate of end-stage renal dis-
ease (ESRD) compared to whites during the last two decades (USRDS,
2003), while the prevalence of decreased kidney function (CKD Stage 3)
in the U.S. is lower among African-Americans than whites. These pat-
terns suggest that African-Americans progress faster through the
different stages of kidney disease.
In Chapter 14 we investigate the race effect on the probability of
progression to CKD using a survival data approach. An important con-
founder is the baseline kidney function, which is typically measured by
the estimated glomerular filtration rate (eGFR), which is a noisy version
of GFR obtained from a prediction equation. The nature of the adjust-
ment is more complex because of the nonmonotonic relationship between
eGFR and progression probability.
1.6.4 Bioassay in a Herbicide Study
Rudemo, Ruppert, & Streibig (1989) consider a bioassay experiment
with plants, in which eight herbicides were applied. For each of these
eight combinations, six (common) nonzero doses were applied and the
dry weight Y of five plants grown in the same pot was measured. In
this instance, the predictor variable X of interest is the amount of the
herbicide actually absorbed by the plant, a quantity that cannot be mea-
sured. Here the response is continuous, and if X were observable, then
a nonlinear regression model would have been fit, probably by nonlin-
ear least squares. The four-parameter logistic model (not to be confused
with logistic regression where the response is binary) is commonly used.
However, X is not observable; instead, we know only the nominal
concentration W of herbicide applied to the plant. The sources of error
include not only the error in diluting to the nominal concentration, but
also the fact that two plants receiving the same amount of herbicide may
absorb different amounts.
In this example, the measurement error was not assessed directly. In-
stead, the authors assumed that the true amount X was linearly related
to the nominal amount W with nonconstant variance. This error model,
combined with the approach discussed in Chapter 4, was used to con-
struct a new model for the observed data.
1.6.5 Lung Function in Children
Tosteson, Stefanski, & Schafer (1989) described an example in which
the response was the presence (Y = 1) or absence (Y = 0) of wheeze
in children, which is an indicator of lung dysfunction. The predictor
variable of interest is X = personal exposure to NO2. Since Y is a binary
variable, if X were observable, the authors would have used logistic or
probit regression to model the relationship of Y and X. However, X
was not available in their study. Instead, the investigators were able
to measure a bivariate variable W, consisting of observed kitchen and
bedroom concentrations of NO2 in the child’s home. School-aged children
spend only a portion of their time in their homes, and only a portion
of that time in their kitchens and bedrooms. Thus, it is clear that the
true NO2 concentration is not fully explained by what happens in the
kitchen and bedroom.
While X was not measured in the primary data set, two independent,
external studies were available in which both X and W were observed.
We will describe this example in more detail later in this chapter.
1.6.6 Coronary Heart Disease and Blood Pressure
The Framingham study (Kannel, Neaton, Wentworth, et al., 1986) is a
large cohort study following individuals for the development Y of coro-
nary heart disease. The main predictor of interest in the study is systolic
blood pressure, but other variables include age at first exam, body mass,
serum cholesterol, and whether or not the person is a smoker. In princi-
ple at least, Z consists only of age, body mass, and smoking status, while
the variables X measured with error are serum cholesterol and systolic
blood pressure. It should be noted that in a related analysis MacMahon,
Peto, Cutler, et al. (1990) consider only the last as a variable measured
with error. We will follow this convention in our discussion.
Again, it is impossible to measure long-term systolic blood pressure
X. Instead, what is available is the blood pressure W observed during a
clinic visit. The reason that the long-term X and the single-visit W differ
is that blood pressure has major daily, as well as seasonal, variation.
Generally, the classical measurement error model (1.1) is used in this
context.
In this experiment, we have an extra measurement of blood pressure
T from a clinic visit taken 4 years before W was observed. Hence, unlike
any of the other studies we have discussed, in the Framingham study we
have information on measurement error for each individual. One can look
at T as simply a replicate of W. However, T may be a biased measure
of X because of temporal changes in the distribution of blood pressure
in the population. Each way of looking at the data is useful and leads
to different methods of analysis.
1.6.7 A-Bomb Survivors Data
Pierce, Stram, Vaeth, et al. (1992) considered analysis of A-bomb sur-
vivor data from the Hiroshima and Nagasaki explosions. They discuss
various responses Y, including the number of chromosomal aberrations.
The true radiation dose X cannot be measured; instead, estimates W
are available. They assume, as an approximation, that W = 0 if and
only if X = 0. They adopt a fully parametric approach, specifying that
when X and W are positive, then W is lognormal with median X and
coefficient of variation of 30%. They assume that if X is positive, it has a
Weibull distribution. In symbols, they propose the multiplicative model
W = X U, log(U) ∼ Normal(µu, σu²),
where log(U) is normally distributed with mean zero and variance 0.0862.
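A minimal Python sketch of this multiplicative model follows (our illustration; the Weibull parameters for X are arbitrary, while the log-scale error variance 0.0862 is the value quoted above).

import numpy as np

rng = np.random.default_rng(4)
n = 100_000

x = 2.0 * rng.weibull(1.5, n)                                 # hypothetical positive true doses
u = rng.lognormal(mean=0.0, sigma=np.sqrt(0.0862), size=n)    # multiplicative lognormal error
w = x * u                                                     # estimated dose W = X * U

# Since log(U) has mean zero, W has median X given X, with coefficient of variation near 30%.
print(np.std(u) / np.mean(u))                                 # approximately 0.30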
1.6.8 Blood Pressure and Urinary Sodium Chloride
Liu & Liang (1992) described a problem of logistic regression where the
response Y is the presence of high systolic blood pressure (greater than
140). However, in this particular study blood pressure was measured
many times and the average recorded, so that the amount of measure-
ment error in the average systolic blood pressure is reasonably small. The
predictors Z measured without error are age and body mass index. The
predictor X subject to measurement error is urinary sodium chloride,
which is subject to error because of intra-individual variation over time
and also possibly due to measurement error in the chemical analyses. In
order to understand the effects of measurement error, 24-hour urinary
sodium chloride was measured on 6 consecutive days.
1.6.9 Multiplicative Error for Confidentiality
Hwang (1986) used survey data released by the U. S. Department of
Energy on energy consumption by U. S. households. The exact values
of certain variables, for example, heating and cooling degree days, were
not given since this information might allow the homeowners to be iden-
tified. Instead the Department of Energy multiplied these variables by
computer-generated random numbers. The Department of Energy re-
leased the method for generating the random errors, so this is a rare
case where the error distribution is known exactly.
1.6.10 Cervical Cancer and Herpes Simplex Virus
In this example, the question is whether exposure to herpes simplex virus
increases the risk of cervical cancer. The data are listed in Carroll, Gail,
& Lubin (1993). The response Y is the indicator of invasive cervical
cancer, X is exposure to herpes simplex virus, type 2 (HSV-2) measured
by a refined western blot procedure, and W is exposure to HSV-2 mea-
sured by the western blot procedure. See Hildesheim, Mann, Brinton, et
al. (1991) for biological background to this problem. There are 115 com-
plete observations where (Y, X, W) is observed and 1,929 incomplete
observations where only (Y, W) is observed. There are 39 cases (Y = 1)
among the complete data and 693 cases among the incomplete data.
Among the complete data, there is substantial misclassification, that is,
observations where X ≠ W. Also, there is evidence of differential er-
ror, meaning that the probability of misclassification depends on the re-
sponse, that is, P(X = W|X = x, Y = 0) ≠ P(X = W|X = x, Y = 1).
1.7 Checking the Classical Error Model
Suppose that the classical error additive measurement error model (1.1)
holds, and that the errors U are symmetric and have constant vari-
ance in both X and any covariates Z measured without error, that is,
var(U|Z, X) = σ² (a constant). Then, if the instrument W can be repli-
cated, the sample standard deviation of the W-values for an individual
is uncorrelated with the individual mean, and it is also uncorre-
lated with Z. Further, suppose that these errors are normally distributed.
Then differences of the replicates within an individual are normally dis-
tributed. This leads to simple graphical devices:
• Plot the sample standard deviation of the W-values for an individual
against her/his sample mean, call it W̄. If there are no obvious trends,
this suggests that the measurement error variance does not depend
on X.
• Plot the sample standard deviation of the W-values for an individ-
ual against her/his covariates Z. If there are no obvious trends, this
suggests that the measurement error variance does not depend on Z.
• Form the differences between replications within an individual, and
then form a q-q plot of these differences across individuals. If the
q-q plot shows no evidence of nonnormality, this suggests that the
measurement errors are also roughly normally distributed.
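These checks are straightforward to compute from replicate measurements. The Python sketch below (ours, assuming numpy, scipy, and matplotlib, with the replicates stored as an n × k array W and an optional error-free covariate Z) produces the three plots just described.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def classical_error_checks(W, Z=None):
    """Graphical checks of the classical error model from replicates W (n x k, k >= 2)."""
    means = W.mean(axis=1)                # within-person means
    sds = W.std(axis=1, ddof=1)           # within-person standard deviations

    fig, axes = plt.subplots(1, 3, figsize=(12, 4))

    # 1. s.d. versus mean: a trend suggests the error variance depends on X.
    axes[0].scatter(means, sds, s=5)
    axes[0].set_xlabel("within-person mean of W")
    axes[0].set_ylabel("within-person s.d. of W")

    # 2. s.d. versus Z: a trend suggests the error variance depends on Z.
    if Z is not None:
        axes[1].scatter(Z, sds, s=5)
        axes[1].set_xlabel("Z")
        axes[1].set_ylabel("within-person s.d. of W")

    # 3. Normal q-q plot of within-person differences: curvature suggests nonnormal errors.
    diffs = W[:, 0] - W[:, 1]
    stats.probplot(diffs, dist="norm", plot=axes[2])

    plt.tight_layout()
    plt.show()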
[Figure 1.6 here: plot of the s.d. of the protein biomarkers versus the mean of the protein biomarkers, “OPEN data, Protein, Original Scale, Constant Variance Plot.”]
Figure 1.6 OPEN Study data, plot of the within-individual standard deviation
versus mean of the actual untransformed protein biomarkers. The obvious re-
gression slope indicates that the variance of the measurement error depends on
true protein intake.
For example, consider the protein biomarker in the OPEN study; see
Section 1.2. In Figure 1.6 we plot the standard deviation of the replicates
versus the mean in the original protein scale. The fact that there is an
obvious regression slope and the standard deviation of the biomarker
varies by a factor of four over the range of the biomarker’s mean is
strong evidence that, at the very least, the variance of the measurement
error depends on true intake.
[Figure 1.7 here: plot of the s.d. of the protein biomarkers versus the mean of the protein biomarkers, “OPEN data, Protein, Log Scale, Constant Variance Plot.”]
Figure 1.7 OPEN Study data, plot of the within-individual standard deviation
versus mean of the log protein biomarkers. The lack of any major regression
slope indicates approximately constant variance measurement error.
[Figure 1.8 here: “QQ Plot of Log Protein Biomarker Differences, OPEN Study,” differences plotted against Standard Normal Quantiles.]
Figure 1.8 OPEN Study data, q-q plot of the differences of the log protein
biomarkers. The nearly straight line of the data indicates nearly normally dis-
tributed measurement errors.
[Figure 1.9 here: normal q-q plot of the differences, plotted against Standard Normal Quantiles.]
Figure 1.9 Normal q-q plot of the differences between independent Lognor-
mal(0,1) random variables, n = 200.
A standard way to remove nonconstant variability is via a transfor-
mation, and the obvious first attempt is to take logarithms. Figure 1.7
is the standard deviation versus the mean plot in this transformed scale.
In contrast to Figure 1.6, here we see no major trend, suggesting that
the transformation was successful in removing most of the nonconstant
variation. Figure 1.8 gives the q-q plot of the differences: this is not a
perfect straight line, but it is reasonably close to straight, suggesting
that the transformation has also helped make the data much closer to
normally distributed.
Using differences between replicates to assess normality has its pitfalls.
The difference between two iid random variables has a symmetric dis-
tribution even when the random variables themselves are highly skewed.
Thus, nonnormality of measurement errors is somewhat hidden by using
differences. For example, Figure 1.9 is a normal q-q plot of the differ-
ences between 200 pairs of Lognormal(0,1) random variables; see Section
A.2 for the lognormal distribution. Note that the q-q plot shows no sign
of asymmetry. Nonnormality is evident only in the presence of heavier-
than-Gaussian tails.
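The phenomenon in Figure 1.9 is easy to reproduce; a small Python sketch (ours) is given below.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
u1 = rng.lognormal(0.0, 1.0, 200)     # highly right-skewed "errors"
u2 = rng.lognormal(0.0, 1.0, 200)
d = u1 - u2                           # differences between iid replicates

print(stats.skew(u1))                 # strongly positive: the errors themselves are skewed
print(stats.skew(d))                  # near zero: differencing hides the skewness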
1.8 Loss of Power
Classical measurement error causes loss of power, sometimes a profound
loss of power. We illustrate this in two situations: linear regression and
radiation epidemiology.
1.8.1 Linear Regression Example
[Figure 1.10 here: plot of power for a one-sided test versus measurement error variance, “Simulation: Loss of Power With Increasing Classical Measurement Error.”]
Figure 1.10 An illustration of the loss of power when there is classical mea-
surement error. When X is observed, the measurement error variance = 0.0,
and the power is 90%. When X is not observed and the measurement error
variance = 1.0, 1/2 of the variability of the observed W is due to noise, and
the power is only 62%. When 2/3 of the variability of W is due to noise, the
power is only 44%.
Here we consider the simple linear regression model
Yi = β0 + βxXi + εi,
where β0 = 0.0, βx = 0.69, var(X) = var(ε) = 1.0, and the sample
size is n = 20. The results here are based on exact calculations using
the program nQuery Advisor. The slope was chosen so that, when X is
observed, there is approximately 90% power for a one-sided test of the
null hypothesis H0 : βx = 0.
We added classical measurement error to the true Xs using the model
(1.1), where we varied the variance of the measurement errors U from
0.0 to 2.0. When var(U) = 0.0, we are in the case that there is no clas-
sical measurement error, and the power is 90%. When the measurement
error variance is var(U) = 1.0, this means that the observed predictors
have variance var(W) = var(X) + var(U) = 2.0, and hence 1/2 of the
variability in the observed predictors is due to noise. At the extreme
with var(U) = 2.0, 2/3 of the variability in the observed predictors is
due to noise.
The results are displayed in Figure 1.10. Here we see that while the
power would be 90% if X could be observed, when the measurement
error variance equals the variance of X, and hence 1/2 of the variability
in W is due to noise, the power crashes to 62%. Even worse, when 2/3
of the variability in the observed W is noise, the power falls below 50%.
This is the first of the double whammy of measurement error; see Section
1.1.
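The qualitative behavior in Figure 1.10 can be reproduced by simulation; a rough Python sketch follows (ours, not the exact nQuery Advisor calculation; it simply refits the regression on the error-prone W and counts rejections of the one-sided test).

import numpy as np
from scipy import stats

def simulated_power(var_u, n=20, beta_x=0.69, n_sims=5000, alpha=0.05, seed=0):
    """Monte Carlo power of the one-sided test of H0: beta_x = 0 when W = X + U replaces X."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        x = rng.normal(0.0, 1.0, n)
        y = beta_x * x + rng.normal(0.0, 1.0, n)
        w = x + rng.normal(0.0, np.sqrt(var_u), n)     # error-prone substitute for X
        res = stats.linregress(w, y)
        p_one_sided = res.pvalue / 2 if res.slope > 0 else 1 - res.pvalue / 2
        rejections += p_one_sided < alpha
    return rejections / n_sims

for var_u in (0.0, 1.0, 2.0):
    print(var_u, simulated_power(var_u))    # power falls as the measurement error variance grows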
[Figure 1.11 here: plot of sample size versus measurement error variance, “Sample Size for 90% Power With Increasing Classical Measurement Error.”]
Figure 1.11 The sample size version of Figure 1.10. When there is no mea-
surement error, the sample size needed for 90% power is n = 20. When X is
not observed and the measurement error variance = 1.0, 1/2 of the variability
of the observed W is due to noise, the necessary sample size for 90% power
more than doubles to n = 45. When 2/3 of the variability of W is due to noise,
the required sample size is n > 70.
The flip side of a loss of power due to classical measurement error is
that sample sizes necessary to gain a given power can increase dramati-
cally. The following power calculations were done assuming all variances
are known, and so should be interpreted qualitatively. In Figure 1.11, we
show that while only n = 20 is required for 90% power when there is no
measurement error, when 1/2 of the variability in the observed predictor
W is due to noise, we require at least n = 45 observations, an increase of
Project Gutenberg Literary Archive Foundation and how your
efforts and donations can help, see Sections 3 and 4 and the
Foundation information page at www.gutenberg.org.
Section 3. Information about the Project
Gutenberg Literary Archive Foundation
The Project Gutenberg Literary Archive Foundation is a non-
profit 501(c)(3) educational corporation organized under the
laws of the state of Mississippi and granted tax exempt status
by the Internal Revenue Service. The Foundation’s EIN or
federal tax identification number is 64-6221541. Contributions
to the Project Gutenberg Literary Archive Foundation are tax
deductible to the full extent permitted by U.S. federal laws and
your state’s laws.
The Foundation’s business office is located at 809 North 1500
West, Salt Lake City, UT 84116, (801) 596-1887. Email contact
links and up to date contact information can be found at the
Foundation’s website and official page at
www.gutenberg.org/contact
Section 4. Information about Donations to
the Project Gutenberg Literary Archive
Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission
of increasing the number of public domain and licensed works
that can be freely distributed in machine-readable form
accessible by the widest array of equipment including outdated
equipment. Many small donations ($1 to $5,000) are particularly
important to maintaining tax exempt status with the IRS.
The Foundation is committed to complying with the laws
regulating charities and charitable donations in all 50 states of
the United States. Compliance requirements are not uniform
and it takes a considerable effort, much paperwork and many
fees to meet and keep up with these requirements. We do not
solicit donations in locations where we have not received written
confirmation of compliance. To SEND DONATIONS or determine
the status of compliance for any particular state visit
www.gutenberg.org/donate.
While we cannot and do not solicit contributions from states
where we have not met the solicitation requirements, we know
of no prohibition against accepting unsolicited donations from
donors in such states who approach us with offers to donate.
International donations are gratefully accepted, but we cannot
make any statements concerning tax treatment of donations
received from outside the United States. U.S. laws alone swamp
our small staff.
Please check the Project Gutenberg web pages for current
donation methods and addresses. Donations are accepted in a
number of other ways including checks, online payments and
credit card donations. To donate, please visit:
www.gutenberg.org/donate.
Section 5. General Information About
Project Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could
be freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose
network of volunteer support.
Project Gutenberg™ eBooks are often created from several
printed editions, all of which are confirmed as not protected by
copyright in the U.S. unless a copyright notice is included. Thus,
we do not necessarily keep eBooks in compliance with any
particular paper edition.
Most people start at our website which has the main PG search
facility: www.gutenberg.org.
This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg
Literary Archive Foundation, how to help produce our new
eBooks, and how to subscribe to our email newsletter to hear
about new eBooks.
Measurement Error in Nonlinear Models, 2nd Edition, Raymond J. Carroll
  • 4. Measurement Error in Nonlinear Models A Modern Perspective Second Edition Monographs on Statistics and Applied Probability 105 © 2006 by Taylor & Francis Group, LLC
  • 5. MONOGRAPHS ON STATISTICS AND APPLIED PROBABILITY General Editors V. Isham, N. Keiding, T. Louis, S. Murphy, R. L. Smith, and H. Tong 1 Stochastic Population Models in Ecology and Epidemiology M.S. Barlett (1960) 2 Queues D.R. Cox and W.L. Smith (1961) 3 Monte Carlo Methods J.M. Hammersley and D.C. Handscomb (1964) 4 The Statistical Analysis of Series of Events D.R. Cox and P.A.W. Lewis (1966) 5 Population Genetics W.J. Ewens (1969) 6 Probability, Statistics and Time M.S. Barlett (1975) 7 Statistical Inference S.D. Silvey (1975) 8 The Analysis of Contingency Tables B.S. Everitt (1977) 9 Multivariate Analysis in Behavioural Research A.E. Maxwell (1977) 10 Stochastic Abundance Models S. Engen (1978) 11 Some Basic Theory for Statistical Inference E.J.G. Pitman (1979) 12 Point Processes D.R. Cox and V. Isham (1980) 13 Identification of Outliers D.M. Hawkins (1980) 14 Optimal Design S.D. Silvey (1980) 15 Finite Mixture Distributions B.S. Everitt and D.J. Hand (1981) 16 Classification A.D. Gordon (1981) 17 Distribution-Free Statistical Methods, 2nd edition J.S. Maritz (1995) 18 Residuals and Influence in Regression R.D. Cook and S. Weisberg (1982) 19 Applications of Queueing Theory, 2nd edition G.F. Newell (1982) 20 Risk Theory, 3rd edition R.E. Beard, T. Pentikäinen and E. Pesonen (1984) 21 Analysis of Survival Data D.R. Cox and D. Oakes (1984) 22 An Introduction to Latent Variable Models B.S. Everitt (1984) 23 Bandit Problems D.A. Berry and B. Fristedt (1985) 24 Stochastic Modelling and Control M.H.A. Davis and R. Vinter (1985) 25 The Statistical Analysis of Composition Data J. Aitchison (1986) 26 Density Estimation for Statistics and Data Analysis B.W. Silverman (1986) 27 Regression Analysis with Applications G.B. Wetherill (1986) 28 Sequential Methods in Statistics, 3rd edition G.B. Wetherill and K.D. Glazebrook (1986) 29 Tensor Methods in Statistics P. McCullagh (1987) 30 Transformation and Weighting in Regression R.J. Carroll and D. Ruppert (1988) 31 Asymptotic Techniques for Use in Statistics O.E. Bandorff-Nielsen and D.R. Cox (1989) 32 Analysis of Binary Data, 2nd edition D.R. Cox and E.J. Snell (1989) 33 Analysis of Infectious Disease Data N.G. Becker (1989) 34 Design and Analysis of Cross-Over Trials B. Jones and M.G. Kenward (1989) 35 Empirical Bayes Methods, 2nd edition J.S. Maritz and T. Lwin (1989) 36 Symmetric Multivariate and Related Distributions K.T. Fang, S. Kotz and K.W. Ng (1990) 37 Generalized Linear Models, 2nd edition P. McCullagh and J.A. Nelder (1989) © 2006 by Taylor & Francis Group, LLC
  • 6. 38 Cyclic and Computer Generated Designs, 2nd edition J.A. John and E.R. Williams (1995) 39 Analog Estimation Methods in Econometrics C.F. Manski (1988) 40 Subset Selection in Regression A.J. Miller (1990) 41 Analysis of Repeated Measures M.J. Crowder and D.J. Hand (1990) 42 Statistical Reasoning with Imprecise Probabilities P. Walley (1991) 43 Generalized Additive Models T.J. Hastie and R.J. Tibshirani (1990) 44 Inspection Errors for Attributes in Quality Control N.L. Johnson, S. Kotz and X. Wu (1991) 45 The Analysis of Contingency Tables, 2nd edition B.S. Everitt (1992) 46 The Analysis of Quantal Response Data B.J.T. Morgan (1992) 47 Longitudinal Data with Serial Correlation—A State-Space Approach R.H. Jones (1993) 48 Differential Geometry and Statistics M.K. Murray and J.W. Rice (1993) 49 Markov Models and Optimization M.H.A. Davis (1993) 50 Networks and Chaos—Statistical and Probabilistic Aspects O.E. Barndorff-Nielsen, J.L. Jensen and W.S. Kendall (1993) 51 Number-Theoretic Methods in Statistics K.-T. Fang and Y. Wang (1994) 52 Inference and Asymptotics O.E. Barndorff-Nielsen and D.R. Cox (1994) 53 Practical Risk Theory for Actuaries C.D. Daykin, T. Pentikäinen and M. Pesonen (1994) 54 Biplots J.C. Gower and D.J. Hand (1996) 55 Predictive Inference—An Introduction S. Geisser (1993) 56 Model-Free Curve Estimation M.E. Tarter and M.D. Lock (1993) 57 An Introduction to the Bootstrap B. Efron and R.J. Tibshirani (1993) 58 Nonparametric Regression and Generalized Linear Models P.J. Green and B.W. Silverman (1994) 59 Multidimensional Scaling T.F. Cox and M.A.A. Cox (1994) 60 Kernel Smoothing M.P. Wand and M.C. Jones (1995) 61 Statistics for Long Memory Processes J. Beran (1995) 62 Nonlinear Models for Repeated Measurement Data M. Davidian and D.M. Giltinan (1995) 63 Measurement Error in Nonlinear Models R.J. Carroll, D. Rupert and L.A. Stefanski (1995) 64 Analyzing and Modeling Rank Data J.J. Marden (1995) 65 Time Series Models—In Econometrics, Finance and Other Fields D.R. Cox, D.V. Hinkley and O.E. Barndorff-Nielsen (1996) 66 Local Polynomial Modeling and its Applications J. Fan and I. Gijbels (1996) 67 Multivariate Dependencies—Models, Analysis and Interpretation D.R. Cox and N. Wermuth (1996) 68 Statistical Inference—Based on the Likelihood A. Azzalini (1996) 69 Bayes and Empirical Bayes Methods for Data Analysis B.P. Carlin and T.A Louis (1996) 70 Hidden Markov and Other Models for Discrete-Valued Time Series I.L. Macdonald and W. Zucchini (1997) 71 Statistical Evidence—A Likelihood Paradigm R. Royall (1997) 72 Analysis of Incomplete Multivariate Data J.L. Schafer (1997) 73 Multivariate Models and Dependence Concepts H. Joe (1997) © 2006 by Taylor & Francis Group, LLC
  • 7. 74 Theory of Sample Surveys M.E. Thompson (1997) 75 Retrial Queues G. Falin and J.G.C. Templeton (1997) 76 Theory of Dispersion Models B. Jørgensen (1997) 77 Mixed Poisson Processes J. Grandell (1997) 78 Variance Components Estimation—Mixed Models, Methodologies and Applications P.S.R.S. Rao (1997) 79 Bayesian Methods for Finite Population Sampling G. Meeden and M. Ghosh (1997) 80 Stochastic Geometry—Likelihood and computation O.E. Barndorff-Nielsen, W.S. Kendall and M.N.M. van Lieshout (1998) 81 Computer-Assisted Analysis of Mixtures and Applications— Meta-analysis, Disease Mapping and Others D. Böhning (1999) 82 Classification, 2nd edition A.D. Gordon (1999) 83 Semimartingales and their Statistical Inference B.L.S. Prakasa Rao (1999) 84 Statistical Aspects of BSE and vCJD—Models for Epidemics C.A. Donnelly and N.M. Ferguson (1999) 85 Set-Indexed Martingales G. Ivanoff and E. Merzbach (2000) 86 The Theory of the Design of Experiments D.R. Cox and N. Reid (2000) 87 Complex Stochastic Systems O.E. Barndorff-Nielsen, D.R. Cox and C. Klüppelberg (2001) 88 Multidimensional Scaling, 2nd edition T.F. Cox and M.A.A. Cox (2001) 89 Algebraic Statistics—Computational Commutative Algebra in Statistics G. Pistone, E. Riccomagno and H.P. Wynn (2001) 90 Analysis of Time Series Structure—SSA and Related Techniques N. Golyandina, V. Nekrutkin and A.A. Zhigljavsky (2001) 91 Subjective Probability Models for Lifetimes Fabio Spizzichino (2001) 92 Empirical Likelihood Art B. Owen (2001) 93 Statistics in the 21st Century Adrian E. Raftery, Martin A. Tanner, and Martin T. Wells (2001) 94 Accelerated Life Models: Modeling and Statistical Analysis Vilijandas Bagdonavicius and Mikhail Nikulin (2001) 95 Subset Selection in Regression, Second Edition Alan Miller (2002) 96 Topics in Modelling of Clustered Data Marc Aerts, Helena Geys, Geert Molenberghs, and Louise M. Ryan (2002) 97 Components of Variance D.R. Cox and P.J. Solomon (2002) 98 Design and Analysis of Cross-Over Trials, 2nd Edition Byron Jones and Michael G. Kenward (2003) 99 Extreme Values in Finance, Telecommunications, and the Environment Bärbel Finkenstädt and Holger Rootzén (2003) 100 Statistical Inference and Simulation for Spatial Point Processes Jesper Møller and Rasmus Plenge Waagepetersen (2004) 101 Hierarchical Modeling and Analysis for Spatial Data Sudipto Banerjee, Bradley P. Carlin, and Alan E. Gelfand (2004) 102 Diagnostic Checks in Time Series Wai Keung Li (2004) 103 Stereology for Statisticians Adrian Baddeley and Eva B. Vedel Jensen (2004) 104 Gaussian Markov Random Fields: Theory and Applications Håvard Rue and Leonhard Held (2005) 105 Measurement Error in Nonlinear Models: A Modern Perspective, Second Edition Raymond J. Carroll, David Ruppert, Leonard A. Stefanski, and Ciprian M. Crainiceanu (2006) © 2006 by Taylor & Francis Group, LLC
  • 8. Measurement Error in Nonlinear Models Raymond J. Carroll David Ruppert Leonard A. Stefanski Ciprian M. Crainiceanu A Modern Perspective Second Edition Monographs on Statistics and Applied Probability 105 Boca Raton London New York Chapman & Hall/CRC is an imprint of the Taylor & Francis Group, an informa business © 2006 by Taylor & Francis Group, LLC
  • 9. Chapman & Hall/CRC Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 © 2006 by Taylor and Francis Group, LLC Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1 International Standard Book Number-10: 1-58488-633-1 (Hardcover) International Standard Book Number-13: 978-1-58488-633-4 (Hardcover) This book contains information obtained from authentic and highly regarded sources. Reprinted mate- rial is quoted with permission, and sources are indicated. A wide variety of references are listed. Reason- able efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use. No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any elec- tronic, mechanical, or other means, now known or hereafter invented, including photocopying, micro- filming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copy- right.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com C331X_Discl.indd 1 5/15/06 9:50:24 AM © 2006 by Taylor & Francis Group, LLC
  • 10. To our families and friends © 2006 by Taylor & Francis Group, LLC
  • 11. Preface to the First Edition This monograph is about analysis strategies for regression problems in which predictors are measured with error. These problems are commonly known as measurement error modeling or errors-in-variables. There is an enormous literature on this topic in linear regression, as summarized by Fuller (1987). Our interest lies almost exclusively in the analysis of nonlinear regression models, defined generally enough to include gener- alized linear models, transform-both-sides models, and quasilikelihood and variance function problems. The effects of measurement error are well known, and we basically assume that the reader understands that measurement error in predic- tors causes biases in estimated regression coefficients, and hence that the field is largely about correcting for such effects. Chapter 3∗ summarizes much of what is known about the consequences of measurement error for estimating linear regression parameters, although the material is not exhaustive. Nonlinear errors-in-variables modeling began in earnest in the early 1980s with the publication of a series of papers on diverse topics: Prentice (1982) on survival analysis; Carroll, Spiegelman, Lan, Bailey, and Abbott (1984) and Stefanski and Carroll (1985) on binary regression; Armstrong (1985) on generalized linear models; Amemiya (1985) on instrumental variables; and Stefanski (1985) on estimating equations. David Byar and Mitchell Gail organized a workshop on the topic in 1987 at the National Institutes of Health, which in 1989 was published as a special issue of Statistics in Medicine. Since these early papers, the field has grown dra- matically, as evidenced by the bibliography at the end of this book. Unlike the early 1980s, the literature is now so large that it is difficult to understand the main ideas from individual papers. Indeed, a first draft of this book, completed in late 1990, consisted only of the material in four of the first five chapters. Essentially all the rest of the material has been developed since 1990. In a field as rapidly evolving as this one, and with the entrance of many new researchers into the area, we can present but a snapshot of the current state of knowledge. This book can be divided broadly into four main parts: Chapters 1–2, ∗ Chapter numbers in this preface refer to the first edition, not the present edition. © 2006 by Taylor & Francis Group, LLC
  • 12. 3–6, 7–8, and 9–14. In addition, there is Appendix A, a review of relevant fitting methods and statistical models. The first part is introductory. Chapter 1 gives a number of applications where measurement error is of concern, and defines basic terminology of error structure, data sources and the distinction between functional and structural models. Chapter 2 gives an overview of the important ideas from linear regression, particularly the biases caused by measurement error and some estimation techniques. The second part gives the basic ideas and techniques of what we call functional modeling, where the distribution of the true predictor is not modeled parametrically. In addition, in these chapters it is as- sumed that the true predictor is never observable. The focus is on the additive measurement error model, although periodically we describe modifications for the multiplicative error model. Chapters 3 and 4 dis- cuss two broadly applicable functional methods, regression calibration and simulation-extrapolation (SIMEX), which can be thought of as the default approaches. Chapter 5 discusses a broadly based approach to the use of instrumental variables. All three of these chapters focus on estimators which are easily computed but yield only approximately con- sistent estimates. Chapter 6 is still based on the assumption that the true predictor is never observable, but here we provide functional techniques which are fully and not just approximately consistent. This material is somewhat more daunting in (algebraic) appearance than the approxi- mate techniques, but even so the methods themselves are often easily programmed. Throughout this part of the book, we use examples of binary regression modeling. The third part of the book concerns structural modeling, meaning that the distribution of the true predictor is parametrically modeled. Chapter 7 describes the likelihood approach to estimation and inference in mea- surement error models, while Chapter 8 briefly covers Bayesian model- ing. Here we become more focused on the distinction between functional and structural modeling, and also describe the measurement error prob- lem as a missing data problem. We also allow for the possibility that the true predictor can be measured in a subset of the study population. The discussion is fully general and applies to categorical data as well as to the additive and multiplicative measurement error models. While at this point the use of structural modeling in measurement error models is not very popular, we believe it will become more so in the very near future. The fourth part of the book is devoted to more specialized topics. Chapter 9 takes up the study of functional techniques which are ap- plicable when the predictor can be observed in a subset of the study. Chapter 10 discusses functional estimation in models with generalized © 2006 by Taylor & Francis Group, LLC
  • 13. linear structure and an unknown link function. Chapter 11 describes the effects that measurement error has on hypothesis testing. Nonparamet- ric regression and density function estimation are addressed in Chapter 12. Errors in the response rather than in predictors are described in Chapter 13. In Chapter 14, a variety of topics are addressed briefly: case-control studies, differential measurement error, functional mixture methods, design of two-stage studies and survival analysis. We have tried to design the text so that it can be read at two levels. Many readers will be interested only in the background material and in the definition of the specific methods that can be employed. These readers will find that the chapters in the middle two parts of the text (functional and structural modeling) begin with preliminary discussion, move into the definition of the methods, and are then followed by a worked numerical example. The end of the example serves as a flag that the material is about to become more detailed, with justifications of the methods, derivations of estimated standard errors, etc. Those readers who are not interested in such details should skip the material following the examples at first (and perhaps last) reading. It is our intention that the part of the book on functional models (Chapters 3–6) can be understood at an overview level without an ex- tensive background in theoretical statistics, at least through the numeri- cal examples. The structural modeling approach requires that one knows about likelihood and Bayesian methods, but with this exception the ma- terial is not particularly specialized. The fourth part of the book (Chap- ters 9–14) is more technical, and we suggest that those interested mainly in an overview simply read the first section of each of those chapters. A full appreciation of the text, especially its details, requires a strong background in likelihood methods, estimating equations and quasilikeli- hood and variance function models. For inference, we typically provide estimated standard errors, as well as suggest use of “the” bootstrap. These topics are all covered in Appendix A, albeit briefly. For more background on the models used in this monograph, we highly recom- mend reading Chapter 1 of Fuller (1987) for an introduction to linear measurement error models and the first four chapters of McCullagh and Nelder (1989) for further discussion of generalized linear models, includ- ing logistic regression. This is a book about general ideas and strategies of estimation and inference, not a book about a specific problem. Our interest in the field started with logistic regression, and many of our examples are based upon this problem. However, our philosophy is that measurement error occurs in many fields and in a variety of guises, and what is needed is an outline of strategies for handling progressively more difficult prob- lems. While logistic regression may well be the most important nonlinear © 2006 by Taylor & Francis Group, LLC
  • 14. measurement error model, the strategies here are applied to a hard-core nonlinear regression bioassay problem (Chapter 3), a changepoint prob- lem (Chapter 7), and a 2 × 2 table with misclassification (Chapter 8). Our hope is that the strategies will be sufficiently clear that they can be applied to new problems as they arise. We have tried to represent the main themes in the field, and to ref- erence as many research papers as possible. Obviously, as in any mono- graph, the selection of topics and material to be emphasized reflects our own interests. We apologize in advance to those workers whose work we have neglected to cite, or whose work should have been better advertised. Carroll’s research and the writing of this book were supported by grants from the National Cancer Institute (CA–57030 and CA–61067). After January 1, 1996, Splus and SAS computer programs (on SPARC architecture SunOS versions 4 and 5 and for Windows on PCs), which implement (for major generalized linear models) many of the functional methods described in this book, can be obtained by sending a message to qvf@stat.tamu.edu. The body of the text should contain only a valid return email address. This will generate an automatic reply with instruc- tions on how to get the software. Much of Stefanski’s research on measurement error problems has been supported by grants from the National Science Foundation (DMS–86136- 81 and DMS–9200915) and by funding from the Environmental Monitor- ing and Assessment Program, U.S. Environmental Protection Agency. We want to thank Jim Calvin, Bobby Gutierrez, Stephen Eckert, Joey Lin, C. Y. Wang, and Naisyin Wang for helpful general comments; Donna Spiegelman for a detailed reading of the manuscript; Jeff Buzas, John Cook, Tony Olsen, and Scott Overton for ideas and comments related to our research; and Viswanath Devanarayan for computing assistance and comments. Rob Abbott stimulated our initial interest in the field in 1981 with a question concerning the effects of measurement error in the Framingham Heart Study; this example appears throughout our discussion. Larry Freedman and Mitch Gail have commented on much of our work and have been instrumental in guiding us to interesting problems. Nancy Potischman introduced us to the world of nutritional epidemiology, where measurement error is of fundamental concern. Our friend Leon Gleser has been a source of support and inspiration for many years and has been a great influence on our thinking. This book uses data supplied by the National Heart, Lung, and Blood Institute, NIH, DHHS from the Framingham Heart Study. The views expressed in this paper are those of the authors and do not necessarily reflect the views of the National Heart, Lung, and Blood Institute or of the Framingham Study. © 2006 by Taylor & Francis Group, LLC
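The preface's statement that measurement error in predictors biases estimated regression coefficients can be seen in a few lines of simulation. The sketch below is illustrative only; it is not code from the book or from the Splus/SAS programs mentioned above. It assumes the classical additive error model W = X + U with normal errors and uses NumPy in Python; all variable names are chosen for the example.

    # Illustrative sketch: attenuation of a naive slope under classical
    # measurement error W = X + U (assumed setup; not code from the book).
    import numpy as np

    rng = np.random.default_rng(2006)
    n = 100_000
    beta0, betax = 1.0, 2.0
    sigma_x, sigma_u = 1.0, 1.0              # equal variances: reliability lambda = 0.5

    X = rng.normal(0.0, sigma_x, n)          # true but unobserved predictor
    W = X + rng.normal(0.0, sigma_u, n)      # observed, error-prone version
    Y = beta0 + betax * X + rng.normal(0.0, 1.0, n)

    slope_true = np.polyfit(X, Y, 1)[0]      # about 2.0
    slope_naive = np.polyfit(W, Y, 1)[0]     # about 1.0
    lam = sigma_x**2 / (sigma_x**2 + sigma_u**2)
    print(slope_true, slope_naive, betax * lam)

The naive regression of Y on W recovers roughly betax times lambda rather than betax, the attenuation effect whose consequences for linear regression are summarized in Chapter 3 (first-edition numbering) and which the correction methods in the rest of the book are designed to undo.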
  • 15. Preface to the Second Edition Since the first edition of Measurement Error in Nonlinear Models ap- peared in 1995, the field of measurement error and exposure uncertainty has undergone an explosion in research. Some of these areas are the following: • Bayesian computation via Markov Chain Monte Carlo techniques are now widely used in practice. The first edition had a short and not particularly satisfactory Chapter 9 on this topic. In this edition, we have greatly expanded the material and also the applications. Even if one is not a card-carrying Bayesian, Bayesian computation is a natural way to handle what we call the structural approach to measurement error modeling. • A new chapter has been added on longitudinal data and mixed models, areas that have seen tremendous growth since the first edition. • Semiparametric and nonparametric methods are enjoying increased application. The field of semiparametric and nonparametric regression (Ruppert, Wand, and Carroll, 2003) has become extremely important in the past 11 years, and in measurement error problems techniques are now much better established. We have revamped the old chap- ter on nonparametric regression and density estimation (Chapter 12) and added a new chapter (Chapter 13) to reflect the changes in the literature. • Methods for handling covariate measurement error in survival anal- ysis have been developing rapidly. The first edition had a section on survival analysis in the final chapter, “Other Topics.” This section has been greatly expanded and made into a separate Chapter 14. • The area of missing data has also expanded vigorously over the last 11 years, especially due to the work of Robins and his colleagues. This work and its connections with measurement error now needs a book-length treatment of its own. Therefore, with some reluctance, we decided to delete much of the old material on validation data as a missing data problem. • We have completely rewritten the score function chapter, both to keep up with advances in this area and and to make the exposition more transparent. © 2006 by Taylor & Francis Group, LLC
  • 16. The background material in Appendix A has been expanded to make the book somewhat more self-contained. Technical material that ap- peared as appendices to individual chapters in the first edition has now been collected into a new Appendix B. Carroll’s research has been supported since 1990 by a grant from the National Cancer Institute (CA57030). The work of Raymond Carroll partially occurred during multiple visits to Peter Hall at the Centre of Excellence for Mathematics and Statistics of Complex Systems at the Australian National University, whose support is gratefully acknowl- edged, along with the opportunity to take thousands of photos of kanga- roos (http://www.stat.tamu.edu/∼carroll/compressed kangaroo.jpg). David Ruppert was supported by the National Science Foundation (DMS 04-538) and the National Institutes of Health (CA57030). Leonard Ste- fanski also received support from the National Science Foundation and the National Institutes of Health. In this second edition, we especially acknowledge our colleagues with whom we have discussed measurement error problems and worked since 1995, including Scott Berry, Dennis Boos, John Buonaccorsi, Jeff Buzas, Josef Coresh, Marie Davidian, Eugene Demidenko, Laurence Freedman, Wayne Fuller, Mitchell Gail, Bobby Gutierrez, Peter Hall, Victor Kipnis, Liang Li, Xihong Lin, Jay Lubin, Yanyuna Ma, Doug Midthune, Sastry Pantula, Dan Schafer, John Staudenmayer, Sally Thurston, Tor Toste- son, Naisyin Wang, and Alan Welsh. Owen Hoffman introduced us to the problem of radiation dosimetry and the ideas of shared Berkson and classical uncertainties. We once again acknowledge Robert Abbott for introducing us to the problem in 1980, when he brought to Raymond Carroll a referee report demanding that he explain the impact of measurement error on the (logistic regression) Framingham data. We would love to acknowledge that anonymous referee for starting us along the path of measurement error in nonlinear models. We also thank Mitchell Gail, one of the world’s great biostatisticians, for his advice and friendship over the last 25 years. We are extremely grateful to Rick Rossi for a detailed reading of the manuscript, a reading that led to many changes in substance and ex- position. Rick is the only head of a Department of Mathematics and Statistics who is also a licensed trout-fishing guide. Finally, and with gratitude, we acknowledge our good friend Leon Gleser, who, to quote the first edition, has been a source of support and inspiration for many years and has been a great influence on our thinking. Our book Web site is http://www.stat.tamu.edu/∼carroll/eiv.SecondEdition. © 2006 by Taylor & Francis Group, LLC
  • 17. Guide to Notation. In this section we give brief explanations and representative examples of the notation used in this monograph. For precise definitions, see the text.
    Â_n, B̂_n: components of the sandwich formula
    α_0: intercept in model for E(X|Z, W)
    α_w: coefficient of W in model for E(X|Z, W)
    α_z: coefficient of Z in model for E(X|Z, W)
    β_0: intercept in a model for E(Y|X, Z)
    β_x: coefficient of X in model for E(Y|X, Z)
    β_z: coefficient of Z in model for E(Y|X, Z)
    β_{1ZX}: coefficient of 1 in generalized linear regression
    ∆: indicator of validation data, for example, where X is observed
    dim(β): dimension of the vector β
    f_X: density of X
    f_{Y,W,T|Z}: density of (Y, W, T) given Z
    F(·): unknown link function
    σ²g(Z, X, B, θ): var(Y|Z, X) in QVF model
    G: extrapolant function in SIMEX
    G_Q: quadratic extrapolant function
    G_RL: rational linear extrapolant function
    γ_{0,cm}: intercept in a regression calibration model
    γ^t_{z,cm}: coefficient of Z in a regression calibration model
    γ^t_{w,cm}: coefficient of W in a regression calibration model
    γ_{0,em}: intercept in an error model
    γ^t_{x,em}: coefficient of X in an error model
    γ^t_{w,em}: coefficient of W in an error model
    H(v): (1 + exp(−v))^(−1), for example, the logistic function
    h: bandwidth in nonparametric regression or density estimation
    I_n(Θ): Fisher information
  • 18. k: with equal replication, the number of replicates for all subjects
    k_i: number of replicates of the ith subject
    K(·): kernel used in nonparametric regression or density estimation
    κ_cm: σ²_cm/σ²
    Λ(·): likelihood ratio
    L(·): generalized score function
    m_X(Z, W, γ_cm): E(X|Z, W)
    m_Y(Z, X, β): E(Y|Z, X) in QVF (quasilikelihood variance function) model
    m_{Y,x}(z, x, β): (∂/∂x) m_Y(z, x, β)
    m_{Y,xx}(z, x, β): (∂²/∂x²) m_Y(z, x, β)
    π(Y, Z, W, α): probability of selection into a validation study
    Ψ, ψ: estimating functions
    S: Y measured with error (S = Y + V)
    s_i(y|Θ): score function
    σ²_u: variance of U
    σ²_{X|Z}: conditional variance of X given Z
    σ_xy: the covariance between random variables X and Y
    ρ_xy: the correlation between X and Y, defined as σ_xy/(σ_x σ_y)
    Σ_{ZX}: covariance matrix between the random vectors Z and X
    T: observation related to X
    Θ_b(λ): simulated estimator used in SIMEX
    Θ(λ): average of the Θ_b(λ)s
    U: observation error in an error model
    U_{b,k}: pseudo-error in SIMEX
    V: measurement error in the response
    W: observation related to X
    X: covariates measured with error
    Y: response
    Y_{i·}: average of Y_{ij} over j
    [Ỹ|Z̃, X̃, B]: density of Ỹ given (Z̃, X̃, B) (Bayesian notation)
    Z: covariates measured without error
    ζ: parameter controlling the amount of simulated extra measurement error in SIMEX
    If m(x) is any function, then m'(x) and m''(x) are its first and second derivatives, and m^(m) is its mth derivative for m > 2. For a vector or matrix A, A^t is its transpose, and if A is an invertible matrix, then A^(-1) is its inverse. If a = (a_1, ..., a_n) is a vector, then ||a|| is its Euclidean norm, that is, ||a|| = (Σ_{i=1}^n a_i^2)^(1/2).
  • 19. If X and Y are random variables, then [X] is the distribution of X and [X|Y] is the conditional distribution of X given Y. This notation is becoming standard in the Bayesian literature.
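As a purely illustrative supplement to the guide above (not part of the book or its software), the short Python sketch below computes three of the quantities defined there: the logistic function H(v), the Euclidean norm ||a||, and the correlation ρ_xy. The function names and the use of NumPy are choices made only for this example.

    # Illustrative numerical versions of a few notation-guide quantities.
    import numpy as np

    def H(v):
        # Logistic function H(v) = (1 + exp(-v))^(-1)
        return 1.0 / (1.0 + np.exp(-v))

    def norm(a):
        # Euclidean norm ||a|| = (sum_i a_i^2)^(1/2)
        a = np.asarray(a, dtype=float)
        return np.sqrt(np.sum(a**2))

    def rho(x, y):
        # Sample analogue of rho_xy = sigma_xy / (sigma_x * sigma_y)
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        sigma_xy = np.cov(x, y)[0, 1]
        return sigma_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

    print(H(0.0))              # 0.5
    print(norm([3.0, 4.0]))    # 5.0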
  • 20. Contents 1 INTRODUCTION 1 1.1 The Double/Triple Whammy of Measurement Error 1 1.2 Classical Measurement Error: A Nutrition Example 2 1.3 Measurement Error Examples 3 1.4 Radiation Epidemiology and Berkson Errors 4 1.4.1 The Difference Between Berkson and Classical Errors: How to Gain More Power Without Really Trying 5 1.5 Classical Measurement Error Model Extensions 7 1.6 Other Examples of Measurement Error Models 9 1.6.1 NHANES 9 1.6.2 Nurses’ Health Study 10 1.6.3 The Atherosclerosis Risk in Communities Study 11 1.6.4 Bioassay in a Herbicide Study 11 1.6.5 Lung Function in Children 12 1.6.6 Coronary Heart Disease and Blood Pressure 12 1.6.7 A-Bomb Survivors Data 13 1.6.8 Blood Pressure and Urinary Sodium Chloride 13 1.6.9 Multiplicative Error for Confidentiality 14 1.6.10 Cervical Cancer and Herpes Simplex Virus 14 1.7 Checking the Classical Error Model 14 1.8 Loss of Power 18 1.8.1 Linear Regression Example 18 1.8.2 Radiation Epidemiology Example 20 1.9 A Brief Tour 23 Bibliographic Notes 23 2 IMPORTANT CONCEPTS 25 2.1 Functional and Structural Models 25 2.2 Models for Measurement Error 26 2.2.1 General Approaches: Berkson and Classical Models 26 2.2.2 Is It Berkson or Classical? 27 2.2.3 Berkson Models from Classical 28 2.2.4 Transportability of Models 29 xix © 2006 by Taylor & Francis Group, LLC
  • 21. 2.2.5 Potential Dangers of Transporting Models 30 2.2.6 Semicontinuous Variables 32 2.2.7 Misclassification of a Discrete Covariate 32 2.3 Sources of Data 32 2.4 Is There an “Exact” Predictor? What Is Truth? 33 2.5 Differential and Nondifferential Error 36 2.6 Prediction 38 Bibliographic Notes 39 3 LINEAR REGRESSION AND ATTENUATION 41 3.1 Introduction 41 3.2 Bias Caused by Measurement Error 41 3.2.1 Simple Linear Regression with Additive Error 42 3.2.2 Regression Calibration: Classical Error as Berkson Error 44 3.2.3 Simple Linear Regression with Berkson Error 45 3.2.4 Simple Linear Regression, More Complex Error Structure 46 3.2.5 Summary of Simple Linear Regression 49 3.3 Multiple and Orthogonal Regression 52 3.3.1 Multiple Regression: Single Covariate Measured with Error 52 3.3.2 Multiple Covariates Measured with Error 53 3.4 Correcting for Bias 55 3.4.1 Method of Moments 55 3.4.2 Orthogonal Regression 57 3.5 Bias Versus Variance 60 3.5.1 Theoretical Bias–Variance Tradeoff Calculations 61 3.6 Attenuation in General Problems 63 Bibliographic Notes 64 4 REGRESSION CALIBRATION 65 4.1 Overview 65 4.2 The Regression Calibration Algorithm 66 4.3 NHANES Example 66 4.4 Estimating the Calibration Function Parameters 70 4.4.1 Overview and First Methods 70 4.4.2 Best Linear Approximations Using Replicate Data 70 4.4.3 Alternatives When Using Partial Replicates 72 4.4.4 James–Stein Calibration 72 4.5 Multiplicative Measurement Error 72 4.5.1 Should Predictors Be Transformed? 73 4.5.2 Lognormal X and U 74 xx © 2006 by Taylor & Francis Group, LLC
  • 22. 4.5.3 Linear Regression 77 4.5.4 Additive and Multiplicative Error 78 4.6 Standard Errors 79 4.7 Expanded Regression Calibration Models 79 4.7.1 The Expanded Approximation Defined 81 4.7.2 Implementation 83 4.7.3 Bioassay Data 85 4.8 Examples of the Approximations 90 4.8.1 Linear Regression 90 4.8.2 Logistic Regression 90 4.8.3 Loglinear Mean Models 93 4.9 Theoretical Examples 94 4.9.1 Homoscedastic Regression 94 4.9.2 Quadratic Regression with Homoscedastic Regres- sion Calibration 94 4.9.3 Loglinear Mean Model 95 Bibliographic Notes and Software 95 5 SIMULATION EXTRAPOLATION 97 5.1 Overview 97 5.2 Simulation Extrapolation Heuristics 98 5.2.1 SIMEX in Simple Linear Regression 98 5.3 The SIMEX Algorithm 100 5.3.1 Simulation and Extrapolation Steps 100 5.3.2 Extrapolant Function Considerations 108 5.3.3 SIMEX Standard Errors 110 5.3.4 Extensions and Refinements 111 5.3.5 Multiple Covariates with Measurement Error 112 5.4 Applications 112 5.4.1 Framingham Heart Study 112 5.4.2 Single Covariate Measured with Error 113 5.4.3 Multiple Covariates Measured with Error 118 5.5 SIMEX in Some Important Special Cases 120 5.5.1 Multiple Linear Regression 120 5.5.2 Loglinear Mean Models 122 5.5.3 Quadratic Mean Models 122 5.6 Extensions and Related Methods 123 5.6.1 Mixture of Berkson and Classical Error 123 5.6.2 Misclassification SIMEX 125 5.6.3 Checking Structural Model Robustness via Re- measurement 126 Bibliographic Notes 128 xxi © 2006 by Taylor & Francis Group, LLC
  • 23. 6 INSTRUMENTAL VARIABLES 129 6.1 Overview 129 6.1.1 A Note on Notation 130 6.2 Instrumental Variables in Linear Models 131 6.2.1 Instrumental Variables via Differentiation 131 6.2.2 Simple Linear Regression with One Instrument 132 6.2.3 Linear Regression with Multiple Instruments 134 6.3 Approximate Instrumental Variable Estimation 137 6.3.1 IV Assumptions 137 6.3.2 Mean and Variance Function Models 138 6.3.3 First Regression Calibration IV Algorithm 139 6.3.4 Second Regression Calibration IV Algorithm 140 6.4 Adjusted Score Method 140 6.5 Examples 143 6.5.1 Framingham Data 143 6.5.2 Simulated Data 145 6.6 Other Methodologies 145 6.6.1 Hybrid Classical and Regression Calibration 145 6.6.2 Error Model Approaches 147 Bibliographic Notes 148 7 SCORE FUNCTION METHODS 151 7.1 Overview 151 7.2 Linear and Logistic Regression 152 7.2.1 Linear Regression Corrected and Conditional Scores 152 7.2.2 Logistic Regression Corrected and Conditional Scores 157 7.2.3 Framingham Data Example 159 7.3 Conditional Score Functions 162 7.3.1 Conditional Score Basic Theory 162 7.3.2 Conditional Scores for Basic Models 164 7.3.3 Conditional Scores for More Complicated Models 166 7.4 Corrected Score Functions 169 7.4.1 Corrected Score Basic Theory 170 7.4.2 Monte Carlo Corrected Scores 170 7.4.3 Some Exact Corrected Scores 172 7.4.4 SIMEX Connection 173 7.4.5 Corrected Scores with Replicate Measurements 173 7.5 Computation and Asymptotic Approximations 174 7.5.1 Known Measurement Error Variance 175 7.5.2 Estimated Measurement Error Variance 176 7.6 Comparison of Conditional and Corrected Scores 177 xxii © 2006 by Taylor & Francis Group, LLC
7.7 Bibliographic Notes 178
Bibliographic Notes 178

8 LIKELIHOOD AND QUASILIKELIHOOD 181
  8.1 Introduction 181
    8.1.1 Step 1: The Likelihood If X Were Observable 183
    8.1.2 A General Concern: Identifiable Models 184
  8.2 Steps 2 and 3: Constructing Likelihoods 184
    8.2.1 The Discrete Case 185
    8.2.2 Likelihood Construction for General Error Models 186
    8.2.3 The Berkson Model 188
    8.2.4 Error Model Choice 189
  8.3 Step 4: Numerical Computation of Likelihoods 190
  8.4 Cervical Cancer and Herpes 190
  8.5 Framingham Data 192
  8.6 Nevada Test Site Reanalysis 193
    8.6.1 Regression Calibration Implementation 195
    8.6.2 Maximum Likelihood Implementation 196
  8.7 Bronchitis Example 197
    8.7.1 Calculating the Likelihood 198
    8.7.2 Effects of Measurement Error on Threshold Models 199
    8.7.3 Simulation Study and Maximum Likelihood 199
    8.7.4 Berkson Analysis of the Data 201
  8.8 Quasilikelihood and Variance Function Models 201
    8.8.1 Details of Step 3 for QVF Models 202
    8.8.2 Details of Step 4 for QVF Models 203
  Bibliographic Notes 203

9 BAYESIAN METHODS 205
  9.1 Overview 205
    9.1.1 Problem Formulation 205
    9.1.2 Posterior Inference 207
    9.1.3 Bayesian Functional and Structural Models 208
    9.1.4 Modularity of Bayesian MCMC 209
  9.2 The Gibbs Sampler 209
  9.3 Metropolis–Hastings Algorithm 211
  9.4 Linear Regression 213
    9.4.1 Example 216
  9.5 Nonlinear Models 219
    9.5.1 A General Model 219
    9.5.2 Polynomial Regression 220
    9.5.3 Multiplicative Error 221
    9.5.4 Segmented Regression 222
  9.6 Logistic Regression 223
  9.7 Berkson Errors 225
    9.7.1 Nonlinear Regression with Berkson Errors 225
    9.7.2 Logistic Regression with Berkson Errors 227
    9.7.3 Bronchitis Data 228
  9.8 Automatic Implementation 230
    9.8.1 Implementation and Simulations in WinBUGS 231
    9.8.2 More Complex Models 234
  9.9 Cervical Cancer and Herpes 235
  9.10 Framingham Data 237
  9.11 OPEN Data: A Variance Components Model 238
  Bibliographic Notes 240

10 HYPOTHESIS TESTING 243
  10.1 Overview 243
    10.1.1 Simple Linear Regression, Normally Distributed X 243
    10.1.2 Analysis of Covariance 246
    10.1.3 General Considerations: What Is a Valid Test? 248
    10.1.4 Summary of Major Results 248
  10.2 The Regression Calibration Approximation 249
    10.2.1 Testing H0: βx = 0 250
    10.2.2 Testing H0: βz = 0 250
    10.2.3 Testing H0: (βxᵀ, βzᵀ)ᵀ = 0 250
  10.3 Illustration: OPEN Data 251
  10.4 Hypotheses about Subvectors of βx and βz 251
    10.4.1 Illustration: Framingham Data 252
  10.5 Efficient Score Tests of H0: βx = 0 253
    10.5.1 Generalized Score Tests 254
  Bibliographic Notes 257

11 LONGITUDINAL DATA AND MIXED MODELS 259
  11.1 Mixed Models for Longitudinal Data 259
    11.1.1 Simple Linear Mixed Models 259
    11.1.2 The General Linear Mixed Model 260
    11.1.3 The Linear Logistic Mixed Model 261
    11.1.4 The Generalized Linear Mixed Model 261
  11.2 Mixed Measurement Error Models 262
    11.2.1 The Variance Components Model Revisited 262
    11.2.2 General Considerations 263
    11.2.3 Some Simple Examples 263
    11.2.4 Models for Within-Subject X-Correlation 265
  11.3 A Bias-Corrected Estimator 265
  11.4 SIMEX for GLMMEMs 267
  11.5 Regression Calibration for GLMMs 267
  11.6 Maximum Likelihood Estimation 268
  11.7 Joint Modeling 268
  11.8 Other Models and Applications 269
    11.8.1 Models with Random Effects Multiplied by X 269
    11.8.2 Models with Random Effects Depending Nonlinearly on X 270
    11.8.3 Inducing a True-Data Model from a Standard Observed Data Model 270
    11.8.4 Autoregressive Models in Longitudinal Data 271
  11.9 Example: The CHOICE Study 272
    11.9.1 Basic Model 273
    11.9.2 Naive Replication and Sensitivity 273
    11.9.3 Accounting for Biological Variability 274
  Bibliographic Notes 276

12 NONPARAMETRIC ESTIMATION 279
  12.1 Deconvolution 279
    12.1.1 The Problem 279
    12.1.2 Fourier Inversion 280
    12.1.3 Methodology 280
    12.1.4 Properties of Deconvolution Methods 281
    12.1.5 Is It Possible to Estimate the Bandwidth? 282
    12.1.6 Parametric Deconvolution 284
    12.1.7 Estimating Distribution Functions 287
    12.1.8 Optimal Score Tests 288
    12.1.9 Framingham Data 289
    12.1.10 NHANES Data 290
    12.1.11 Bayesian Density Estimation by Normal Mixtures 291
  12.2 Nonparametric Regression 293
    12.2.1 Local-Polynomial, Kernel-Weighted Regression 293
    12.2.2 Splines 294
    12.2.3 QVF and Likelihood Models 295
    12.2.4 SIMEX for Nonparametric Regression 296
    12.2.5 Regression Calibration 297
    12.2.6 Structural Splines 297
    12.2.7 Taylex and Other Methods 298
  12.3 Baseline Change Example 299
    12.3.1 Discussion of the Baseline Change Controls Data 301
  Bibliographic Notes 302

13 SEMIPARAMETRIC REGRESSION 303
  13.1 Overview 303
  13.2 Additive Models 303
  13.3 MCMC for Additive Spline Models 304
  13.4 Monte Carlo EM-Algorithm 305
    13.4.1 Starting Values 306
    13.4.2 Metropolis–Hastings Fact 306
    13.4.3 The Algorithm 306
  13.5 Simulation with Classical Errors 309
  13.6 Simulation with Berkson Errors 311
  13.7 Semiparametrics: X Modeled Parametrically 312
  13.8 Parametric Models: No Assumptions on X 314
    13.8.1 Deconvolution Methods 314
    13.8.2 Models Linear in Functions of X 315
    13.8.3 Linear Logistic Regression with Replicates 316
    13.8.4 Doubly Robust Parametric Modeling 317
  Bibliographic Notes 318

14 SURVIVAL DATA 319
  14.1 Notation and Assumptions 319
  14.2 Induced Hazard Function 320
  14.3 Regression Calibration for Survival Analysis 321
    14.3.1 Methodology and Asymptotic Properties 321
    14.3.2 Risk Set Calibration 322
  14.4 SIMEX for Survival Analysis 323
  14.5 Chronic Kidney Disease Progression 324
    14.5.1 Regression Calibration for CKD Progression 325
    14.5.2 SIMEX for CKD Progression 326
  14.6 Semi and Nonparametric Methods 329
    14.6.1 Nonparametric Estimation with Validation Data 330
    14.6.2 Nonparametric Estimation with Replicated Data 332
    14.6.3 Likelihood Estimation 333
  14.7 Likelihood Inference for Frailty Models 336
  Bibliographic Notes 337

15 RESPONSE VARIABLE ERROR 339
  15.1 Response Error and Linear Regression 339
  15.2 Other Forms of Additive Response Error 343
    15.2.1 Biased Responses 343
    15.2.2 Response Error in Heteroscedastic Regression 344
  15.3 Logistic Regression with Response Error 345
    15.3.1 The Impact of Response Misclassification 345
    15.3.2 Correcting for Response Misclassification 347
  15.4 Likelihood Methods 353
    15.4.1 General Likelihood Theory and Surrogates 353
    15.4.2 Validation Data 354
  15.5 Use of Complete Data Only 355
    15.5.1 Likelihood of the Validation Data 355
    15.5.2 Other Methods 356
  15.6 Semiparametric Methods for Validation Data 356
    15.6.1 Simple Random Sampling 356
    15.6.2 Other Types of Sampling 357
  Bibliographic Notes 358

A BACKGROUND MATERIAL 359
  A.1 Overview 359
  A.2 Normal and Lognormal Distributions 359
  A.3 Gamma and Inverse-Gamma Distributions 360
  A.4 Best and Best Linear Prediction and Regression 361
    A.4.1 Linear Prediction 361
    A.4.2 Best Linear Prediction without an Intercept 363
    A.4.3 Nonlinear Prediction 363
  A.5 Likelihood Methods 364
    A.5.1 Notation 364
    A.5.2 Maximum Likelihood Estimation 364
    A.5.3 Likelihood Ratio Tests 365
    A.5.4 Profile Likelihood and Likelihood Ratio Confidence Intervals 365
    A.5.5 Efficient Score Tests 366
  A.6 Unbiased Estimating Equations 367
    A.6.1 Introduction and Basic Large Sample Theory 367
    A.6.2 Sandwich Formula Example: Linear Regression without Measurement Error 369
    A.6.3 Sandwich Method and Likelihood-Type Inference 370
    A.6.4 Unbiased, but Conditionally Biased, Estimating Equations 372
    A.6.5 Biased Estimating Equations 372
    A.6.6 Stacking Estimating Equations: Using Prior Estimates of Some Parameters 372
  A.7 Quasilikelihood and Variance Function Models (QVF) 374
    A.7.1 General Ideas 374
    A.7.2 Estimation and Inference for QVF Models 375
  A.8 Generalized Linear Models 377
  A.9 Bootstrap Methods 377
    A.9.1 Introduction 377
    A.9.2 Nonlinear Regression without Measurement Error 378
    A.9.3 Bootstrapping Heteroscedastic Regression Models 380
    A.9.4 Bootstrapping Logistic Regression Models 380
    A.9.5 Bootstrapping Measurement Error Models 381
    A.9.6 Bootstrap Confidence Intervals 382

B TECHNICAL DETAILS 385
  B.1 Appendix to Chapter 1: Power in Berkson and Classical Error Models 385
  B.2 Appendix to Chapter 3: Linear Regression and Attenuation 386
  B.3 Appendix to Chapter 4: Regression Calibration 387
    B.3.1 Standard Errors and Replication 387
    B.3.2 Quadratic Regression: Details of the Expanded Calibration Model 391
    B.3.3 Heuristics and Accuracy of the Approximations 391
  B.4 Appendix to Chapter 5: SIMEX 392
    B.4.1 Simulation Extrapolation Variance Estimation 393
    B.4.2 Estimating Equation Approach to Variance Estimation 395
  B.5 Appendix to Chapter 6: Instrumental Variables 399
    B.5.1 Derivation of the Estimators 399
    B.5.2 Asymptotic Distribution Approximations 401
  B.6 Appendix to Chapter 7: Score Function Methods 406
    B.6.1 Technical Complements to Conditional Score Theory 406
    B.6.2 Technical Complements to Distribution Theory for Estimated Σuu 406
  B.7 Appendix to Chapter 8: Likelihood and Quasilikelihood 407
    B.7.1 Monte Carlo Computation of Integrals 407
    B.7.2 Linear, Probit, and Logistic Regression 408
  B.8 Appendix to Chapter 9: Bayesian Methods 409
    B.8.1 Code for Section 9.8.1 409
    B.8.2 Code for Section 9.11 410

References 413
CHAPTER 1
INTRODUCTION

1.1 The Double/Triple Whammy of Measurement Error

Measurement error in covariates has three effects:
• It causes bias in parameter estimation for statistical models.
• It leads to a loss of power, sometimes profound, for detecting interesting relationships among variables.
• It masks the features of the data, making graphical model analysis difficult.

We call the first two the double whammy of measurement error. Most of the statistical methods described in this book are aimed at the first problem, namely, to correct for biases of estimation caused by measurement error. Later in this chapter, we will describe an example from radiation dosimetry and the profound loss of power for detecting risks that occurs with uncertainties in individual doses. Here, we briefly describe the third issue, the masking of features.

Consider a regression of a response Y on a predictor X, uniformly distributed on the interval [−2, 2]. Suppose that the mean is sin(2X) and the variance σ²ε = 0.10. In the top panel of Figure 1.1, we plot 200 simulated observations from such a model that indicate quite clearly the sinusoidal aspect of the regression function. However, suppose that instead of observing X, we observe W, normally distributed with mean X but with variance 4/9. As we will later describe in Section 3.2.1, this is an attenuation coefficient of 0.75. Thus, what we observe is not X, but an unbiased estimate of it, W. In the bottom panel of Figure 1.1, we plot the observed data Y versus W. Note that the sinusoid is no longer evident and the main feature of the data has been hidden. It is also worth noting that the variability about the sinusoid is far smaller when X is observed than the variability about any curve one could reasonably guess at when only W is observed. This is one substantial cause of the loss of power. Finally, if one only observes (Y, W) and hence the bottom panel of Figure 1.1, it would be essentially impossible to reconstruct the sinusoid, and something different would certainly be used. This is the bias caused by measurement error.

[Figure 1.1: Illustration of the bias, loss of power, and masking of features caused by measurement error in predictors. Top panel: regression on the true covariate. Bottom panel: regression on the observed covariate.]
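The masking effect in this example is easy to reproduce. The following is a small simulation sketch (ours, not code from the book), assuming Python with numpy; the attenuation value 0.75 quoted above is var(X)/{var(X) + var(U)} = (4/3)/(4/3 + 4/9).

import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(-2.0, 2.0, n)                              # true covariate, var(X) = 4/3
y = np.sin(2.0 * x) + rng.normal(0.0, np.sqrt(0.10), n)    # mean sin(2X), error variance 0.10
w = x + rng.normal(0.0, np.sqrt(4.0 / 9.0), n)             # classical error, var(U) = 4/9

# attenuation coefficient: var(X) / (var(X) + var(U)) = 0.75
print("attenuation:", (4.0 / 3.0) / (4.0 / 3.0 + 4.0 / 9.0))

# the sinusoidal feature is strongly related to Y through X, but much less so through W
print("corr(Y, sin(2X)):", np.corrcoef(y, np.sin(2.0 * x))[0, 1])
print("corr(Y, sin(2W)):", np.corrcoef(y, np.sin(2.0 * w))[0, 1])

Plotting y against x and then against w reproduces the two panels of Figure 1.1.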
1.2 Classical Measurement Error: A Nutrition Example

Much of the measurement error literature is based around what is called classical measurement error, in which the truth is measured with additive error, usually with constant variance. We introduce the classical measurement error model via an example from nutrition.

In the National Cancer Institute's OPEN study, see Subar, Thompson, Kipnis, et al. (2001), one interest is in measuring the logarithm of dietary protein intake. True, long-term log-intake is denoted by X, but this cannot be observed in practice. Instead, the investigators measured a biomarker of log-protein intake, namely urinary nitrogen, denoted by W. In this study, 297 subjects had replicated urinary nitrogen measurements. If there were no measurement error, then of course the two biomarker measurements would be equal, but then, since this is a book about measurement error, we would not be wasting space. Indeed, in Figure 1.2 we see that when we plot the second biomarker versus the first, the correlation is relatively high (0.695), but there clearly is some variability in the measurements.

In this context, there is evidence from feeding studies that the protein biomarker captures true protein intake with added variability. Such situations are often called classical measurement error. In symbols, let Xi be the true log-protein intake for individual i, and let Wij be the jth biomarker log-protein measurement.
[Figure 1.2: OPEN Study data, scatterplot of the logarithm of the first and second protein biomarker measurements (log scale; correlation = 0.695, attenuation = 0.694). The fact that there is scatter means that the biomarker has measurement error.]

Then the classical measurement error model states that

    Wij = Xi + Uij.    (1.1)

In this model, Wij is an unbiased measure of Xi, so that Uij must have mean zero, that is, in symbols, E(Uij | Xi) = 0. The error structure of Uij could be homoscedastic (constant variance) or heteroscedastic. In this particular example, we will show later, in Section 1.7, that the measurement error structure is approximately normal with constant variance, so we can reasonably think that Uij | Xi ∼ Normal(0, σ²u).
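With replicate measurements, model (1.1) suggests simple moment estimates of the error variance, since the within-person differences Wi1 − Wi2 have variance 2σ²u. Below is a minimal sketch using simulated data, with hypothetical values standing in for the OPEN biomarkers; it is not the analysis from the book.

import numpy as np

rng = np.random.default_rng(2)
n = 297                                          # e.g., the OPEN protein substudy size
x = rng.normal(5.7, 0.25, n)                     # hypothetical true log-protein intakes
w = x[:, None] + rng.normal(0.0, 0.20, (n, 2))   # two replicate biomarkers per subject

# Under (1.1), var(Wi1 - Wi2) = 2 * sigma_u^2.
sigma2_u = np.var(w[:, 0] - w[:, 1], ddof=1) / 2.0
# A single measurement has variance sigma_x^2 + sigma_u^2.
sigma2_x = np.var(w, ddof=1) - sigma2_u

print("estimated error variance:", sigma2_u)             # close to 0.20**2 = 0.04 here
print("estimated variance of true X:", sigma2_x)
print("replicate reliability:", sigma2_x / (sigma2_x + sigma2_u))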
1.3 Measurement Error Examples

Nonlinear measurement error models commonly begin with an underlying nonlinear model for the response Y in terms of the predictors. We distinguish between two kinds of predictors: Z represents those predictors that, for all practical purposes, are measured without error, and X those that cannot be observed exactly for all study subjects. The distinguishing feature of a measurement error problem is that we can observe a variable W, which is related to an unobservable X. The parameters in the model relating Y and (Z, X) cannot, of course, be estimated directly by fitting Y to (Z, X), since X is not observed. The goal of measurement error modeling is to obtain nearly unbiased estimates of these parameters indirectly by fitting a model for Y in terms of (Z, W). Attainment of this goal requires careful analysis. Substituting W for X, but making no adjustments in the usual fitting methods for this substitution, leads to estimates that are biased, sometimes seriously; see Figure 1.1. The problem here is that the parameters of the regression of Y on (Z, W) are different from those of Y on (Z, X).

In assessing measurement error, careful attention must be given to the type and nature of the error, and the sources of data that allow modeling of this error. The following examples illustrate some of the different types of problems considered in this book.

1.4 Radiation Epidemiology and Berkson Errors

There are many studies relating radiation exposure to disease, including the Nevada Test Site (NTS) Thyroid Disease Study and the Hanford Thyroid Disease Study (HTDS). Full disclosure: One of us (RJC) was involved in litigation concerning HTDS, and his expert report is available at http://www.downwinders.com/files/htds_expert_report.pdf, the plaintiffs' Web site, at least as of May 2005.

Stevens, Till, Thomas, et al. (1992); Kerber, Till, Simon, et al. (1993); and Simon, Till, Lloyd, et al. (1995) described the Nevada test site study, where radiation exposure largely came as the result of above-ground nuclear testing in the 1950s. Similar statistical issues arise in the Hanford Thyroid Disease Study: see Davis, Kopecky, Stram, et al. (2002); Stram and Kopecky (2003); and Kopecky, Davis, Hamilton, et al. (2004), where radiation was released in the 1950s and 1960s.

In the Nevada study, over 2,000 individuals who were exposed to radiation as children were examined for thyroid disease. The primary radiation exposure came from milk and vegetables. The idea of the study was to relate various thyroid disease outcomes to radiation exposure to the thyroid. Of course, once again, since this is a book about measurement error, the main exposure of interest, radiation to the thyroid, cannot be observed exactly. What is typical in these studies is to build a large dosimetry model that attempts to convert the known data about the above-ground nuclear tests to radiation actually absorbed into the thyroid. Dosimetry calculations in NTS were based on age at exposure, gender, residence history, x-ray history, whether the individual was breast-fed as a child, and a diet questionnaire filled out by the parent, focusing on milk consumption and vegetables. The data were then input into a complex model and, for each individual, the point estimate of thyroid dose and an associated standard error for the measurement error were reported.
Roughly similar considerations led to the dose estimates and uncertainties in HTDS. In both NTS and HTDS, the authors consider analyses taking into account the uncertainties (measurement error) in dose estimates. Indeed, both consider the classical measurement error situation in (1.1). The HTDS study, though, also considered a different type of measurement error, and based most of their power calculations on it. We will go into detail on the power and analysis issues; see Section 1.8.2 of this chapter for power and Section 8.6 for the analysis.

What we see in the classical measurement error model (1.1) is that the observed dose equals the true dose plus (classical) measurement error. This, of course, means that the variability of the observed doses will be greater than the variability of true doses. In HTDS, in contrast, the authors not only consider this classical measurement error, but they also turn the issue around; namely, they assumed that the true dose is equal to the estimated dose plus measurement error. In symbols, this is

    Xi = Wi + Ui,    (1.2)

where E(Ui | Wi) = 0, so that the true dose has more variability than the estimated dose; contrast with (1.1). Model (1.2) is called a Berkson measurement error model; see Berkson (1950).

1.4.1 The Difference Between Berkson and Classical Errors: How to Gain More Power Without Really Trying

Measurement error modeling requires considerable care. In this section, we discuss why it is crucial that one understands the seemingly subtle differences between Berkson and classical errors, and we illustrate some possible pitfalls when choosing between the two error models. As far as we are aware, one cannot be put in jail for using the wrong model, but an incorrect measurement error model often causes erroneous inferences, which to a statistician is worse than going to jail (okay, we have exaggerated). In Section 2.2.2 we provide additional guidance so that the reader can be confident of choosing the correct error model in his/her own work.

The difference between Berkson and classical measurement error is major when one is planning a study a priori, especially when one is attempting power calculations. There are some technical similarities between classical and Berkson errors, see Section 3.2.2, but different issues arise in power calculations. What we will indicate here is that for a given measurement error variance, if you want to convince yourself that you have lots of statistical power despite measurement error, just pretend that the measurement error is Berkson and not classical.
Suppose that the observed data have a normal distribution with mean zero and variance σ²w = 2.0. Suppose also that the measurement error has variance σ²u = 1.0. Then, if one assumes a Berkson model, the true doses have mean zero and variance σ²x = 3.0. This is so because the variance of X in (1.2) is the sum of the variance of W (σ²w = 2.0) and the variance of the Berkson measurement error U (σ²u = 1.0). Now, in major contrast, if one assumes that the measurement error is classical instead of Berkson, then the variance of X is, from (1.1), the difference of the variance of W (2.0) and the variance of the classical measurement error U (1.0), that is, 1.0. In other words, if we assume Berkson error, we think that the true dose X has variance 3.0, while if we assume classical measurement error, we think that the variance of the true dose equals 1.0, a feature reflected in Figure 1.3. Now, for a given set of parameter values of risk, it is generally the case that the power increases when the variance of true exposure X increases. Hence, assuming Berkson when the error is classical leads to a grossly optimistic overstatement of power.

[Figure 1.3: A hypothetical example where the observed doses W have mean zero and variance 2.0, while the measurement errors have mean zero and variance 1.0. Displayed are the distributions of true dose that you think you have if you think that the errors are Berkson (top) or if you think the errors are classical (bottom). The much smaller variability of true dose under the classical model indicates that the power for detecting effects will be much smaller than if the errors are Berkson.]

Further discussion of differences and similarities between power in classical and Berkson error models can be found in Section B.1.
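The variance bookkeeping in this hypothetical example can be written out directly; the only issue is which side of the equation the error variance lands on. A minimal sketch:

var_w, var_u = 2.0, 1.0

# Berkson error (1.2): X = W + U with U independent of W,
# so var(X) = var(W) + var(U).
var_x_berkson = var_w + var_u        # 3.0

# Classical error (1.1): W = X + U with U independent of X,
# so var(X) = var(W) - var(U).
var_x_classical = var_w - var_u      # 1.0

print(var_x_berkson, var_x_classical)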
1.5 Classical Measurement Error Model Extensions

It almost goes without saying, but we will say it, that measurement error models can be more complex than the classical additive measurement error model (1.1) or the classical Berkson error model (1.2). Here we illustrate some of the complexities of measurement error modeling via an important nutrition biomarker study.

The study of diet and disease has been a major motivation for nonlinear measurement error modeling. In these studies, it is typical to measure diet via a self-report instrument, for example, a food frequency questionnaire (FFQ), some sort of diary, or a 24-hour recall interview. It has been appreciated for decades that these self-report instruments are only imperfect measures of long-term dietary intakes, and hence that measurement error is a major concern.

To understand the profound nature of measurement error in this context, we consider the National Cancer Institute's OPEN study, which is one of the largest biomarker studies ever done; see Subar, Kipnis, Troiano, et al. (2003) and Kipnis, Midthune, Freedman, et al. (2003). We illustrate this measurement error with energy (caloric) intake measures.
In the OPEN Study, energy intake was measured by the dietary history questionnaire, an FFQ described in Subar, Thompson, Kipnis, et al. (2001). In keeping with our notation, since the FFQ is not the truth, we will denote by W the log energy intake as measured by the FFQ. In addition, the investigators obtained a near-perfect biomarker measure of energy intake using a technique called doubly-labeled water (DLW), which we call X. DLW is basically what it sounds like: Participants drink water that is enriched with respect to two isotopes, and urine samples allow the measurement of energy expenditure.

That true intake X and observed intake W can be very different is seen in Figure 1.4, where we plot the FFQ versus the biomarker along with the associated least squares line. The correlation between truth and observed is only 0.28, indicating that the FFQ is not a very good measure of energy intake.

[Figure 1.4: OPEN Study data, scatterplot of the logarithm of energy (calories) using a food frequency questionnaire (vertical axis) and a biomarker (horizontal axis); log scale, correlation = 0.28.]

It is also interesting to note the histograms for these two instruments; see Figure 1.5. One can see there that the FFQ is also clearly badly biased downward in general for energy intake, that is, people eat more calories than they are willing to report (no surprise!).

[Figure 1.5: OPEN Study data, histograms of energy (calories) using a biomarker (top panel) and a food frequency questionnaire (bottom panel). Note how individuals report far fewer calories than they actually consume.]

In this example, because of the biases seen in Figures 1.4 and 1.5, the FFQ is not an unbiased measure of true energy intake, and hence the classical measurement error model (1.1) clearly does not hold. A more reasonable model, promoted in a series of papers by Kipnis et al. (1999, 2001, 2003), is to allow for bias as well as variance components:
    Wij = γ0 + γ1 Xij + Uij,    (1.3)
    Uij = ri + εij,

where ri ∼ Normal(0, σ²r) and εij ∼ Normal(0, σ²ε). In model (1.3), the linear regression in true intake reflects the biases of the FFQ. The structure of the measurement error random variables Uij is that they have two components: a shared component r and a random component ε. Kipnis et al. (1999, 2001, 2003) call the shared component person-specific bias, reflecting the idea that two people who eat exactly the same foods will nonetheless systematically report intakes differently when given multiple FFQs. Fuller (1987) calls the person-specific bias an equation error. Of course, if γ0 = 0, γ1 = 1, and ri ≡ 0, then we have the standard classical measurement error model (1.1).
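A short simulation makes the role of the person-specific bias concrete. The parameter values below are hypothetical, chosen only for illustration; the code is a sketch and not anything from the book.

import numpy as np

rng = np.random.default_rng(3)
n, reps = 500, 2
gamma0, gamma1 = 2.0, 0.6            # intercept and slope bias of the FFQ (hypothetical)
sigma_r, sigma_eps = 0.20, 0.25      # person-specific and purely random error SDs

x = rng.normal(7.5, 0.30, n)         # true log intake
r = rng.normal(0.0, sigma_r, n)      # person-specific bias r_i, shared across reports
w = gamma0 + gamma1 * x[:, None] + r[:, None] + rng.normal(0.0, sigma_eps, (n, reps))

print("corr(X, W):", np.corrcoef(x, w[:, 0])[0, 1])

# The two FFQ reports from the same person remain correlated even after removing
# the part explained by X, because they share r_i; this is the "equation error".
resid = w - (gamma0 + gamma1 * x[:, None])
print("within-person residual correlation:", np.corrcoef(resid[:, 0], resid[:, 1])[0, 1])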
1.6 Other Examples of Measurement Error Models

1.6.1 NHANES

The NHANES-I Epidemiologic Study Cohort data set (Jones, Schatzkin, Green, et al., 1987) is a cohort study originally consisting of 8,596 women who were interviewed about their nutrition habits and later examined for evidence of cancer. We restrict attention to a subcohort of 3,145 women aged 25–50 who have no missing data on the variables of interest. The response Y indicates the presence of breast cancer. The predictor variables Z, assumed to be measured without significant error, include the following: age, poverty index ratio, body mass index, alcohol (Yes, No), family history of breast cancer, age at menarche, and menopausal status. We are primarily interested in the effects of nutrition variables X that are known to be imprecisely measured, for example, "long-term" saturated fat intake.

If all these underlying variables were observable, then a standard logistic regression analysis would be performed. However, it is both difficult and expensive to measure long-term diet in a large cohort. In the NHANES data, instead of observing X, the measured W was a 24-hour recall, that is, each participant's diet in the previous 24 hours was recalled and nutrition variables computed. That the measurement error is large in 24-hour recalls has been documented previously (Beaton, Milner, & Little, 1979; Wu, Whittemore, & Jung, 1986). Indeed, there is evidence to support the conclusion that more than half of the variability in the observed data is due to measurement error.

There are several sources of the measurement error. First, there is the error in the ascertainment of food consumption in the previous 24 hours, especially amounts. Some of this type of error is purely random, while another part is due to systematic bias, for example, some people resist giving an accurate description of their consumption of snacks. The size of potential systematic bias can be determined in some instances (Freedman, Carroll, & Wax, 1991), but in the present study we have available only the 24-hour recall information, and any systematic bias is unidentifiable. The major source of "error" is the fact that a single day's diet does not serve as an adequate measure of the previous year's diet. There are seasonal differences in diet, as well as day-to-day variations. This points out the fact that measurement error is much more than simple recording or instrument error and encompasses many different sources of variability.

There is insufficient information in the NHANES data to model measurement error directly. Instead, the measurement error structure was modeled using an external data set, the CSFII (Continuing Survey of Food Intakes by Individuals) data (Thompson, Sowers, Frongillo, et al., 1992). The CSFII data contain the 24-hour recall measures W, as well as three additional 24-hour recall phone interviews. Using external data, rather than assessing measurement error on an internal subset of the primary study, entails certain risks that we discuss later in this chapter. The basic problem is that parameters in the external study may differ from parameters in the primary study, leading to bias when external estimates are transported to the primary study.

1.6.2 Nurses' Health Study

While the OPEN Study focused on the properties of instruments for measuring nutrient intakes, the real interest is in relating disease and nutrient intakes. A famous and still ongoing study concerning nutrition and breast cancer has been considered by Rosner, Willett, & Spiegelman (1989) and Rosner, Spiegelman, & Willett (1990), namely, the Nurses' Health Study. The study has over 80,000 participants and includes many breast cancer cases. The variables are much the same as in the OPEN study, with the exceptions that (1) alcohol is assessed differently and (2) a food-frequency questionnaire was used instead of 24-hour recall interviews. The size of the measurement error in the nutrition variables is still quite large. Here, X = (long-term average alcohol intake, long-term average nutrient intake) and W = (alcohol intake measured by FFQs, nutrient intake measured by FFQs). It is known that W is both highly variable and biased as an estimator of X.

The Nurses' Health Study was designed so that a direct assessment of measurement error is possible. Specifically, 173 nurses recorded alcohol and nutrient intakes in diary form for four different weeks over the course of a year.
The average, T, of these diary entries is taken to be an unbiased estimate of X. We will call T a second measure of X. Thus, in contrast to NHANES, measurement error was assessed on data internal to the primary study. Because T is unbiased for X, E(T|W) = E(X|W), so we can estimate E(X|W) by regressing T on W. Estimating E(X|W) is the crucial first step in regression calibration, a widely used method of correcting for measurement error; see Chapter 4.

1.6.3 The Atherosclerosis Risk in Communities Study

The Atherosclerosis Risk in Communities (ARIC) study is a multipurpose prospective cohort study described in detail by The ARIC Investigators (1989). From 1987 through 1989, 15,792 male and female volunteers were recruited from four U.S. communities (Forsyth County, NC; suburban Minneapolis, MN; Washington County, MD; and Jackson, MS) for a baseline visit including at-home interviews, clinic examination, and laboratory measurements. Participants returned approximately every three years for second (1990–1992), third (1993–1995), and fourth (1996–98) visits. Time-to-event data were obtained from annual participant interviews and review of local hospital discharge lists and county death certificates. The "event" was chronic kidney disease (CKD).

One purpose of the study was to explain the race effect on the progression of CKD. In particular, African-Americans have maintained approximately four times the age- and sex-adjusted rate of end-stage renal disease (ESRD) compared to whites during the last two decades (USRDS, 2003), while the prevalence of decreased kidney function (CKD Stage 3) in the U.S. is lower among African-Americans than whites. These patterns suggest that African-Americans progress faster through the different stages of kidney disease.

In Chapter 14 we investigate the race effect on the probability of progression to CKD using a survival data approach. An important confounder is the baseline kidney function, which is typically measured by the estimated glomerular filtration rate (eGFR), which is a noisy version of GFR obtained from a prediction equation. The nature of the adjustment is more complex because of the nonmonotonic relationship between eGFR and progression probability.

1.6.4 Bioassay in a Herbicide Study

Rudemo, Ruppert, & Streibig (1989) consider a bioassay experiment with plants, in which eight herbicides were applied. For each of these eight combinations, six (common) nonzero doses were applied and the dry weight Y of five plants grown in the same pot was measured.
In this instance, the predictor variable X of interest is the amount of the herbicide actually absorbed by the plant, a quantity that cannot be measured. Here the response is continuous, and if X were observable, then a nonlinear regression model would have been fit, probably by nonlinear least squares. The four-parameter logistic model (not to be confused with logistic regression where the response is binary) is commonly used. However, X is not observable; instead, we know only the nominal concentration W of herbicide applied to the plant. The sources of error include not only the error in diluting to the nominal concentration, but also the fact that two plants receiving the same amount of herbicide may absorb different amounts.

In this example, the measurement error was not assessed directly. Instead, the authors assumed that the true amount X was linearly related to the nominal amount W with nonconstant variance. This error model, combined with the approach discussed in Chapter 4, was used to construct a new model for the observed data.

1.6.5 Lung Function in Children

Tosteson, Stefanski, & Schafer (1989) described an example in which the response was the presence (Y = 1) or absence (Y = 0) of wheeze in children, which is an indicator of lung dysfunction. The predictor variable of interest is X = personal exposure to NO2. Since Y is a binary variable, if X were observable, the authors would have used logistic or probit regression to model the relationship of Y and X. However, X was not available in their study. Instead, the investigators were able to measure a bivariate variable W, consisting of observed kitchen and bedroom concentrations of NO2 in the child's home. School-aged children spend only a portion of their time in their homes, and only a portion of that time in their kitchens and bedrooms. Thus, it is clear that the true NO2 concentration is not fully explained by what happens in the kitchen and bedroom. While X was not measured in the primary data set, two independent, external studies were available in which both X and W were observed. We will describe this example in more detail later in this chapter.

1.6.6 Coronary Heart Disease and Blood Pressure

The Framingham study (Kannel, Neaton, Wentworth, et al., 1986) is a large cohort study following individuals for the development Y of coronary heart disease. The main predictor of interest in the study is systolic blood pressure, but other variables include age at first exam, body mass, serum cholesterol, and whether or not the person is a smoker.
In principle at least, Z consists only of age, body mass, and smoking status, while the variables X measured with error are serum cholesterol and systolic blood pressure. It should be noted that in a related analysis MacMahon, Peto, Cutler, et al. (1990) consider only the last as a variable measured with error. We will follow this convention in our discussion. Again, it is impossible to measure long-term systolic blood pressure X. Instead, what is available is the blood pressure W observed during a clinic visit. The reason that the long-term X and the single-visit W differ is that blood pressure has major daily, as well as seasonal, variation. Generally, the classical measurement error model (1.1) is used in this context.

In this experiment, we have an extra measurement of blood pressure T from a clinic visit taken 4 years before W was observed. Hence, unlike any of the other studies we have discussed, in the Framingham study we have information on measurement error for each individual. One can look at T as simply a replicate of W. However, T may be a biased measure of X because of temporal changes in the distribution of blood pressure in the population. Each way of looking at the data is useful and leads to different methods of analysis.

1.6.7 A-Bomb Survivors Data

Pierce, Stram, Vaeth, et al. (1992) considered analysis of A-bomb survivor data from the Hiroshima and Nagasaki explosions. They discuss various responses Y, including the number of chromosomal aberrations. The true radiation dose X cannot be measured; instead, estimates W are available. They assume, as an approximation, that W = 0 if and only if X = 0. They adopt a fully parametric approach, specifying that when X and W are positive, then W is lognormal with median X and coefficient of variation of 30%. They assume that if X is positive, it has a Weibull distribution. In symbols, they propose the multiplicative model

    W = X U,    log(U) ∼ Normal(µu, σ²u),

where log(U) is normally distributed with mean zero and variance 0.0862.

1.6.8 Blood Pressure and Urinary Sodium Chloride

Liu & Liang (1992) described a problem of logistic regression where the response Y is the presence of high systolic blood pressure (greater than 140). However, in this particular study blood pressure was measured many times and the average recorded, so that the amount of measurement error in the average systolic blood pressure is reasonably small. The predictors Z measured without error are age and body mass index.
The predictor X subject to measurement error is urinary sodium chloride, which is subject to error because of intra-individual variation over time and also possibly due to measurement error in the chemical analyses. In order to understand the effects of measurement error, 24-hour urinary sodium chloride was measured on 6 consecutive days.

1.6.9 Multiplicative Error for Confidentiality

Hwang (1986) used survey data released by the U.S. Department of Energy on energy consumption by U.S. households. The exact values of certain variables, for example, heating and cooling degree days, were not given since this information might allow the homeowners to be identified. Instead the Department of Energy multiplied these variables by computer-generated random numbers. The Department of Energy released the method for generating the random errors, so this is a rare case where the error distribution is known exactly.

1.6.10 Cervical Cancer and Herpes Simplex Virus

In this example, the question is whether exposure to herpes simplex virus increases the risk of cervical cancer. The data are listed in Carroll, Gail, & Lubin (1993). The response Y is the indicator of invasive cervical cancer, X is exposure to herpes simplex virus, type 2 (HSV-2) measured by a refined western blot procedure, and W is exposure to HSV-2 measured by the western blot procedure. See Hildesheim, Mann, Brinton, et al. (1991) for biological background to this problem. There are 115 complete observations where (Y, X, W) is observed and 1,929 incomplete observations where only (Y, W) is observed. There are 39 cases (Y = 1) among the complete data and 693 cases among the incomplete data. Among the complete data, there is substantial misclassification, that is, observations where X ≠ W. Also, there is evidence of differential error, meaning that the probability of misclassification depends on the response, that is, P(X = W | X = x, Y = 0) ≠ P(X = W | X = x, Y = 1).

1.7 Checking the Classical Error Model

Suppose that the classical additive measurement error model (1.1) holds, and that the errors U are symmetric and have constant variance in both X and any covariates Z measured without error, that is, var(U | Z, X) = σ² (a constant). Then, if the instrument W can be replicated, the sample standard deviation of the W-values for an individual is uncorrelated with the individual's mean, and it is also uncorrelated with Z. Further, suppose that these errors are normally distributed.
Then differences of the replicates within an individual are normally distributed. This leads to simple graphical devices:

• Plot the sample standard deviation of the W-values for an individual against her/his sample mean. If there are no obvious trends, this suggests that the measurement error variance does not depend on X.
• Plot the sample standard deviation of the W-values for an individual against her/his covariates Z. If there are no obvious trends, this suggests that the measurement error variance does not depend on Z.
• Form the differences between replications within an individual, and then form a q-q plot of these differences across individuals. If the q-q plot shows no evidence of nonnormality, this suggests that the measurement errors are also roughly normally distributed.

For example, consider the protein biomarker in the OPEN study; see Section 1.2. In Figure 1.6 we plot the standard deviation of the replicates versus the mean in the original protein scale. The fact that there is an obvious regression slope and the standard deviation of the biomarker varies by a factor of four over the range of the biomarker's mean is strong evidence that, at the very least, the variance of the measurement error depends on true intake.

[Figure 1.6: OPEN Study data, plot of the within-individual standard deviation versus mean of the actual untransformed protein biomarkers. The obvious regression slope indicates that the variance of the measurement error depends on true protein intake.]
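These three checks are easy to script for any replicate-measurement data set. The sketch below applies them to simulated replicates satisfying (1.1), using numpy and matplotlib; for real data one would replace the simulated array w (n subjects by 2 replicates) and the covariate z with the observed values. The q-q plot here is a crude one against a simulated normal reference sample.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
n = 300
z = rng.normal(0.0, 1.0, n)                       # error-free covariate
x = rng.normal(5.7, 0.25, n)                      # true (unobserved) values
w = x[:, None] + rng.normal(0.0, 0.20, (n, 2))    # two replicates under model (1.1)

m = w.mean(axis=1)                                # per-person mean of the replicates
s = w.std(axis=1, ddof=1)                         # per-person standard deviation
d = w[:, 0] - w[:, 1]                             # within-person differences

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].scatter(m, s)                             # check 1: error s.d. versus mean
axes[0].set_xlabel("per-person mean"); axes[0].set_ylabel("per-person s.d.")
axes[1].scatter(z, s)                             # check 2: error s.d. versus Z
axes[1].set_xlabel("Z"); axes[1].set_ylabel("per-person s.d.")
ref = np.sort(rng.normal(0.0, d.std(ddof=1), n))  # check 3: rough normal q-q plot
axes[2].scatter(ref, np.sort(d))
axes[2].set_xlabel("normal reference quantiles"); axes[2].set_ylabel("sorted differences")
fig.tight_layout()
fig.savefig("error_model_checks.png")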
[Figure 1.7: OPEN Study data, plot of the within-individual standard deviation versus mean of the log protein biomarkers. The lack of any major regression slope indicates approximately constant variance measurement error.]

[Figure 1.8: OPEN Study data, q-q plot of the differences of the log protein biomarkers. The nearly straight line of the data indicates nearly normally distributed measurement errors.]
A standard way to remove nonconstant variability is via a transformation, and the obvious first attempt is to take logarithms. Figure 1.7 is the standard deviation versus the mean plot in this transformed scale. In contrast to Figure 1.6, here we see no major trend, suggesting that the transformation was successful in removing most of the nonconstant variation. Figure 1.8 gives the q-q plot of the differences: this is not a perfect straight line, but it is reasonably close to straight, suggesting that the transformation has also helped make the data much closer to normally distributed.

Using differences between replicates to assess normality has its pitfalls. The difference between two iid random variables has a symmetric distribution even when the random variables themselves are highly skewed. Thus, nonnormality of measurement errors is somewhat hidden by using differences. For example, Figure 1.9 is a normal q-q plot of the differences between 200 pairs of Lognormal(0,1) random variables; see Section A.2 for the lognormal distribution. Note that the q-q plot shows no sign of asymmetry. Nonnormality is evident only in the presence of heavier-than-Gaussian tails.

[Figure 1.9: Normal q-q plot of the differences between independent Lognormal(0,1) random variables, n = 200.]
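The pitfall is easy to verify numerically: differences of two iid lognormal errors have essentially zero skewness even though the errors themselves are strongly skewed. A minimal sketch, assuming numpy:

import numpy as np

rng = np.random.default_rng(5)
u1 = rng.lognormal(0.0, 1.0, 200)
u2 = rng.lognormal(0.0, 1.0, 200)
d = u1 - u2

def skewness(a):
    a = a - a.mean()
    return np.mean(a**3) / np.mean(a**2) ** 1.5

print("skewness of the errors:     ", round(skewness(u1), 2))   # strongly positive
print("skewness of the differences:", round(skewness(d), 2))    # near zero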
1.8 Loss of Power

Classical measurement error causes loss of power, sometimes a profound loss of power. We illustrate this in two situations: linear regression and radiation epidemiology.

1.8.1 Linear Regression Example

Here we consider the simple linear regression model Yi = β0 + βx Xi + εi, where β0 = 0.0, βx = 0.69, var(X) = var(ε) = 1.0, and the sample size is n = 20. The results here are based on exact calculations using the program nQuery Advisor. The slope was chosen so that, when X is observed, there is approximately 90% power for a one-sided test of the null hypothesis H0: βx = 0.

We added classical measurement error to the true Xs using the model (1.1), where we varied the variance of the measurement errors U from 0.0 to 2.0. When var(U) = 0.0, we are in the case that there is no classical measurement error, and the power is 90%.
When the measurement error variance is var(U) = 1.0, this means that the observed predictors have variance var(W) = var(X) + var(U) = 2.0, and hence 1/2 of the variability in the observed predictors is due to noise. At the extreme with var(U) = 2.0, 2/3 of the variability in the observed predictors is due to noise.

The results are displayed in Figure 1.10. Here we see that while the power would be 90% if X could be observed, when the measurement error variance equals the variance of X, and hence 1/2 of the variability in W is due to noise, the power crashes to 62%. Even worse, when 2/3 of the variability in the observed W is noise, the power falls below 50%. This is the first of the double whammy of measurement error; see Section 1.1.

[Figure 1.10: An illustration of the loss of power when there is classical measurement error. When X is observed, the measurement error variance = 0.0 and the power is 90%. When X is not observed and the measurement error variance = 1.0, 1/2 of the variability of the observed W is due to noise, and the power is only 62%. When 2/3 of the variability of W is due to noise, the power is only 44%.]

The flip side of a loss of power due to classical measurement error is that sample sizes necessary to gain a given power can increase dramatically. The following power calculations were done assuming all variances are known, and so should be interpreted qualitatively. In Figure 1.11, we show that while only n = 20 is required for 90% power when there is no measurement error, when 1/2 of the variability in the observed predictor W is due to noise, we require at least n = 45 observations, an increase of more than 100%.

[Figure 1.11: The sample size version of Figure 1.10. When there is no measurement error, the sample size needed for 90% power is n = 20. When X is not observed and the measurement error variance = 1.0, so that 1/2 of the variability of the observed W is due to noise, the necessary sample size for 90% power more than doubles to n = 45. When 2/3 of the variability of W is due to noise, the required sample size is n > 70.]
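The power values above come from exact calculations in nQuery Advisor. A rough Monte Carlo version of the same calculation, written as a sketch in Python with a naive least-squares analysis that ignores the measurement error, gives broadly similar values; it is only an illustration, not the book's computation.

import numpy as np

rng = np.random.default_rng(6)
beta0, betax, n = 0.0, 0.69, 20
tcrit = 1.734                    # one-sided 5% critical value of t with n - 2 = 18 df

def naive_power(var_u, nsim=10000):
    rejections = 0
    for _ in range(nsim):
        x = rng.normal(0.0, 1.0, n)                      # true covariate, var(X) = 1
        y = beta0 + betax * x + rng.normal(0.0, 1.0, n)  # var(eps) = 1
        w = x + rng.normal(0.0, np.sqrt(var_u), n)       # observed, classical error
        wc = w - w.mean()
        bhat = np.sum(wc * y) / np.sum(wc ** 2)          # naive slope: regress Y on W
        resid = y - y.mean() - bhat * wc
        se = np.sqrt(np.sum(resid ** 2) / (n - 2) / np.sum(wc ** 2))
        rejections += bhat / se > tcrit
    return rejections / nsim

for var_u in (0.0, 1.0, 2.0):
    print(var_u, round(naive_power(var_u), 2))
# Power falls as var(U) grows; the book's exact figures are 90%, 62%, and 44%.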
  • 49. Other documents randomly have different content
  • 50. FOOTNOTES [1] Dugald Stewart wrongly describes him as a Writer to the Signet, confusing him with a contemporary of the same name. [2] See W. R. Scott’s Hutcheson (1900). [3] Even in 1763 there was but one stage-coach in Scotland “which set out [from Edinburgh] once a month for London, and was from twelve to fourteen days on the journey.”—George Robertson’s Rural Recollections, p. 4. [4] See the Wealth of Nations, Book V. ch. i. art. 2. [5] See the Wealth of Nations, Book I. chap. ii. [6] The advertisement goes on to say: “It is long since he found it necessary to abandon that plan as far too extensive; and these parts of it lay beside him neglected till he was dead.” [7] First, Dugald Stewart declares that the History of Astronomy “was one of Mr. Smith’s earliest compositions.” Second, in a letter constituting Hume his literary executor, Smith describes it as a fragment of an intended juvenile work. Thirdly, Stewart heard him say more than once “that he had projected in the earlier part of his life a history of the other sciences on the same plan.” Fourthly, the work exactly fits in with all that we hear of his
  • 51. youthful bent for the Greek geometry and natural philosophy. Fifthly, it must have been written long before 1758, for he mentions a prediction that a certain comet will appear in that year. [8] “The author at the end of his essay,” says the advertisement, “left some notes and memorandums from which it appears he considered this last part of his History of Astronomy as imperfect and needing several additions.” It consists of 135 pages, and the imperfections are not obvious to the reader. [9] Moral Sentiments, Part III. chap. ii. p. 210 of the second, third, and fourth editions; chap. iii. of the sixth edition. [10] Mr. Rae, usually the most accurate of authorities, states that the first edition appeared “in two volumes 8vo.” [11] The crude theory that sympathy is the foundation of altruism was noticed by Hutcheson. In his System of Moral Philosophy (B. I. ch. iii.) he writes: “Others say that we regard the good of others, or of societies ... as the means of some subtiler pleasures of our own by sympathy with others in their happiness.” But this sympathy, he adds, “can never account for all kind affections, tho’ it is no doubt a natural principle and a beautiful part of our constitution.” [12] Mr. Rae’s Life of Adam Smith, pp. 148-9. Mr. Rae also says that it contained none of the alterations or additions that Hume expected, and expresses surprise that the additions, etc., which had been placed in the printer’s hands in 1760 were not incorporated in the text until the publication of the sixth edition thirty years afterwards. On the other hand, he says that the
  • 52. Dissertation on the Origin of Languages was added. But the Dissertation was first appended in the third edition (1767). [13] See Moral Sentiments, 1st edition, p. 464. [14] Origine de l’inégalité. Partie première, pp. 376, 377. Édition d’Amsterdam des œuvres diverses de J. J. Rousseau. The reference is from Moral Sentiments, 3rd ed. p. 440. [15] Millar adds: “The great Montesquieu pointed out the road. He was the Lord Bacon in this branch of philosophy. Dr. Smith is the Newton.” [16] Cp. Wealth of Nations, Book I. chap. iii. [17] And even Hume, as Smith warned his class, had not quite emancipated himself from mercantilist misconceptions. [18] Lectures, p. 241: “Excise raises the price of commodities and makes fewer people able to carry on business. If a man purchase £1000 worth of tobacco he has a hundred pounds of tax to pay, and therefore cannot deal to such an extent as he would otherwise do. Thus, as it requires greater stock to carry on trade, the dealers must be fewer, and the rich have, as it were, a monopoly against the poor.” [19] Uztariz, Theory and Practice of Commerce and Maritime Affairs, translated by John Kippax, 1751, vol. ii. p. 52. The allusion has been discovered by Mr. Edwin Cannan. See Lectures, p. 246.
  • 53. [20] Wealth of Nations (1776), Book V. chap. i. art. 2. [21] Tytler’s Kames, i. p. 278. [22] See Faujas Saint-Fond, Travels in England and Scotland, vol. ii. p. 241. [23] See Garrick Correspondence, vol. ii. pp. 549, 550. [24] See letter from Adam Smith to T. Cadell printed in the Economic Journal for September 1898. It appears that the last two books he had ordered were Postlethwait’s Dictionary of Trade and Anderson’s Deduction of the Origin of Commerce. Neither appears in Mr. Bonar’s catalogue of his library. [25] At Kirkcaldy George Drysdale, for some time Provost of the town and afterwards Collector of Customs, was a “steady and much esteemed friend.” His more distinguished brother, Dr. John Drysdale the minister, had been at school with Smith, and “among all his numerous friends and acquaintances,” says Dalzel, there was none “whom he loved with greater affection or spoke of with greater tenderness.” They often met in Kirkcaldy and Edinburgh. The death of James Oswald, who represented Kirkcaldy, early in 1769, was a serious loss to the little society, and particularly to Smith. [26] Steuart’s Political Economy, 1767.
  • 54. [27] The most important of these (in Book IV. chap, vii.) appear for the first time in the third edition (1784). [28] Letter to Cullen, London, 20th September 1774. [29] Mr. Macpherson’s recent abridgment is the only tolerable one I know of, and that solely because it carefully retains many of the finest chapters, and leaves the flesh on the bones. [30] A public pawnshop. [31] Charles Butler, the learned Catholic lawyer, once mentioned to Fox that he had never read the Wealth of Nations. “To tell you the truth,” said Fox, “nor I either. There is something in all these subjects which passes my comprehension; something so wide that I could never embrace them myself or find any one who did.” [32] See Book IV. chap. vii. [33] See Skarzinski’s Adam Smith (1878), quoted by Oncken, Economic Journal, vol. vii. p. 445. [34] See Ruskin’s Fors Clavigera, letters 62 and 72. [35] Smith avoids the error so commonly committed in modern doctrines of international trade, of regarding a nation as a trading unit.
  • 55. [36] The second case is simple and uncontroversial. If there is an excise duty upon a home product, it seems reasonable, says Smith, that an equal tax should be imposed in the shape of an import duty upon the same product imported from abroad. [37] The author of Douglas. [38] Written from Kirkcaldy, November 9, 1776. [39] In the Budget of 1778 North adopted two more important recommendations: the inhabited house duty, which is still with us, and the malt tax, which was commuted for the beer duty by Mr. Gladstone in 1880. The house tax proved very productive, as taxes went in those days, its yield rising from £26,000 in 1779 to £108,000 in 1782. [40] Sir Gray Cooper was Secretary to the Treasury. [41] Rae’s Life of Adam Smith, p. 326. [42] See the Life of Smith by William Smellie, a contemporary. [43] See Sinclair’s Life of Sir John Sinclair, vol. i. p. 39. [44] Edinburgh, 15th December 1783. The letter is printed in the Journals and Correspondence of Lord Auckland, vol. i. p. 64.
  • 56. [45] Sir Gilbert Elliot wrote from Edinburgh, July 25, 1782, to his wife: —“I have found one just man in Gomorrah, Adam Smith, author of the Wealth of Nations. He was the Duke of Buccleuch’s tutor, is a wise and deep philosopher, and although made Commissioner of the Customs here by the Duke and Lord Advocate, is what I call an honest fellow. He wrote a most kind as well as elegant letter to Burke on his resignation, as I believe I told you before, and on my mentioning it to him he told me he was the only man here who spoke out for the Rockinghams.”—Life of Lord Minto, vol. i. p. 84. [46] Afterwards Lord Lauderdale, a finished economist, who passed some ingenious criticisms on the Wealth of Nations. [47] See Dugald Stewart’s Memoir, section V. [48] Mr. Rae, the only one of Smith’s biographers, I think, who has noticed Saint-Fond’s visit, dates it wrongly (in 1782), and says the account was published in 1783. The journey took place in 1784, and the account was published in 1797. An English translation appeared two years later. [49] This appeared in 1786 with a prefatory note expressing the author’s grateful obligations to Mr. Henry Hope of Amsterdam, for his information concerning the great Dutch Bank. [50] In his first will Gibbon left a legacy of £100 to Adam Smith. [51] In his Defence of Usury, “Letter XIII. to Dr. Smith,” Bentham had written: “Instead therefore of pretending to owe you nothing, I
  • 57. 237 shall begin with acknowledging that, as far as your trade coincides with mine, I should come much nearer the truth were I to say I owed you everything.” Mr. Rae (Life of Adam Smith, p. 424) quotes a letter from George Wilson to Bentham, in the Bentham MSS., British Museum. I may add to this the following note which I find in Bentham’s Rationale of Reward (1825), p. 332, in chapter xvi. of Book IV., on Rates of Interest. “Adam Smith, after having read the letter upon Projects, which was addressed to him, and printed at the end of the first edition of the Defence of Usury, declared to a gentleman, the common friend of the two authors, that he had been deceived. With the tidings of his death Mr. Bentham received a copy of his works, which had been sent to him as a token of esteem.”
  • 58. INDEX A B C D E F G H I J K L M N O P Q R S T U V W X Y Z A Alembert, 132, 139. American colonies, 163, 176-9. Aristotle, 6, 24-6, 37, 53, 74, 194. Armaments, 172-4. Astronomy, History of, 16-18. B Bacon, 5, 74 n., 118-19. Bagehot (quoted), 204. Balliol College, 9-12. Banks (in Scotland), 101. Beauclerk, 160-1. Bee, The, 21. Bentham, Jeremy, 12, 184, 216; his Defence of Usury, 231-2. Black, Joseph, 83, 96-7, 99, 208, 231, 233. Bordeaux, 123, 141. Boswell, James, 19, 161, 164. Brougham, Lord, 14. Buccleuch, Duke of, 111-14, 131, 135, 150, 153, 157, 163, 213. Buchan, Lord, 21, 99. Buckle, Henry Thomas, 63, 64. Burke, Edmund, 20, 30, 47, 49, 67, 75, 112, 160-2, 171, 174, 221-3, 226, 235-6. Butler, Bishop, 12, 51, 54.
  • 59. 238 C Calas, Jean, the case of, 124-5. Cannan, Edwin, 71, 78-9, 90 n., 169; the Lectures, 182. Carlyle, Dr. Alexander, 101, 104, 105, 151, 226. Clubs—the Poker, 107-9; the Literary, 160, 161, 212, 216; the Oyster, 216-18, 230. Cobden, Richard, 78, 175, 184, 189-91. Cochrane, Andrew, 101-2. Colbert, Abbé, 121-3. Colliers, 76-7. Colonies, 145-9, 175-80. Condorcet, 133. Cullen, Dr., 26-7, 157, 226. Customs, 88 sqq., 196 sqq., 213-15. D Dalkeith House, 150-1. Dalrymple, Sir John, 21, 95, 99, 101, 104-5, 216. Degrees, medical, 157-60. Descartes, 17, 55. Douglas, David, 213, 234. —— Jane, 213, 223, 229. —— John, Bishop of Salisbury, 9, 160. Drysdale, John, 3. Dundas, Henry, 201, 228. Dunlop, Alexander, 4. E Eden, William, 199, 201-20, 227, 237 Edinburgh, 4, 78, 100, 103, 105 sqq., 153, 206, 213 sqq. Encyclopædia, the, 118-20. England, wealth of, 139-42. Enville, Duchess of, 128, 131. Epictetus, 55, 56.
  • 60. Excise, 88-91, 191 n. Exports, theory of, 86 sqq., 190 sqq. F Ferguson, Adam, 128, 216. Ferney, 127-8. Foulis, Robert, printer, 21, 95, 97-9. Fox, Charles James, 174, 212, 221-2. France, 86-7, 118 sqq., 188, 235. Franklin, Benjamin, 108, 161-2. Free Trade, 88, 142, 176, 188 sqq.; (chapter x.), 220. G Garrick, David, 130, 160, 212. Geneva, 126-8. Gibbon, 12, 13, 131, 157, 160, 164, 212, 216, 229-31. Gladstone, W. E., 165, 193. Glasgow, 4-9, 11, 23, 27, 78, 95 sqq., 100-3, 222. —— University of, 3-9, 94 sqq., 229. Glassford, John, 101. Grotius, 5, 71, 73, 92. H Hamilton of Bangour, 21. Helvétius, 132. Hobbes, Thomas, 36, 51, 71. Holland, 90, 139, 172, 192. Home, Henry (see Kames). —— John, 103, 105. Hume, David, 6, 11, 17, 20, 22, 26, 30, 36, 38, 43, 46 sqq., 51, 60 sqq., 73, 95, 96, 103, 106, 110-11, 113, 129, 130, 136-8, 150 sqq., 163-4, 181, 194, 205-11, 233, 235. Hunter, Sir William, 157. —— John, 227. Hutcheson, Francis, 4, 5, 6, 7, 11, 30, 31, 36-8, 51, 57 n., 62, 64, 73, 97, 181, 229.
  • 61. 239 Hutton, Dr., 216, 217, 233. I Imitative Arts, 16, 17, 19-20, 33, 67. Imports, theory of, 86 sqq., 192 sqq., 220. Ireland, 200-3. J Jardine, George, 30-31. Johnson, Samuel, 19, 109-10, 165. Johnstone, William (see Pulteney). Jurisprudence, 69-72, 78. Justice, 68 sqq. K Kames, Lord, 18, 19, 77, 103. Kant, 40, 58. Kirkcaldy, 1-3, 16, 76, 150-6, 205 sqq. Kraus, Christian Jakob, 185. L Labour, division of, 81, 194-5. Languedoc, 124-6. Law, international, 71, 92-3. List, Friedrich, 185-6, 189, 196. Locke, John, 5, 25, 73, 235-6. Logan, John, 226. Logic, chair of, 23, 30-3. Logic and Metaphysics, History of, 18, 23-8, 31-3. London, 78, 156 sqq., 227-8. Lowe, Robert, 187. M Mackintosh, Sir James, 50, 132. Malebranche, 25.
Malesherbes, 184. Manchester School, 189-91. Mandeville, 36-7, 53-4, 62. Mathematics, 7, 8. Maxims of Rochefoucauld, 54. Mercantile system, 85-8, 197-8. Metaphysics, 26, 32-3; see Logic. Mill, John Stuart, 165, 186-7, 196. Millar, Andrew (the publisher), 46-8, 138, 144. —— John, 31, 33, 37, 68, 74, 99, 222. Milton, 21, 36, 67, 184. Mollien, Count, 143, 184-5. Monopoly, 159, 220. Montesquieu, 68, 73, 76, 215, 235. Morals, Chair of, 26 sqq., 116-17. Moral Sentiments, Theory of, 31, 37-9, 46 sqq., 232. Morellet, 132, 142, 220.

N
Navigation Act, 4, 190-1. Necker, 131-2. Newton, Sir Isaac, 8, 17, 36, 235. North, Lord, 199, 200, 212, 213.

O
Oswald, James, of Dunnikier, 3, 18, 22, 104. Oxford, 9. —— University of, 11-15.

P
Panmure House, 213-14. Paris, 129 sqq., 136-9. Peel, Sir Robert, 193. Physics, History of Ancient, 18. Pitt, the younger, 184, 188, 200, 222, 227. Plato, 24-5, 37, 194.
Police, lectures on, 68-72, 78. Pope, 13, 19, 56. Population, 76. Price, Dr. Richard, 161, 230. Protection (see Free Trade). Pulteney, Sir William, 19, 104, 154-5.

Q
Quesnai, 68, 71, 134-5, 142, 169.

R
Rae, John (quoted), 14, 28, 94, 101, 106, 111, 114, 129, 211, 212, 226 n., 228. Raikes, Thomas, 228. Ramsay, Allan, 105, 110. —— John, of Ochtertyre, 38, 44, 96. Religion, 183. Review, Edinburgh, 109. Revenue of France, 141-2. Reynolds, Sir Joshua, 160. Riccoboni, Madame, 130. Richelieu, Duke of, 123, 127. Rochefoucauld, 129, 131, 133. Rockingham Ministry, 146-7. Rogers, Dr. Charles (quoted), 155. —— Samuel, 9, 127, 133, 230-1. Rousseau, J. J., 65, 136-8, 150, 224, 235. Ruskin, 183.

S
Saint-Fond, Faujas de, 127, 223-6. Schmoller, Professor, 180. Schools (public) in England, 12. Scotland, 9-10, 139-41. Scott, Sir Walter (quoted), 217. Shaftesbury, 31, 36, 51.
Shelburne, Lord, 144, 148, 184, 188. Simson, Robert, 4, 8, 96. Smith, Adam (the elder), 2. —— Margaret, 2, 8. Snell Exhibition, 9, 10, 15. Society, the Select, 105-7. Spectator, Impartial, 56-60, 182. Spain, 86, 87, 145, 175. Stamp Act, 146, 147. Stewart, Dugald (quoted), 2, 5, 13, 14, 21, 68, 102, 105, 131, 132, 139, 151, 214, 234. —— Matthew, 7, 8. Strahan, William, 61, 144, 164, 206 sqq., 226. Strathendry, 2. Sympathy, doctrine of, 57 sqq.

T
Taille, 142. Tax, Land, 89, 142; the French, 142. Taxation, 88 sqq., 170-2, 176 sqq. Theology, Natural, 7, 37. Tocqueville, 125. Tooke, Horne, 124. Toulouse, 124-5, 144. Townshend, Charles, 48-9, 104, 111-15, 135, 147-8. Treaties, Commercial (with France), 200, 220, 227. Turgot, 68, 71, 125, 126, 129, 132-4, 142, 184, 219, 235-6.

U
Union, Act of, 4, 36. Uztariz (quoted), 90.

V
Vingtième, 142. Voltaire, 20, 44, 48, 120, 125, 127, 128, 139, 224, 235.
W
Wages, 140. Wakefield, E. G., 165-6. Walpole, Sir Robert, 91. War, 172-4. Watt, James, 83, 96-7. Wealth of Nations, 2, 12, 15, 22, 32, 33, 63, 69, 81 sqq., 139, 144, 156, 158, 161-2; (chapter ix.), 163 sqq., 213. Wedderburn, Alexander, 19, 47, 109. Wilberforce, William, 228-9. Windham, William, 226. Wordsworth, 20, 21.

Printed by T. and A. Constable, Printers to His Majesty at the Edinburgh University Press
Transcriber’s Notes

Silently corrected a few typos. Retained publication information from the printed edition: this eBook is public-domain in the country of publication. In the text versions only, text in italics is delimited by _underscores_.
*** END OF THE PROJECT GUTENBERG EBOOK ADAM SMITH ***