Probability and Statistics for Engineering and the Sciences 9th Edition Devore J.L.
Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
NINTH EDITION
Australia Brazil Mexico Singapore United Kingdom United States
Probability and Statistics
for Engineering
and the Sciences
Jay Devore
California Polytechnic State University, San Luis Obispo
© 2016, 2012, 2009, Cengage Learning
ALL RIGHTS RESERVED. No part of this work covered by the copyright
herein may be reproduced, transmitted, stored, or used in any form
or by any means graphic, electronic, or mechanical, including but not
limited to photocopying, recording, scanning, digitizing, taping, web
distribution, information networks, or information storage and retrieval
systems, except as permitted under Section 107 or 108 of the 1976
United States Copyright Act, without the prior written permission of
the publisher.
Unless otherwise noted, all items © Cengage Learning
Library of Congress Control Number: 2014946237
ISBN: 978-1-305-25180-9
Cengage Learning
20 Channel Center Street
Boston, MA 02210
USA
Cengage Learning is a leading provider of customized learning solutions
with office locations around the globe, including Singapore, the United
Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at
www.cengage.com/global.
Cengage Learning products are represented in Canada by Nelson
Education, Ltd.
To learn more about Cengage Learning Solutions, visit www.cengage.com.
Purchase any of our products at your local college store or at our
preferred online store www.cengagebrain.com.
Probability and Statistics for Engineering
and the Sciences, Ninth Edition
Jay L. Devore
Senior Product Team Manager: Richard
Stratton
Senior Product Manager: Molly Taylor
Senior Content Developer: Jay Campbell
Product Assistant: Spencer Arritt
Media Developer: Andrew Coppola
Marketing Manager: Julie Schuster
Content Project Manager: Cathy Brooks
Art Director: Linda May
Manufacturing Planner: Sandee Milewski
IP Analyst: Christina Ciaramella
IP Project Manager: Farah Fard
Production Service and Compositor:
MPS Limited
Text and Cover Designer: C Miller Design
For product information and technology assistance, contact us at
Cengage Learning Customer & Sales Support, 1-800-354-9706
For permission to use material from this text or product,
submit all requests online at www.cengage.com/permissions
Further permissions questions can be emailed to
permissionrequest@cengage.com
Printed in the United States of America
Print Number: 01 Print Year: 2014
WCN: 02-200-203
To my beloved grandsons
Philip and Elliot, who are
highly statistically significant.
Contents
1 Overview and Descriptive Statistics
Introduction 1
1.1 Populations, Samples, and Processes 3
1.2 Pictorial and Tabular Methods in Descriptive Statistics 13
1.3 Measures of Location 29
1.4 Measures of Variability 36
Supplementary Exercises 47
Bibliography 51
2 Probability
Introduction 52
2.1 Sample Spaces and Events 53
2.2 Axioms, Interpretations,
and Properties of Probability 58
2.3 Counting Techniques 66
2.4 Conditional Probability 75
2.5 Independence 85
Supplementary Exercises 91
Bibliography 94
3 Discrete Random Variables and Probability
Distributions
Introduction 95
3.1 Random Variables 96
3.2 Probability Distributions for Discrete Random Variables 99
3.3 Expected Values 109
3.4 The Binomial Probability Distribution 117
3.5 Hypergeometric and Negative Binomial Distributions 126
3.6 The Poisson Probability Distribution 131
Supplementary Exercises 137
Bibliography 140
4 Continuous Random Variables and Probability
Distributions
Introduction 141
4.1 Probability Density Functions 142
4.2 Cumulative Distribution Functions
and Expected Values 147
4.3 The Normal Distribution 156
4.4 The Exponential and Gamma Distributions 170
4.5 Other Continuous Distributions 177
4.6 Probability Plots 184
Supplementary Exercises 193
Bibliography 197
5 Joint Probability Distributions
and Random Samples
Introduction 198
5.1 Jointly Distributed Random Variables 199
5.2 Expected Values, Covariance, and Correlation 213
5.3 Statistics and Their Distributions 220
5.4 The Distribution of the Sample Mean 230
5.5 The Distribution of a Linear Combination 238
Supplementary Exercises 243
Bibliography 246
6 Point Estimation
Introduction 247
6.1 Some General Concepts of Point Estimation 248
6.2 Methods of Point Estimation 264
Supplementary Exercises 274
Bibliography 275
7 Statistical Intervals Based on a Single Sample
Introduction 276
7.1 Basic Properties of Confidence Intervals 277
7.2 Large-Sample Confidence Intervals
for a Population Mean and Proportion 285
7.3 Intervals Based on a Normal Population Distribution 295
7.4 Confidence Intervals for the Variance
and Standard Deviation of a Normal Population 304
Supplementary Exercises 307
Bibliography 309
8 Tests of Hypotheses Based on
a Single Sample
Introduction 310
8.1 Hypotheses and Test Procedures 311
8.2 z Tests for Hypotheses about a Population Mean 326
8.3 The One-Sample t Test 335
8.4 Tests Concerning a Population Proportion 346
8.5 Further Aspects of Hypothesis Testing 352
Supplementary Exercises 357
Bibliography 360
9 Inferences Based on Two Samples
Introduction 361
9.1 z Tests and Confidence Intervals for a Difference
Between Two Population Means 362
9.2 The Two-Sample t Test and Confidence Interval 374
9.3 Analysis of Paired Data 382
9.4 Inferences Concerning a Difference Between
Population Proportions 391
9.5 Inferences Concerning Two Population Variances 399
Supplementary Exercises 403
Bibliography 408
10 The Analysis of Variance
Introduction 409
10.1 Single-Factor ANOVA 410
10.2 Multiple Comparisons in ANOVA 420
10.3 More on Single-Factor ANOVA 426
Supplementary Exercises 435
Bibliography 436
11 Multifactor Analysis of Variance
Introduction 437
11.1 Two-Factor ANOVA with $K_{ij} = 1$ 438
11.2 Two-Factor ANOVA with $K_{ij} > 1$ 451
11.3 Three-Factor ANOVA 460
11.4 $2^p$ Factorial Experiments 469
Supplementary Exercises 483
Bibliography 486
12 Simple Linear Regression and Correlation
Introduction 487
12.1 The Simple Linear Regression Model 488
12.2 Estimating Model Parameters 496
12.3 Inferences About the Slope Parameter $\beta_1$ 510
12.4 Inferences Concerning $\mu_{Y \cdot x^*}$ and
the Prediction of Future Y Values 519
12.5 Correlation 527
Supplementary Exercises 537
Bibliography 541
13 Nonlinear and Multiple Regression
Introduction 542
13.1 Assessing Model Adequacy 543
13.2 Regression with Transformed Variables 550
13.3 Polynomial Regression 562
13.4 Multiple Regression Analysis 572
13.5 Other Issues in Multiple Regression 595
Supplementary Exercises 610
Bibliography 618
14 Goodness-of-Fit Tests and Categorical
Data Analysis
Introduction 619
14.1 Goodness-of-Fit Tests When Category
Probabilities Are Completely Specified 620
14.2 Goodness-of-Fit Tests for Composite Hypotheses 627
14.3 Two-Way Contingency Tables 639
Supplementary Exercises 648
Bibliography 651
15 Distribution-Free Procedures
Introduction 652
15.1 The Wilcoxon Signed-Rank Test 653
15.2 The Wilcoxon Rank-Sum Test 661
15.3 Distribution-Free Confidence Intervals 667
15.4 Distribution-Free ANOVA 671
Supplementary Exercises 675
Bibliography 677
16 Quality Control Methods
Introduction 678
16.1 General Comments on Control Charts 679
16.2 Control Charts for Process Location 681
16.3 Control Charts for Process Variation 690
16.4 Control Charts for Attributes 695
16.5 CUSUM Procedures 700
16.6 Acceptance Sampling 708
Supplementary Exercises 714
Bibliography 715
Appendix Tables
A.1 Cumulative Binomial Probabilities A-2
A.2 Cumulative Poisson Probabilities A-4
A.3 Standard Normal Curve Areas A-6
A.4 The Incomplete Gamma Function A-8
A.5 Critical Values for t Distributions A-9
A.6 Tolerance Critical Values for Normal Population Distributions A-10
A.7 Critical Values for Chi-Squared Distributions A-11
A.8 t Curve Tail Areas A-12
A.9 Critical Values for F Distributions A-14
A.10 Critical Values for Studentized Range Distributions A-20
A.11 Chi-Squared Curve Tail Areas A-21
A.12 Approximate Critical Values for the Ryan-Joiner Test of
Normality A-23
A.13 Critical Values for the Wilcoxon Signed-Rank Test A-24
A.14 Critical Values for the Wilcoxon Rank-Sum Test A-25
A.15 Critical Values for the Wilcoxon Signed-Rank Interval A-26
A.16 Critical Values for the Wilcoxon Rank-Sum Interval A-27
A.17 $\beta$ Curves for t Tests A-28
Answers to Selected Odd-Numbered Exercises A-29
Glossary of Symbols/Abbreviations G-1
Index I-1
Preface
Purpose
The use of probability models and statistical methods for analyzing data has become
common practice in virtually all scientific disciplines. This book attempts to provide
a comprehensive introduction to those models and methods most likely to be encountered
and used by students in their careers in engineering and the natural sciences.
Although the examples and exercises have been designed with scientists and engineers
in mind, most of the methods covered are basic to statistical analyses in many
other disciplines, so that students of business and the social sciences will also profit
from reading the book.
Approach
Students in a statistics course designed to serve other majors may be initially skeptical
of the value and relevance of the subject matter, but my experience is that students can
be turned on to statistics by the use of good examples and exercises that blend their
everyday experiences with their scientific interests. Consequently, I have worked hard
to find examples of real, rather than artificial, data—data that someone thought was
worth collecting and analyzing. Many of the methods presented, especially in the later
chapters on statistical inference, are illustrated by analyzing data taken from published
sources, and many of the exercises also involve working with such data. Sometimes
the reader may be unfamiliar with the context of a particular problem (as indeed I
often was), but I have found that students are more attracted by real problems with
a somewhat strange context than by patently artificial problems in a familiar setting.
Mathematical Level
The exposition is relatively modest in terms of mathematical development. Substantial
use of the calculus is made only in Chapter 4 and parts of Chapters 5 and 6. In particular,
with the exception of an occasional remark or aside, calculus appears in the
inference part of the book only—in the second section of Chapter 6. Matrix algebra
is not used at all. Thus almost all the exposition should be accessible to those whose
mathematical background includes one semester or two quarters of differential and
integral calculus.
Content
Chapter 1 begins with some basic concepts and terminology—population, sample,
descriptive and inferential statistics, enumerative versus analytic studies, and so on—
and continues with a survey of important graphical and numerical descriptive methods.
A rather traditional development of probability is given in Chapter 2, followed by probability
distributions of discrete and continuous random variables in Chapters 3 and 4,
respectively. Joint distributions and their properties are discussed in the first part of
Chapter 5. The latter part of this chapter introduces statistics and their sampling distributions,
which form the bridge between probability and inference. The next three
chapters cover point estimation, statistical intervals, and hypothesis testing based on a
single sample. Methods of inference involving two independent samples and paired
data are presented in Chapter 9. The analysis of variance is the subject of Chapters 10
and 11 (single-factor and multifactor, respectively). Regression makes its initial
appearance in Chapter 12 (the simple linear regression model and correlation)
and returns for an extensive encore in Chapter 13. The last three chapters develop
chi-squared methods, distribution-free (nonparametric) procedures, and techniques
from statistical quality control.
Helping Students Learn
Although the book’s mathematical level should give most science and engineering
students little difficulty, working toward an understanding of the concepts and gaining
an appreciation for the logical development of the methodology may sometimes
require substantial effort. To help students gain such an understanding and appreciation,
I have provided numerous exercises ranging in difficulty from many that
involve routine application of text material to some that ask the reader to extend
concepts discussed in the text to somewhat new situations. There are many more
exercises than most instructors would want to assign during any particular course,
but I recommend that students be required to work a substantial number of them. In
a problem-solving discipline, active involvement of this sort is the surest way to
identify and close the gaps in understanding that inevitably arise. Answers to most
odd-numbered exercises appear in the answer section at the back of the text. In
addition, a Student Solutions Manual, consisting of worked-out solutions to virtually
all the odd-numbered exercises, is available.
To access additional course materials and companion resources, please visit
www.cengagebrain.com. At the CengageBrain.com home page, search for the ISBN
of your title (from the back cover of your book) using the search box at the top of
the page. This will take you to the product page where free companion resources can
be found.
New for This Edition
The major change for this edition is the elimination of the rejection region
approach to hypothesis testing. Conclusions from a hypothesis-testing analysis
are now based entirely on P-values. This has necessitated completely rewriting
Section 8.1, which now introduces hypotheses and then test procedures based on
P-values. Substantial revision of the remaining sections of Chapter 8 was then
required, and this in turn has been propagated through the hypothesis-testing
sections and subsections of Chapters 9–15.
Many new examples and exercises, almost all based on real data or actual
problems. Some of these scenarios are less technical or broader in scope than
what has been included in previous editions—for example, investigating the
nocebo effect (the inclination of those told about a drug’s side effects to experience
them), comparing sodium contents of cereals produced by three different
manufacturers, predicting patient height from an easy-to-measure anatomical
characteristic, modeling the relationship between an adolescent mother’s age
and the birth weight of her baby, assessing the effect of smokers’ short-term
abstinence on the accurate perception of elapsed time, and exploring the impact
of phrasing in a quantitative literacy test.
More examples and exercises in the probability material (Chapters 2–5) are based
on information from published sources.
The exposition has been polished whenever possible to help students gain a better
intuitive understanding of various concepts.
Acknowledgments
My colleagues at Cal Poly have provided me with invaluable support and feedback
over the years. I am also grateful to the many users of previous editions who have
made suggestions for improvement (and on occasion identified errors). A special
note of thanks goes to Jimmy Doi for his accuracy checking and to Matt Carlton for
his work on the two solutions manuals, one for instructors and the other for students.
The generous feedback provided by the following reviewers of this and previous
editions has been of great benefit in improving the book: Robert L. Armacost,
University of Central Florida; Bill Bade, Lincoln Land Community College;
Douglas M. Bates, University of Wisconsin–Madison; Michael Berry, West Virginia
Wesleyan College; Brian Bowman, Auburn University; Linda Boyle, University of
Iowa; Ralph Bravaco, Stonehill College; Linfield C. Brown, Tufts University;
Karen M. Bursic, University of Pittsburgh; Lynne Butler, Haverford College;
Troy Butler, Colorado State University; Barrett Caldwell, Purdue University; Kyle
Caudle, South Dakota School of Mines & Technology; Raj S. Chhikara, University
of Houston–Clear Lake; Edwin Chong, Colorado State University; David Clark,
California State Polytechnic University at Pomona; Ken Constantine, Taylor
University; Bradford Crain, Portland State University; David M. Cresap, University
of Portland; Savas Dayanik, Princeton University; Don E. Deal, University of
Houston; Annjanette M. Dodd, Humboldt State University; Jimmy Doi, California
Polytechnic State University–San Luis Obispo; Charles E. Donaghey, University
of Houston; Patrick J. Driscoll, U.S. Military Academy; Mark Duva, University
of Virginia; Nassir Eltinay, Lincoln Land Community College; Thomas English,
College of the Mainland; Nasser S. Fard, Northeastern University; Ronald Fricker,
Naval Postgraduate School; Steven T. Garren, James Madison University; Mark
Gebert, University of Kentucky; Harland Glaz, University of Maryland; Ken Grace,
Anoka-Ramsey Community College; Celso Grebogi, University of Maryland;
Veronica Webster Griffis, Michigan Technological University; Jose Guardiola,
Texas A&M University–Corpus Christi; K. L. D. Gunawardena, University of
Wisconsin–Oshkosh; James J. Halavin, Rochester Institute of Technology; James
Hartman, Marymount University; Tyler Haynes, Saginaw Valley State University;
Jennifer Hoeting, Colorado State University; Wei-Min Huang, Lehigh University;
Aridaman Jain, New Jersey Institute of Technology; Roger W. Johnson, South
Dakota School of Mines & Technology; Chihwa Kao, Syracuse University;
Saleem A. Kassam, University of Pennsylvania; Mohammad T. Khasawneh, State
University of New York–Binghamton; Kyungduk Ko, Boise State University;
Stephen Kokoska, Colgate University; Hillel J. Kumin, University of Oklahoma;
Sarah Lam, Binghamton University; M. Louise Lawson, Kennesaw State
University; Jialiang Li, University of Wisconsin–Madison; Wooi K. Lim, William
Paterson University; Aquila Lipscomb, The Citadel; Manuel Lladser, University of
Colorado at Boulder; Graham Lord, University of California–Los Angeles; Joseph
L. Macaluso, DeSales University; Ranjan Maitra, Iowa State University; David
Mathiason, Rochester Institute of Technology; Arnold R. Miller, University of
Denver; John J. Millson, University of Maryland; Pamela Kay Miltenberger, West
Virginia Wesleyan College; Monica Molsee, Portland State University; Thomas
Moore, Naval Postgraduate School; Robert M. Norton, College of Charleston;
Steven Pilnick, Naval Postgraduate School; Robi Polikar, Rowan University; Justin
Post, North Carolina State University; Ernest Pyle, Houston Baptist University;
Xianggui Qu, Oakland University; Kingsley Reeves, University of South Florida;
Steve Rein, California Polytechnic State University–San Luis Obispo; Tony
Richardson, University of Evansville; Don Ridgeway, North Carolina State
University; Larry J. Ringer, Texas A&M University; Nabin Sapkota, University
of Central Florida; Robert M. Schumacher, Cedarville University; Ron Schwartz,
Florida Atlantic University; Kevan Shafizadeh, California State University–
Sacramento; Mohammed Shayib, Prairie View A&M; Alice E. Smith, Auburn
University; James MacGregor Smith, University of Massachusetts; Paul J. Smith,
University of Maryland; Richard M. Soland, The George Washington University;
Clifford Spiegelman, Texas A&M University; Jery Stedinger, Cornell University;
David Steinberg, Tel Aviv University; William Thistleton, State University of
New York Institute of Technology; J A Stephen Viggiano, Rochester Institute
of Technology; G. Geoffrey Vining, University of Florida; Bhutan Wadhwa,
Cleveland State University; Gary Wasserman, Wayne State University; Elaine
Wenderholm, State University of New York–Oswego; Samuel P. Wilcock, Messiah
College; Michael G. Zabetakis, University of Pittsburgh; and Maria Zack, Point
Loma Nazarene University.
Preeti Longia Sinha of MPS Limited has done a terrific job of supervising
the book’s production. Once again I am compelled to express my gratitude
to all those people at Cengage who have made important contributions over
the course of my textbook writing career. For this most recent edition, special
thanks go to Jay Campbell (for his timely and informed feedback throughout the
project), Molly Taylor, Ryan Ahern, Spencer Arritt, Cathy Brooks, and Andrew
Coppola. I also greatly appreciate the stellar work of all those Cengage Learning
sales representatives who have labored to make my books more visible to the
statistical community. Last but by no means least, a heartfelt thanks to my wife
Carol for her decades of support, and to my daughters for providing inspiration
through their own achievements.
Jay Devore
1 Overview and Descriptive Statistics
“I took statistics at business school, and it was a transformative
experience. Analytical training gives you a skill set that differentiates
you from most people in the labor market.”
—Laszlo Bock, Senior Vice President of People Operations (in charge of all hiring) at Google
April 20, 2014, The New York Times, interview with columnist Thomas Friedman
“I am not much given to regret, so I puzzled over this one a while.
Should have taken much more statistics in college, I think.”
—Max Levchin, PayPal co-founder, Slide founder
Quote of the week from the Web site of the American Statistical Association on
November 23, 2010
“I keep saying that the sexy job in the next 10 years will be statisticians,
and I’m not kidding.”
—Hal Varian, Chief Economist at Google
August 6, 2009, The New York Times
Introduction
Statistical concepts and methods are not only useful but indeed often indispensable
in understanding the world around us. They provide ways of gaining
new insights into the behavior of many phenomena that you will encounter in
your chosen field of specialization in engineering or science.
The discipline of statistics teaches us how to make intelligent judgments
and informed decisions in the presence of uncertainty and variation. Without
uncertainty or variation, there would be little need for statistical methods or statisticians.
If every component of a particular type had exactly the same lifetime, if
all resistors produced by a certain manufacturer had the same resistance value,
if pH determinations for soil specimens from a particular locale gave identical
results, and so on, then a single observation would reveal all desired information.
An interesting manifestation of variation appeared in connection with
determining the “greenest” way to travel. The article “Carbon Conundrum”
(Consumer Reports, 2008: 9) identified organizations that help consumers
calculate carbon output. The following results on output for a flight from New
York to Los Angeles were reported:
Carbon Calculator                       CO2 (lb)
Terra Pass                                  1924
Conservation International                  3000
Cool It                                     3049
World Resources Institute/Safe Climate      3163
National Wildlife Federation                3465
Sustainable Travel International            3577
Native Energy                               3960
Environmental Defense                       4000
Carbonfund.org                              4820
The Climate Trust/CarbonCounter.org         5860
Bonneville Environmental Foundation         6732
There is clearly rather substantial disagreement among these calculators
as to exactly how much carbon is emitted, characterized in the article as “from
a ballerina’s to Bigfoot’s.” A website address was provided where readers could
learn more about how the various calculators work.
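To put a number on this disagreement, the eleven reported values can be summarized in a few lines. The following is only a minimal sketch in Python using the standard `statistics` module; the variable names are illustrative and are not part of the article or of this text.

```python
from statistics import mean, median

# CO2 output (lb) reported by the 11 carbon calculators in the table above
co2_lb = [1924, 3000, 3049, 3163, 3465, 3577, 3960, 4000, 4820, 5860, 6732]

smallest, largest = min(co2_lb), max(co2_lb)
spread = largest - smallest        # distance between the extreme estimates
center_mean = mean(co2_lb)         # arithmetic average of the estimates
center_median = median(co2_lb)     # middle value of the ordered estimates

print(f"min = {smallest} lb, max = {largest} lb, range = {spread} lb")
print(f"mean = {center_mean:.1f} lb, median = {center_median} lb")
print(f"largest / smallest = {largest / smallest:.2f}")
```

The ratio of the largest to the smallest estimate is one simple way to express how far apart the calculators are; measures of center and spread such as these are taken up in Sections 1.3 and 1.4.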
How can statistical techniques be used to gather information and draw
conclusions? Suppose, for example, that a materials engineer has developed a
coating for retarding corrosion in metal pipe under specified circumstances. If
this coating is applied to different segments of pipe, variation in environmental
conditions and in the segments themselves will result in more substantial corrosion
on some segments than on others. Methods of statistical analysis could be
used on data from such an experiment to decide whether the average amount
of corrosion exceeds an upper specification limit of some sort or to predict how
much corrosion will occur on a single piece of pipe.
Alternatively, suppose the engineer has developed the coating in the belief
that it will be superior to the currently used coating. A comparative experiment
could be carried out to investigate this issue by applying the current coating to
some segments of pipe and the new coating to other segments. This must be done
with care lest the wrong conclusion emerge. For example, perhaps the average
amount of corrosion is identical for the two coatings. However, the new coating
may be applied to segments that have superior ability to resist corrosion and under
less stressful environmental conditions compared to the segments and conditions
for the current coating. The investigator would then likely observe a difference
between the two coatings attributable not to the coatings themselves, but just to
extraneous variation. Statistics offers not only methods for analyzing the results of
experiments once they have been carried out but also suggestions for how experiments
can be performed in an efficient manner to mitigate the effects of variation
and have a better chance of producing correct conclusions.
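The danger described above can be illustrated with a small simulation. The sketch below is hypothetical throughout: the 200 segments, their corrosion values, and the helper `mean_difference` are invented for illustration, and random assignment is shown as one common way of guarding against extraneous variation rather than as the specific procedure the text has in mind. The coating is given no effect at all, so any difference between the two groups comes from how segments were assigned.

```python
import random

# Hypothetical setup: 200 pipe segments whose corrosion depends only on the
# segment itself (made deterministic for clarity); the coating has NO effect.
n_segments = 200
corrosion = [10 + 0.1 * i for i in range(n_segments)]  # higher i = more corrosion-prone

def mean_difference(new_ids, corrosion):
    """Mean corrosion with the new coating minus mean with the current coating."""
    new_set = set(new_ids)
    new_vals = [c for i, c in enumerate(corrosion) if i in new_set]
    cur_vals = [c for i, c in enumerate(corrosion) if i not in new_set]
    return sum(new_vals) / len(new_vals) - sum(cur_vals) / len(cur_vals)

# Careless design: the new coating goes on the 100 most corrosion-resistant segments.
biased_ids = range(n_segments // 2)
biased_diff = mean_difference(biased_ids, corrosion)

# Randomized design: a randomly chosen half of the segments gets the new coating.
rng = random.Random(2016)  # fixed seed so the sketch is reproducible
random_ids = rng.sample(range(n_segments), n_segments // 2)
random_diff = mean_difference(random_ids, corrosion)

print(f"difference with careless assignment:   {biased_diff:.2f}")
print(f"difference with randomized assignment: {random_diff:.2f}")
```

Under the careless assignment the new coating appears far better even though it does nothing, whereas the randomized assignment typically yields a difference much closer to zero.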
1.1 Populations, Samples, and Processes
Engineers and scientists are constantly exposed to collections of facts, or data, both
in their professional capacities and in everyday activities. The discipline of statistics
provides methods for organizing and summarizing data and for drawing conclusions
based on information contained in the data.
An investigation will typically focus on a well-defined collection of objects
constituting a population of interest. In one study, the population might consist of
all gelatin capsules of a particular type produced during a specified period. Another
investigation might involve the population consisting of all individuals who received
a B.S. in engineering during the most recent academic year. When desired information
is available for all objects in the population, we have what is called a census.
Constraints on time, money, and other scarce resources usually make a census
impractical or infeasible. Instead, a subset of the population—a sample—is selected
in some prescribed manner. Thus we might obtain a sample of bearings from a particular
production run as a basis for investigating whether bearings are conforming to
manufacturing specifications, or we might select a sample of last year’s engineering
graduates to obtain feedback about the quality of the engineering curricula.
We are usually interested only in certain characteristics of the objects in a population:
the number of flaws on the surface of each casing, the thickness of each capsule
wall, the gender of an engineering graduate, the age at which the individual graduated,
and so on. A characteristic may be categorical, such as gender or type of malfunction,
or it may be numerical in nature. In the former case, the value of the characteristic is
a category (e.g., female or insufficient solder), whereas in the latter case, the value is a
number (e.g., age = 23 years or diameter = .502 cm). A variable is any characteristic
whose value may change from one object to another in the population. We shall initially
denote variables by lowercase letters from the end of our alphabet. Examples include
x = brand of calculator owned by a student
y = number of visits to a particular Web site during a specified period
z = braking distance of an automobile under specified conditions
Data results from making observations either on a single variable or simultaneously
on two or more variables. A univariate data set consists of observations on a single
variable. For example, we might determine the type of transmission, automatic (A)
or manual (M), on each of ten automobiles recently purchased at a certain dealership,
resulting in the categorical data set
M A A A M A A M A A
The following sample of pulse rates (beats per minute) for patients recently admitted
to an adult intensive care unit is a numerical univariate data set:
88 80 71 103 154 132 67 110 60 105
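The two univariate data sets just given call for different kinds of summaries: categories are counted, whereas numbers can be ordered. A minimal sketch in Python follows; the variable names are illustrative only.

```python
from collections import Counter

# Categorical univariate data: transmission type for ten automobiles
transmissions = "M A A A M A A M A A".split()
counts = Counter(transmissions)                      # frequency of each category
rel_freq = {kind: n / len(transmissions) for kind, n in counts.items()}

# Numerical univariate data: pulse rates (beats per minute)
pulse = [88, 80, 71, 103, 154, 132, 67, 110, 60, 105]
ordered = sorted(pulse)                              # ordering is meaningful for numbers

print("frequencies:", dict(counts))
print("relative frequencies:", rel_freq)
print("ordered pulse rates:", ordered)
```

Ordering the transmission types would be meaningless, which is one practical consequence of the distinction between categorical and numerical variables.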
We have bivariate data when observations are made on each of two variables. Our
data set might consist of a (height, weight) pair for each basketball player on a
team, with the first observation as (72, 168), the second as (75, 212), and so on. If
an engineer determines the value of both x = component lifetime and y = reason
for component failure, the resulting data set is bivariate with one variable numerical
and the other categorical. Multivariate data arises when observations are made
on more than one variable (so bivariate is a special case of multivariate). For example,
a research physician might determine the systolic blood pressure, diastolic blood
pressure, and serum cholesterol level for each patient participating in a study.
Each observation would be a triple of numbers, such as (120, 80, 146). In many
multivariate data sets, some variables are numerical and others are categorical. Thus
the annual automobile issue of Consumer Reports gives values of such variables as
type of vehicle (small, sporty, compact, mid-size, large), city fuel efficiency (mpg),
highway fuel efficiency (mpg), drivetrain type (rear wheel, front wheel, four
wheel), and so on.
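One way to picture bivariate and multivariate observations is as records with one named field per variable. The sketch below uses Python's `typing.NamedTuple`. The class names are hypothetical, the units noted for height and weight are assumptions (the text does not state them), and the component record uses invented values solely to show a numerical variable paired with a categorical one.

```python
from typing import NamedTuple

class Player(NamedTuple):
    height: int  # assumed inches; the text gives no units
    weight: int  # assumed pounds; the text gives no units

class Patient(NamedTuple):
    systolic: int
    diastolic: int
    cholesterol: int

class Component(NamedTuple):
    lifetime: float      # numerical variable x
    failure_reason: str  # categorical variable y

# Bivariate observations quoted in the text
players = [Player(72, 168), Player(75, 212)]

# Multivariate observation quoted in the text: a triple of numbers
patient = Patient(120, 80, 146)

# Hypothetical values, invented only to show a mixed numerical/categorical pair
component = Component(lifetime=1250.0, failure_reason="insufficient solder")

# Pulling out one field across observations recovers a univariate data set
heights = [p.height for p in players]
print(heights, patient, component)
```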
Branches of Statistics
An investigator who has collected data may wish simply to summarize and describe
important features of the data. This entails using methods from descriptive statistics.
Some of these methods are graphical in nature; the construction of histograms, boxplots,
and scatter plots are primary examples. Other descriptive methods involve calculation of
numerical summary measures, such as means, standard deviations, and correlation coefficients.
The wide availability of statistical computer software packages has made these
tasks much easier to carry out than they used to be. Computers are much more efficient
than human beings at calculation and the creation of pictures (once they have received
appropriate instructions from the user!). This means that the investigator doesn’t have
to expend much effort on “grunt work” and will have more time to study the data and
extract important messages. Throughout this book, we will present output from various
packages such as Minitab, SAS, JMP, and R. The R software can be downloaded without
charge from the site http://www.r-project.org. It has achieved great popularity in the
statistical community, and many books describing its various uses are available (it does
entail programming as opposed to the pull-down menus of Minitab and JMP).
Example 1.1
Charity is a big business in the United States. The Web site charitynavigator.com
gives information on roughly 6000 charitable organizations, and there are many
smaller charities that fly below the navigator’s radar screen. Some charities operate
very efficiently, with fundraising and administrative expenses that are only a small
percentage of total expenses, whereas others spend a high percentage of what they
take in on such activities. Here is data on fundraising expenses as a percentage of
total expenditures for a random sample of 60 charities:
6.1 12.6 34.7 1.6 18.8 2.2 3.0 2.2 5.6 3.8
2.2 3.1 1.3 1.1 14.1 4.0 21.0 6.1 1.3 20.4
7.5 3.9 10.1 8.1 19.5 5.2 12.0 15.8 10.4 5.2
6.4 10.8 83.1 3.6 6.2 6.3 16.3 12.7 1.3 0.8
8.8 5.1 3.7 26.3 6.0 48.0 8.2 11.7 7.2 3.9
15.3 16.6 8.8 12.0 4.7 14.7 6.4 17.0 2.5 16.2
Without any organization, it is difficult to get a sense of the data’s most prominent
features—what a typical (i.e., representative) value might be, whether values are
highly concentrated about a typical value or quite dispersed, whether there are any
gaps in the data, what fraction of the values are less than 20%, and so on. Figure 1.1
shows what is called a stem-and-leaf display as well as a histogram. In Section 1.2
we will discuss construction and interpretation of these data summaries. For the
moment, we hope you see how they begin to describe how the percentages are dis-
tributed over the range of possible values from 0 to 100. Clearly a substantial major-
ity of the charities in the sample spend less than 20% on fundraising, and only a few
percentages might be viewed as beyond the bounds of sensible practice. ■
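Two of the questions raised in this example, the fraction of values below 20% and a typical value, can be answered directly from the 60 observations. The following is a minimal sketch; the variable names are ours, and using the median as the "typical" value is an assumption made for illustration (numerical summaries are developed later in the chapter).

```python
from statistics import median

# Fundraising expenses as a percentage of total expenditures (60 charities).
fundraising_pct = [
    6.1, 12.6, 34.7, 1.6, 18.8, 2.2, 3.0, 2.2, 5.6, 3.8,
    2.2, 3.1, 1.3, 1.1, 14.1, 4.0, 21.0, 6.1, 1.3, 20.4,
    7.5, 3.9, 10.1, 8.1, 19.5, 5.2, 12.0, 15.8, 10.4, 5.2,
    6.4, 10.8, 83.1, 3.6, 6.2, 6.3, 16.3, 12.7, 1.3, 0.8,
    8.8, 5.1, 3.7, 26.3, 6.0, 48.0, 8.2, 11.7, 7.2, 3.9,
    15.3, 16.6, 8.8, 12.0, 4.7, 14.7, 6.4, 17.0, 2.5, 16.2,
]

n = len(fundraising_pct)

# Count the charities that spend less than 20% on fundraising.
below_20 = sum(1 for x in fundraising_pct if x < 20)
fraction_below_20 = below_20 / n

# One possible "typical" value: the middle of the ordered data.
typical = median(fundraising_pct)

print(f"n = {n}")
print(f"fraction below 20% = {below_20}/{n} = {fraction_below_20:.2f}")
print(f"median = {typical:.1f}")
```

Here 54 of the 60 charities, or 90%, fall below 20%, which is the "substantial majority" visible in the display.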
Having obtained a sample from a population, an investigator would frequently
like to use sample information to draw some type of conclusion (make an inference
of some sort) about the population. That is, the sample is a means to an end rather
than an end in itself. Techniques for generalizing from a sample to a population are
gathered within the branch of our discipline called inferential statistics.
Example 1.2
Material strength investigations provide a rich area of application for statistical methods.
The article “Effects of Aggregates and Microfillers on the Flexural Properties of
Concrete” (Magazine of Concrete Research, 1997: 81–98) reported on a study of
strength properties of high-performance concrete obtained by using superplasticizers
and certain binders. The compressive strength of such concrete had previously
been investigated, but not much was known about flexural strength (a measure of
ability to resist failure in bending). The accompanying data on flexural strength (in
MegaPascal, MPa, where 1 Pa (Pascal) = 1.45 × 10^-4 psi) appeared in the article
cited:
5.9 7.2 7.3 6.3 8.1 6.8 7.0 7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7 7.8 7.7 11.6 11.3 11.8 10.7
Suppose we want an estimate of the average value of flexural strength for all beams
that could be made in this way (if we conceptualize a population of all such beams,
we are trying to estimate the population mean). It can be shown that, with a high
degree of confidence, the population mean strength is between 7.48 MPa and
8.80 MPa; we call this a confidence interval or interval estimate. Alternatively, this
data could be used to predict the flexural strength of a single beam of this type. With
a high degree of confidence, the strength of a single such beam will exceed
7.35 MPa; the number 7.35 is called a lower prediction bound. ■
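The interval estimate quoted for the flexural strength data can be reproduced with a short calculation. The sketch below rests on assumptions the text does not state at this point: that "a high degree of confidence" means 95%, and that the interval is the usual two-sided t interval for a mean. The critical value 2.056 is the tabulated t value for 26 degrees of freedom and is hardcoded rather than computed.

```python
import math
import statistics

# Flexural strength (MPa) for the 27 beams in the example.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5, 7.0, 6.3, 7.9, 9.0,
    8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

n = len(strength)
mean = statistics.mean(strength)
s = statistics.stdev(strength)      # sample standard deviation (divisor n - 1)

# Assumed: 95% two-sided interval, so use the t critical value for n - 1 = 26 df.
t_crit = 2.056

half_width = t_crit * s / math.sqrt(n)
lower, upper = mean - half_width, mean + half_width

print(f"n = {n}, mean = {mean:.3f}, s = {s:.3f}")
print(f"interval estimate: ({lower:.2f}, {upper:.2f}) MPa")
```

Under these assumptions the endpoints round to 7.48 and 8.80, in agreement with the interval given in the example.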
Stem-and-leaf of FundRsng  N = 60
Leaf Unit = 1.0
  0  0111112222333333344
  0  55556666666778888
  1  0001222244
  1  55666789
  2  01
  2  6
  3  4
  3
  4
  4  8
  5
  5
  6
  6
  7
  7
  8  3
[Histogram: Frequency (vertical axis, 0 to 40) versus FundRsng (horizontal axis, 0 to 90)]
Figure 1.1 A Minitab stem-and-leaf display (tenths digit truncated) and histogram for the charity
fundraising percentage data
The main focus of this book is on presenting and illustrating methods of
inferential statistics that are useful in scientific work. The most important types
of inferential procedures—point estimation, hypothesis testing, and estimation by
confidence intervals—are introduced in Chapters 6–8 and then used in more com-
plicated settings in Chapters 9–16. The remainder of this chapter presents methods
from descriptive statistics that are most used in the development of inference.
Chapters 2–5 present material from the discipline of probability. This mate-
rial ultimately forms a bridge between the descriptive and inferential techniques.
Mastery of probability leads to a better understanding of how inferential procedures
are developed and used, how statistical conclusions can be translated into everyday
language and interpreted, and when and where pitfalls can occur in applying the
methods. Probability and statistics both deal with questions involving populations
and samples, but do so in an “inverse manner” to one another.
In a probability problem, properties of the population under study are
assumed known (e.g., in a numerical population, some specified distribution of the
population values may be assumed), and questions regarding a sample taken from
the population are posed and answered. In a statistics problem, characteristics of a
sample are available to the experimenter, and this information enables the experi-
menter to draw conclusions about the population. The relationship between the
two disciplines can be summarized by saying that probability reasons from the popu-
lation to the sample (deductive reasoning), whereas inferential statistics reasons from
the sample to the population (inductive reasoning). This is illustrated in Figure 1.2.
[Diagram: an arrow labeled "Probability" runs from Population to Sample; an arrow labeled
"Inferential statistics" runs from Sample back to Population]
Figure 1.2 The relationship between probability and inferential statistics
Before we can understand what a particular sample can tell us about the popu-
lation, we should first understand the uncertainty associated with taking a sample
from a given population. This is why we study probability before statistics.
Example 1.3
As an example of the contrasting focus of probability and inferential statistics, consider
drivers’ use of manual lap belts in cars equipped with automatic shoulder belt
systems. (The article “Automobile Seat Belts: Usage Patterns in Automatic Belt
Systems,” Human Factors, 1998: 126–135, summarizes usage data.) In probability,
we might assume that 50% of all drivers of cars equipped in this way in a certain
metropolitan area regularly use their lap belt (an assumption about the population),
so we might ask, “How likely is it that a sample of 100 such drivers will include at
least 70 who regularly use their lap belt?” or “How many of the drivers in a sample
of size 100 can we expect to regularly use their lap belt?” On the other hand, in inferential
statistics, we have sample information available; for example, a sample of 100
drivers of such cars revealed that 65 regularly use their lap belt. We might then ask,
“Does this provide substantial evidence for concluding that more than 50% of all
such drivers in this area regularly use their lap belt?” In this latter scenario, we are
attempting to use sample information to answer a question about the structure of the
entire population from which the sample was selected. ■
In the foregoing lap belt example, the population is well defined and concrete: all
drivers of cars equipped in a certain way in a particular metropolitan area. In Example
1.2, however, the strength measurements came from a sample of prototype beams that
had not been selected from an existing population. Instead, it is convenient to think of
the population as consisting of all possible strength measurements that might be made
under similar experimental conditions. Such a population is referred to as a conceptual
or hypothetical population. There are a number of problem situations in which we fit
questions into the framework of inferential statistics by conceptualizing a population.
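The two probability questions posed in the lap belt example can be answered numerically once a model is chosen. The sketch below assumes that the 100 drivers behave independently, so that the number of lap belt users follows a binomial distribution with n = 100 and p = 0.5; that modeling assumption, and the function name, are ours rather than the text's. The last quantity, the chance of seeing 65 or more users if p really were 0.5, is one way to begin thinking about the inferential question.

```python
from math import comb


def binom_upper_tail(n: int, p: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


n, p = 100, 0.5

# Probability question 1: how likely are at least 70 users in a sample of 100?
prob_at_least_70 = binom_upper_tail(n, p, 70)

# Probability question 2: how many users can we expect in a sample of 100?
expected_users = n * p

# Toward the inferential question: if p were 0.5, how unusual is 65 or more?
tail_at_65 = binom_upper_tail(n, p, 65)

print(f"P(X >= 70) = {prob_at_least_70:.2e}")
print(f"E(X)       = {expected_users:.0f}")
print(f"P(X >= 65) = {tail_at_65:.4f}")
```

Both tail probabilities are small, which suggests that a sample with 65 users would be surprising if only half of all such drivers used their lap belts. Formal procedures for drawing that kind of conclusion come later in the book.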
The Scope of Modern Statistics
These days statistical methodology is employed by investigators in virtually all dis-
ciplines, including such areas as
● molecular biology (analysis of microarray data)
● ecology (describing quantitatively how individuals in various animal and plant
populations are spatially distributed)
● materials engineering (studying properties of various treatments to retard corrosion)
● marketing (developing market surveys and strategies for marketing new products)
● public health (identifying sources of diseases and ways to treat them)
● civil engineering (assessing the effects of stress on structural elements and the
impacts of traffic flows on communities)
As you progress through the book, you’ll encounter a wide spectrum of different sce-
narios in the examples and exercises that illustrate the application of techniques from
probability and statistics. Many of these scenarios involve data or other material
extracted from articles in engineering and science journals. The methods presented
herein have become established and trusted tools in the arsenal of those who work with
data. Meanwhile, statisticians continue to develop new models for describing randomness
and uncertainty and new methodology for analyzing data. As evidence of
the continuing creative efforts in the statistical community, here are titles and capsule
descriptions of some articles that have recently appeared in statistics journals (Journal
of the American Statistical Association is abbreviated JASA, and AAS is short for the
Annals of Applied Statistics, two of the many prominent journals in the discipline):
● “How Many People Do You Know? Efficiently Estimating Personal
Network Size” (JASA, 2010: 59–70): How many of the N individuals at your
college do you know? You could select a random sample of students from the
population and use an estimate based on the fraction of people in this sam-
ple that you know. Unfortunately this is very inefficient for large populations
because the fraction of the population someone knows is typically very small. A
“latent mixing model” was proposed that the authors asserted remedied deficien-
cies in previously used techniques. A simulation study of the method’s effec-
tiveness based on groups consisting of first names (“How many people named
Michael do you know?”) was included as well as an application of the method to
actual survey data. The article concluded with some practical guidelines for the
construction of future surveys designed to estimate social network size.
● “Active Learning Through Sequential Design, with Applications to the
Detection of Money Laundering” (JASA, 2009: 969–981): Money launder-
ing involves concealing the origin of funds obtained through illegal activities.
The huge number of transactions occurring daily at financial institutions makes
detection of money laundering difficult. The standard approach has been to
extract various summary quantities from the transaction history and conduct a
time-consuming investigation of suspicious activities. The article proposes a
more efficient statistical method and illustrates its use in a case study.
● “Robust Internal Benchmarking and False Discovery Rates for Detecting
Racial Bias in Police Stops” (JASA, 2009: 661–668): Allegations of police
actions that are attributable at least in part to racial bias have become a contentious
issue in many communities. This article proposes a new method that is designed
to reduce the risk of flagging a substantial number of “false positives” (individuals
falsely identified as manifesting bias). The method was applied to data on 500,000
pedestrian stops in New York City in 2006; of the 3000 officers regularly involved
in pedestrian stops, 15 were identified as having stopped a substantially greater frac-
tion of Black and Hispanic people than what would be predicted were bias absent.
● “Records in Athletics Through Extreme Value Theory” (JASA, 2008:
1382–1391): The focus here is on the modeling of extremes related to world
records in athletics. The authors start by posing two questions: (1) What is the
ultimate world record within a specific event (e.g., the high jump for women)?
and (2) How “good” is the current world record, and how does the quality of
current world records compare across different events? A total of 28 events
(8 running, 3 throwing, and 3 jumping for both men and women) are consid-
ered. For example, one conclusion is that only about 20 seconds can be shaved
off the men’s marathon record, but that the current women’s marathon record
is almost 5 minutes longer than what can ultimately be achieved. The method-
ology also has applications to such issues as ensuring airport runways are long
enough and that dikes in Holland are high enough.
● “Self-Exciting Hurdle Models for Terrorist Activity” (AAS, 2012: 106–124): The
authors developed a predictive model of terrorist activity by considering the daily
number of terrorist attacks in Indonesia from 1994 through 2007. The model esti-
mates the chance of future attacks as a function of the times since past attacks. One
feature of the model considers the excess of nonattack days coupled with the pres-
ence of multiple coordinated attacks on the same day. The article provides an inter-
pretation of various model characteristics and assesses its predictive performance.
● “Prediction of Remaining Life of Power Transformers Based on Left
Truncated and Right Censored Lifetime Data” (AAS, 2009: 857–879): There
are roughly 150,000 high-voltage power transmission transformers in the United
States. Unexpected failures can cause substantial economic losses, so it is impor-
tant to have predictions for remaining lifetimes. Relevant data can be complicated
because lifetimes of some transformers extend over several decades during which
records were not necessarily complete. In particular, the authors of the article use
data from a certain energy company that began keeping careful records in 1980.
But some transformers had been installed before January 1, 1980, and were still
in service after that date (“left truncated” data), whereas other units were still in
service at the time of the investigation, so their complete lifetimes are not available
(“right censored” data). The article describes various procedures for obtaining an
interval of plausible values (a prediction interval) for a remaining lifetime and for
the cumulative number of failures over a specified time period.
● “The BARISTA: A Model for Bid Arrivals in Online Auctions” (AAS, 2007:
412–441): Online auctions such as those on eBay and uBid often have character-
istics that differentiate them from traditional auctions. One particularly important
difference is that the number of bidders at the outset of many traditional auctions
is fixed, whereas in online auctions this number and the number of resulting bids
are not predetermined. The article proposes a new BARISTA (for Bid ARrivals
In STAges) model for describing the way in which bids arrive online. The model
allows for higher bidding intensity at the outset of the auction and also as the
auction comes to a close. Various properties of the model are investigated and
then validated using data from eBay.com on auctions for Palm M515 personal
assistants, Microsoft Xbox games, and Cartier watches.
● “Statistical Challenges in the Analysis of Cosmic Microwave Background
Radiation” (AAS, 2009: 61–95): The cosmic microwave background (CMB)
is a significant source of information about the early history of the universe. Its
radiation level is uniform, so extremely delicate instruments have been developed
to measure fluctuations. The authors provide a review of statistical issues with
CMB data analysis; they also give many examples of the application of statistical
procedures to data obtained from a recent NASA satellite mission, the Wilkinson
Microwave Anisotropy Probe.
Statistical information now appears with increasing frequency in the popular media,
and occasionally the spotlight is even turned on statisticians. For example, the Nov. 23,
2009, New York Times reported in an article “Behind Cancer Guidelines, Quest for
Data” that the new science for cancer investigations and more sophisticated methods
for data analysis spurred the U.S. Preventive Services Task Force to re-examine guidelines
for how frequently middle-aged and older women should have mammograms.
The panel commissioned six independent groups to do statistical modeling. The
result was a new set of conclusions, including an assertion that mammograms every
two years are nearly as beneficial to patients as annual mammograms, but confer only
half the risk of harms. Donald Berry, a very prominent biostatistician, was quoted as
saying he was pleasantly surprised that the task force took the new research to heart in
making its recommendations. The task force’s report has generated much controversy
among cancer organizations, politicians, and women themselves.
It is our hope that you will become increasingly convinced of the importance
and relevance of the discipline of statistics as you dig more deeply into the book and
the subject. Hopefully you’ll be turned on enough to want to continue your statistical
education beyond your current course.
Enumerative Versus Analytic Studies
W. E. Deming, a very influential American statistician who was a moving force in
Japan’s quality revolution during the 1950s and 1960s, introduced the distinction
between enumerative studies and analytic studies. In the former, interest is focused
on a finite, identifiable, unchanging collection of individuals or objects that make up
a population. A sampling frame—that is, a listing of the individuals or objects to be
sampled—is either available to an investigator or else can be constructed. For exam-
ple, the frame might consist of all signatures on a petition to qualify a certain initia-
tive for the ballot in an upcoming election; a sample is usually selected to ascertain
whether the number of valid signatures exceeds a specified value. As another
example, the frame may contain serial numbers of all furnaces manufactured by a
particular company during a certain time period; a sample may be selected to infer
something about the average lifetime of these units. The use of inferential methods
to be developed in this book is reasonably noncontroversial in such settings (though
statisticians may still argue over which particular methods should be used).
An analytic study is broadly defined as one that is not enumerative in nature. Such
studies are often carried out with the objective of improving a future product by taking
action on a process of some sort (e.g., recalibrating equipment or adjusting the level of
some input such as the amount of a catalyst). Data can often be obtained only on an
existing process, one that may differ in important respects from the future process. There
is thus no sampling frame listing the individuals or objects of interest. For example, a
sample of five turbines with a new design may be experimentally manufactured and
tested to investigate efficiency. These five could be viewed as a sample from the concep-
tual population of all prototypes that could be manufactured under similar conditions,
but not necessarily as representative of the population of units manufactured once regular
production gets underway. Methods for using sample information to draw conclusions
about future production units may be problematic. Someone with expertise in the area
of turbine design and engineering (or whatever other subject area is relevant) should be
called upon to judge whether such extrapolation is sensible. A good exposition of these
issues is contained in the article “Assumptions for Statistical Inference” by Gerald
Hahn and William Meeker (The American Statistician, 1993: 1–11).
Collecting Data
Statistics deals not only with the organization and analysis of data once it has been
collected but also with the development of techniques for collecting the data. If data
is not properly collected, an investigator may not be able to answer the questions
under consideration with a reasonable degree of confidence. One common problem
is that the target population—the one about which conclusions are to be drawn—may
be different from the population actually sampled. For example, advertisers would
like various kinds of information about the television-viewing habits of potential cus-
tomers. The most systematic information of this sort comes from placing monitoring
devices in a small number of homes across the United States. It has been conjectured
that placement of such devices in and of itself alters viewing behavior, so that charac-
teristics of the sample may be different from those of the target population.
When data collection entails selecting individuals or objects from a frame, the
simplest method for ensuring a representative selection is to take a simple random
sample. This is one for which any particular subset of the specified size (e.g., a sam-
ple of size 100) has the same chance of being selected. For example, if the frame
consists of 1,000,000 serial numbers, the numbers 1, 2,…, up to 1,000,000 could
be placed on identical slips of paper. After placing these slips in a box and thor-
oughly mixing, slips could be drawn one by one until the requisite sample size has
been obtained. Alternatively (and much to be preferred), a table of random numbers
or a software package’s random number generator could be employed.
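The software alternative to drawing slips from a box takes only a few lines. This sketch uses Python's standard random module to select 100 of the 1,000,000 serial numbers without replacement; the seed value is arbitrary and is fixed only so that the run can be repeated.

```python
import random

# The frame: serial numbers 1, 2, ..., 1,000,000.
frame = range(1, 1_000_001)

# A dedicated generator; the seed is an arbitrary choice for reproducibility.
rng = random.Random(2016)

# Draw 100 distinct serial numbers; every subset of size 100 is equally likely.
sample = rng.sample(frame, k=100)

print(sorted(sample)[:10])  # the ten smallest selected serial numbers
```

Because the selection is made without replacement, no serial number can appear twice, just as a slip cannot be drawn twice from the box.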
Sometimes alternative sampling methods can be used to make the selection
process easier, to obtain extra information, or to increase the degree of confidence in
conclusions. One such method, stratified sampling, entails separating the population
units into nonoverlapping groups and taking a sample from each one. For example,
a study of how physicians feel about the Affordable Care Act might proceed by
stratifying according to specialty: select a sample of surgeons, another sample of
radiologists, yet another sample of psychiatrists, and so on. This would result in
information separately from each specialty and ensure that no one specialty is over-
or underrepresented in the entire sample.
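Stratified sampling amounts to taking a separate random sample within each group. The sketch below illustrates this for the physician example; the frame of physician identifiers, the stratum sizes, and the sample size of 20 per specialty are all hypothetical and were invented for the illustration.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of the requested size from each stratum.

    strata: dict mapping stratum name -> list of population units
    sizes:  dict mapping stratum name -> number of units to select
    """
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}


# Hypothetical frame: physician identifiers grouped by specialty.
strata = {
    "surgeons": [f"surgeon_{i}" for i in range(500)],
    "radiologists": [f"radiologist_{i}" for i in range(200)],
    "psychiatrists": [f"psychiatrist_{i}" for i in range(300)],
}

# Hypothetical design: the same number of physicians from every specialty.
sizes = {"surgeons": 20, "radiologists": 20, "psychiatrists": 20}

picked = stratified_sample(strata, sizes, seed=1)

for name, units in picked.items():
    print(name, len(units), units[:3])
```

Since the groups do not overlap and each is sampled on its own, every specialty is guaranteed to appear in the final sample with exactly the planned number of physicians.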
Frequently a “convenience” sample is obtained by selecting individuals or
objects without systematic randomization. As an example, a collection of bricks may
be stacked in such a way that it is extremely difficult for those in the center to be
selected. If the bricks on the top and sides of the stack were somehow different from
the others, resulting sample data would not be representative of the population. Often
an investigator will assume that such a convenience sample approximates a random
sample, in which case a statistician’s repertoire of inferential methods can be used;
however, this is a judgment call. Most of the methods discussed herein are based on
a variation of simple random sampling described in Chapter 5.
Engineers and scientists often collect data by carrying out some sort of designed
experiment. This may involve deciding how to allocate several different treatments (such
as fertilizers or coatings for corrosion protection) to the various experimental units (plots
of land or pieces of pipe). Alternatively, an investigator may systematically vary the
levels or categories of certain factors (e.g., pressure or type of insulating material) and
observe the effect on some response variable (such as yield from a production process).
Example 1.4
An article in the New York Times (Jan. 27, 1987) reported that heart attack risk could be
reduced by taking aspirin. This conclusion was based on a designed experiment involving
both a control group of individuals that took a placebo having the appearance of
aspirin but known to be inert and a treatment group that took aspirin according to a
specified regimen. Subjects were randomly assigned to the groups to protect against
any biases and so that probability-based methods could be used to analyze the data. Of
the 11,034 individuals in the control group, 189 subsequently experienced heart attacks,
whereas only 104 of the 11,037 in the aspirin group had a heart attack. The incidence
rate of heart attacks in the treatment group was only about half that in the control group.
One possible explanation for this result is chance variation—that aspirin really doesn’t
have the desired effect and the observed difference is just typical variation in the same
way that tossing two identical coins would usually produce different numbers of heads.
However, in this case, inferential methods suggest that chance variation by itself cannot
adequately explain the magnitude of the observed difference. ■
Example 1.5
An engineer wishes to investigate the effects of both adhesive type and conductor
material on bond strength when mounting an integrated circuit (IC) on a certain substrate.
Two adhesive types and two conductor materials are under consideration.
Two observations are made for each adhesive-type/conductor-material combination,
resulting in the accompanying data:
Adhesive Type   Conductor Material   Observed Bond Strength   Average
      1                 1                    82, 77             79.5
      1                 2                    75, 87             81.0
      2                 1                    84, 80             82.0
      2                 2                    78, 90             84.0
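The Average column of the table can be recomputed from the two observations in each row. The sketch below also computes, for each conductor material, how much higher the type 2 average is than the type 1 average. The dictionary layout and variable names are ours.

```python
from statistics import mean

# Observed bond strengths, keyed by (adhesive type, conductor material).
bond_strength = {
    (1, 1): [82, 77],
    (1, 2): [75, 87],
    (2, 1): [84, 80],
    (2, 2): [78, 90],
}

# Average of the two observations for each combination.
averages = {combo: mean(obs) for combo, obs in bond_strength.items()}

# Type 2 average minus type 1 average, separately for each conductor material.
adhesive_effect = {m: averages[(2, m)] - averages[(1, m)] for m in (1, 2)}

for combo, avg in averages.items():
    print(combo, avg)
print("type 2 minus type 1, by material:", adhesive_effect)
```

The differences are 2.5 for conductor material 1 and 3.0 for conductor material 2, so the advantage of adhesive type 2 is nearly the same for both materials.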
The resulting average bond strengths are pictured in Figure 1.3. It appears that adhe-
sive type 2 improves bond strength as compared with type 1 by about the same
amount whichever one of the conducting materials is used, with the 2, 2 combin-
ation being best. Inferential methods can again be used to judge whether these
effects are real or simply due to chance variation.
[Plot: average strength (vertical axis, ticks at 80 and 85) versus conducting material (1, 2),
with one line for adhesive type 1 and one line for adhesive type 2]
Figure 1.3 Average bond strengths in Example 1.5
Suppose additionally that there are two cure times under consideration and also two
types of IC post coating. There are then 2 · 2 · 2 · 2 = 16 combinations of these four
factors, and our engineer may not have enough resources to make even a single observa-
tion for each of these combinations. In Chapter 11, we will see how the careful selection
of a fraction of these possibilities will usually yield the desired information. ■
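The count of 16 combinations can be checked by listing them. In the sketch below, the level labels 1 and 2 for cure time and post coating are placeholders of our own, since the example does not name the levels.

```python
from itertools import product

# Four factors, each at two levels (labels 1 and 2 are placeholders).
factors = {
    "adhesive_type": (1, 2),
    "conductor_material": (1, 2),
    "cure_time": (1, 2),
    "post_coating": (1, 2),
}

# Every way of choosing one level per factor.
combinations = list(product(*factors.values()))

print(len(combinations))      # number of factor-level combinations
for combo in combinations[:4]:
    print(dict(zip(factors, combo)))
```

Each additional two-level factor doubles the number of combinations, which is why making even one observation per combination quickly becomes expensive.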
EXERCISES Section 1.1 (1–9)
1. Give one possible sample of size 4 from each of the following populations:
   a. All daily newspapers published in the United States
   b. All companies listed on the New York Stock Exchange
   c. All students at your college or university
   d. All grade point averages of students at your college or university
2. For each of the following hypothetical populations, give a plausible sample of size 4:
   a. All distances that might result when you throw a football
   b. Page lengths of books published 5 years from now
   c. All possible earthquake-strength measurements (Richter scale) that might be
      recorded in California during the next year
   d. All possible yields (in grams) from a certain chemical reaction carried out in a
      laboratory
3. Consider the population consisting of all computers of a certain brand and model, and
   focus on whether a computer needs service while under warranty.
   a. Pose several probability questions based on selecting a sample of 100 such computers.
   b. What inferential statistics question might be answered by determining the number of
      such computers in a sample of size 100 that need warranty service?
4. a. Give three different examples of concrete populations and three different examples
      of hypothetical populations.
   b. For one each of your concrete and your hypothetical populations, give an example of
      a probability question and an example of an inferential statistics question.
5. Many universities and colleges have instituted supplemental instruction (SI) programs,
   in which a student facilitator meets regularly with a small group of students enrolled
   in the course to promote discussion of course material and enhance subject mastery.
   Suppose that students in a large statistics course (what else?) are randomly divided
   into a control group that will not participate in SI and a treatment group that will
   participate. At the end of the term, each student’s total score in the course is
   determined.
   a. Are the scores from the SI group a sample from an existing population? If so, what
      is it? If not, what is the relevant conceptual population?
   b. What do you think is the advantage of randomly dividing the students into the two
      groups rather than letting each student choose which group to join?
   c. Why didn’t the investigators put all students in the treatment group? [Note: The
      article “Supplemental Instruction: An Effective Component of Student Affairs
      Programming” (J. of College Student Devel., 1997: 577–586) discusses the analysis
      of data from several SI programs.]
6. The California State University (CSU) system consists of 23 campuses, from San Diego
   State in the south to Humboldt State near the Oregon border. A CSU administrator
   wishes to make an inference about the average distance between the hometowns of
   students and their campuses. Describe and discuss several different sampling methods
   that might be employed. Would this be an enumerative or an analytic study? Explain
   your reasoning.
7. A certain city divides naturally into ten district neighborhoods. How might a real
   estate appraiser select a sample of single-family homes that could be used as a basis
   for developing an equation to predict appraised value from characteristics such as
   age, size, number of bathrooms, distance to the nearest school, and so on? Is the
   study enumerative or analytic?
8. The amount of flow through a solenoid valve in an automobile’s pollution-control
   system is an important characteristic. An experiment was carried out to study how flow
   rate depended on three factors: armature length, spring load, and bobbin depth. Two
   different levels (low and high) of each factor were chosen, and a single observation
   on flow was made for each combination of levels.
   a. The resulting data set consisted of how many observations?
   b. Is this an enumerative or analytic study? Explain your reasoning.
9. In a famous experiment carried out in 1882, Michelson and Newcomb obtained 66
   observations on the time it took for light to travel between two locations in
   Washington, D.C. A few of the measurements (coded in a certain manner) were
   31, 23, 32, 36, 22, 26, 27, and 31.
   a. Why are these measurements not identical?
   b. Is this an enumerative study? Why or why not?
1.2 Pictorial and Tabular Methods
in Descriptive Statistics
Descriptive statistics can be divided into two general subject areas. In this section, we
consider representing a data set using visual displays. In Sections 1.3 and 1.4, we will
develop some numerical summary measures for data sets. Many visual techniques
may already be familiar to you: frequency tables, tally sheets, histograms, pie charts,
bar graphs, scatter diagrams, and the like. Here we focus on a selected few of these
techniques that are most useful and relevant to probability and inferential statistics.
Notation
Some general notation will make it easier to apply our methods and formulas to a
wide variety of practical problems. The number of observations in a single sample,
that is, the sample size, will often be denoted by n, so that n = 4 for the sample of
universities {Stanford, Iowa State, Wyoming, Rochester} and also for the sample of
pH measurements {6.3, 6.2, 5.9, 6.5}. If two samples are simultaneously under con-
sideration, either m and n or n1 and n2 can be used to denote the numbers of observa-
tions. An experiment to compare thermal efficiencies for two different types of diesel
engines might result in samples {29.7, 31.6, 30.9} and {28.7, 29.5, 29.4, 30.3}, in
which case m = 3 and n = 4.
Given a data set consisting of n observations on some variable x, the individual
observations will be denoted by $x_1, x_2, x_3, \ldots, x_n$. The subscript bears no relation
to the magnitude of a particular observation. Thus $x_1$ will not in general be the smallest
observation in the set, nor will $x_n$ typically be the largest. In many applications,
$x_1$ will be the first observation gathered by the experimenter, $x_2$ the second, and so
on. The ith observation in the data set will be denoted by $x_i$.
Stem-and-Leaf Displays
Consider a numerical data set $x_1, x_2, \ldots, x_n$ for which each $x_i$ consists of at least two
digits. A quick way to obtain an informative visual representation of the data set is
to construct a stem-and-leaf display.
Constructing a Stem-and-Leaf Display
1. Select one or more leading digits for the stem values. The trailing digits
become the leaves.
2. List possible stem values in a vertical column.
3. Record the leaf for each observation beside the corresponding stem value.
4. Indicate the units for stems and leaves someplace in the display.
For a data set consisting of exam scores, each between 0 and 100, the score of 83
would have a stem of 8 and a leaf of 3. If all exam scores are in the 90s, 80s, and
70s (an instructor’s dream!), use of the tens digit as the stem would give a display
with only three rows. In this case, it is desirable to stretch the display by repeating
each stem value twice—9H, 9L, 8H, . . . ,7L—once for high leaves 9, . . . , 5 and
again for low leaves 4, . . . , 0. Then a score of 93 would have a stem of 9L and
leaf of 3. In general, a display based on between 5 and 20 stems is recommended.
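The construction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stem_and_leaf`, the `split` flag, and the exam scores are hypothetical, the data are nonnegative integers, the stem is everything but the ones digit, and leaves are sorted within each row (which, as noted later, is optional by hand).

```python
from collections import defaultdict

def stem_and_leaf(scores, split=False):
    """Return the rows of a stem-and-leaf display for nonnegative integers.

    Stem = all digits except the ones digit; leaf = ones digit.
    With split=True each stem is listed twice: L for leaves 0-4, H for leaves 5-9.
    """
    rows = defaultdict(list)
    for score in scores:
        stem, leaf = divmod(score, 10)
        half = ("H" if leaf >= 5 else "L") if split else ""
        rows[(stem, half)].append(leaf)
    stems = [stem for stem, _ in rows]
    halves = ("L", "H") if split else ("",)
    lines = []
    # List every possible stem between the smallest and largest, even if empty
    for stem in range(min(stems), max(stems) + 1):
        for half in halves:
            leaves = "".join(str(d) for d in sorted(rows[(stem, half)]))
            lines.append(f"{stem}{half} | {leaves}")
    return lines

# Made-up exam scores, all in the 70s, 80s, and 90s
scores = [83, 93, 97, 78, 71, 85, 89, 90, 74, 88]
for line in stem_and_leaf(scores, split=True):
    print(line)
print("Stem: tens digit   Leaf: ones digit")
```

With `split=True` the score 93 lands in the 9L row with leaf 3, as described in the text; with `split=False` the same data would occupy only three rows.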
Example 1.6
A common complaint among college students is that they are getting less sleep than
they need. The article “Class Start Times, Sleep, and Academic Performance in
College: A Path Analysis” (Chronobiology Intl., 2012: 318–335) investigated factors
that impact sleep time. The stem-and-leaf display in Figure 1.4 shows the average
number of hours of sleep per day over a two-week period for a sample of 253 students.
 5L | 00
 5H | 6889
 6L | 000111123444444
 6H | 55556778899999
 7L | 000011111112222223333333344444444
 7H | 55555555666666666666777777888888888999999999999999
 8L | 00000000000011111122222222222222222333333333334444444444444
 8H | 5555555566666666677777788888888899999999999
 9L | 00001111111222223334
 9H | 666678999
10L | 00
10H | 56
Stem: ones digit
Leaf: tenths digit
Figure 1.4 Stem-and-leaf display for average sleep time per day
The first observation in the top row of the display is 5.0, corresponding to a
stem of 5 and leaf of 0, and the last observation at the bottom of the display is 10.6.
Note that in the absence of a context, without the identification of stem and leaf
digits in the display, we wouldn’t know whether the observation with stem 7 and
leaf 9 was .79, 7.9, or 79. The leaves in each row are ordered from smallest to larg-
est; this is commonly done by software packages but is not necessary if a display is
created by hand.
The display suggests that a typical or representative sleep time is in the stem
8L row, perhaps 8.1 or 8.2. The data is not highly concentrated about this typical
value as would be the case if almost all students were getting between 7.5 and 9.5
hours of sleep on average. The display appears to rise rather smoothly to a peak in
the 8L row and then decline smoothly (we conjecture that the minor peak in the 6L
row would disappear if more data was available). The general shape of the display
is rather symmetric, bearing strong resemblance to a bell-shaped curve; it does not
stretch out more in one direction than the other. The two smallest and two largest
values seem a bit separated from the remainder of the data—perhaps they are very
mild, but certainly not extreme, “outliers.” A reference in the cited article suggests
that individuals in this age group need about 8.4 hours of sleep per day. So it appears
that a substantial percentage of students in the sample are sleep deprived. ■
A stem-and-leaf display conveys information about the following aspects of
the data:
● identification of a typical or representative value
● extent of spread about the typical value
● presence of any gaps in the data
● extent of symmetry in the distribution of values
● number and locations of peaks
● presence of any outliers—values far from the rest of the data
Example 1.7
Figure 1.5 presents stem-and-leaf displays for a random sample of lengths of golf
courses (yards) that have been designated by Golf Magazine as among the most
challenging in the United States. Among the sample of 40 courses, the shortest is 6433 yards
long, and the longest is 7280 yards. The lengths appear to be distributed in a roughly
uniform fashion over the range of values in the sample. Notice that a stem choice here of
either a single digit (6 or 7) or three digits (643, … , 728) would yield an uninformative
display, the first because of too few stems and the latter because of too many.
(a)
64 | 35 64 33 70             Stem: Thousands and hundreds digits
65 | 26 27 06 83             Leaf: Tens and ones digits
66 | 05 94 14
67 | 90 70 00 98 70 45 13
68 | 90 70 73 50
69 | 00 27 36 04
70 | 51 05 11 40 50 22
71 | 31 69 68 05 13 65
72 | 80 09
(b)
Stem-and-leaf of yardage  N = 40
Leaf Unit = 10
  4   64  3367
  8   65  0228
 11   66  019
 18   67  0147799
 (4)  68  5779
 18   69  0023
 14   70  012455
  8   71  013666
  2   72  08
Figure 1.5 Stem-and-leaf displays of golf course lengths: (a) two-digit leaves; (b) display from
Minitab with truncated one-digit leaves
Statistical software packages do not generally produce displays with multiple-
digit stems. The Minitab display in Figure 1.5(b) results from truncating each obser-
vation by deleting the ones digit. ■
Dotplots
A dotplot is an attractive summary of numerical data when the data set is reasonably
small or there are relatively few distinct data values. Each observation is represented
by a dot above the corresponding location on a horizontal measurement scale. When
a value occurs more than once, there is a dot for each occurrence, and these dots are
stacked vertically. As with a stem-and-leaf display, a dotplot gives information about
location, spread, extremes, and gaps.
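To make the stacking rule concrete, here is a minimal text-mode sketch in Python. The function name `text_dotplot` and the data are made up for illustration, and the sketch assumes small nonnegative integer values (one three-character column per integer on the scale); a real dotplot would place dots on a continuous measurement scale.

```python
from collections import Counter

def text_dotplot(values):
    """Return the lines of a text dotplot for integer data.

    One column per integer from the minimum to the maximum;
    repeated values are stacked vertically above their location.
    """
    counts = Counter(values)
    lo, hi = min(values), max(values)
    lines = []
    # Build the plot from the top level down to the first level of dots
    for level in range(max(counts.values()), 0, -1):
        row = "".join(" * " if counts[x] >= level else "   " for x in range(lo, hi + 1))
        lines.append(row.rstrip())
    # The last line is the horizontal measurement scale
    lines.append("".join(f"{x:^3}" for x in range(lo, hi + 1)).rstrip())
    return lines

# Made-up data: 6 occurs three times, 5 twice, and there is a gap at 7 and 8
data = [3, 5, 5, 6, 6, 6, 9]
plot = text_dotplot(data)
print("\n".join(plot))
```

The empty columns above 7 and 8 show the gap in the data, and the isolated dot above 9 is the kind of extreme value a dotplot makes easy to spot.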
Example 1.8
There is growing concern in the U.S. that not enough students are graduating from
college. America used to be number 1 in the world for the percentage of adults
with college degrees, but it has recently dropped to 16th. Here is data on the
percentage of 25- to 34-year-olds in each state who had some type of postsecondary
degree as of 2010 (listed in alphabetical order, with the District of Columbia
included):
31.5 32.9 33.0 28.6 37.9 43.3 45.9 37.2 68.8 36.2 35.5
40.5 37.2 45.3 36.1 45.5 42.3 33.3 30.3 37.2 45.5 54.3
37.2 49.8 32.1 39.3 40.3 44.2 28.4 46.0 47.2 28.7 49.6
37.6 50.8 38.0 30.8 37.6 43.9 42.5 35.2 42.2 32.8 32.2
38.5 44.5 44.6 40.9 29.5 41.3 35.4
Figure 1.6 shows a dotplot of the data. Dots corresponding to some values close together
(e.g., 28.6 and 28.7) have been vertically stacked to prevent crowding. There is clearly a
great deal of state-to-state variability. The largest value, for D.C., is obviously an extreme
outlier, and four other values on the upper end of the data are candidates for mild outliers
(MA, MN, NY, and ND). There is also a cluster of states at the low end, primarily located
in the South and Southwest. The overall percentage for the entire country is 39.3%; this
is not a simple average of the 51 numbers but an average weighted by population sizes.
Figure 1.6 A dotplot of the data from Example 1.8 (horizontal scale from 25 to 70) ■
A dotplot can be quite cumbersome to construct and look crowded when the
number of observations is large. Our next technique is well suited to such situations.
Histograms
Some numerical data is obtained by counting to determine the value of a variable (the
number of traffic citations a person received during the last year, the number of custom-
ers arriving for service during a particular period), whereas other data is obtained by
taking measurements (weight of an individual, reaction time to a particular stimulus).
The prescription for drawing a histogram is generally different for these two cases.
DEFINITION
A numerical variable is discrete if its set of possible values either is finite or
else can be listed in an infinite sequence (one in which there is a first number,
a second number, and so on). A numerical variable is continuous if its possible
values consist of an entire interval on the number line.
A discrete variable x almost always results from counting, in which case pos-
sible values are 0, 1, 2, 3, … or some subset of these integers. Continuous variables
arise from making measurements. For example, if x is the pH of a chemical sub-
stance, then in theory x could be any number between 0 and 14: 7.0, 7.03, 7.032, and
so on. Of course, in practice there are limitations on the degree of accuracy of any
measuring instrument, so we may not be able to determine pH, reaction time, height,
and concentration to an arbitrarily large number of decimal places. However, from
the point of view of creating mathematical models for distributions of data, it is help-
ful to imagine an entire continuum of possible values.
Consider data consisting of observations on a discrete variable x. The frequency
of any particular x value is the number of times that value occurs in the data set. The
relative frequency of a value is the fraction or proportion of times the value occurs:
\[
\text{relative frequency of a value} = \frac{\text{number of times the value occurs}}{\text{number of observations in the data set}}
\]
Suppose, for example, that our data set consists of 200 observations on x = the number
of courses a college student is taking this term. If 70 of these x values are 3, then
frequency of the x value 3: 70
relative frequency of the x value 3: 70/200 = .35
Multiplying a relative frequency by 100 gives a percentage; in the college-course
example, 35% of the students in the sample are taking three courses. The relative
frequencies, or percentages, are usually of more interest than the frequencies them-
selves. In theory, the relative frequencies should sum to 1, but in practice the sum
may differ slightly from 1 because of rounding. A frequency distribution is a tabu-
lation of the frequencies and/or relative frequencies.
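A frequency distribution of this kind is easy to tabulate by machine. The following Python sketch is illustrative only: the function name `frequency_distribution` is hypothetical, and the course counts are invented except for the one fact taken from the text, that 70 of the 200 students are taking three courses.

```python
from collections import Counter

def frequency_distribution(data):
    """Map each distinct value to its (frequency, relative frequency)."""
    counts = Counter(data)
    n = len(data)
    return {value: (count, count / n) for value, count in sorted(counts.items())}

# Hypothetical sample of 200 students; only "70 of the x values are 3" is from the text
courses = [1] * 10 + [2] * 30 + [3] * 70 + [4] * 60 + [5] * 25 + [6] * 5
dist = frequency_distribution(courses)

print("x  freq  rel. freq")
for value, (freq, rel) in dist.items():
    print(f"{value}  {freq:4d}  {rel:.3f}")
```

Because the relative frequencies here are computed exactly rather than rounded, they sum to 1 up to floating-point error; rounding each one before summing is what can push a published total slightly away from 1.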
Constructing a Histogram for Discrete Data
First, determine the frequency and relative frequency of each x value. Then mark
possible x values on a horizontal scale. Above each value, draw a rectangle whose
height is the relative frequency (or alternatively, the frequency) of that value; the
rectangles should have equal widths.
This construction ensures that the area of each rectangle is proportional to the relative
frequency of the value. Thus if the relative frequencies of x = 1 and x = 5 are
.35 and .07, respectively, then the area of the rectangle above 1 is five times the area
of the rectangle above 5.
Example 1.9
How unusual is a no-hitter or a one-hitter in a major league baseball game, and how
frequently does a team get more than 10, 15, or even 20 hits? Table 1.1 is a frequency
distribution for the number of hits per team per game for all nine-inning games that
were played between 1989 and 1993.
Table 1.1 Frequency Distribution for Hits in Nine-Inning Games
             Number     Relative                  Number     Relative
Hits/Game    of Games   Frequency    Hits/Game    of Games   Frequency
    0            20       .0010         14           569       .0294
    1            72       .0037         15           393       .0203
    2           209       .0108         16           253       .0131
    3           527       .0272         17           171       .0088
    4          1048       .0541         18            97       .0050
    5          1457       .0752         19            53       .0027
    6          1988       .1026         20            31       .0016
    7          2256       .1164         21            19       .0010
    8          2403       .1240         22            13       .0007
    9          2256       .1164         23             5       .0003
   10          1967       .1015         24             1       .0001
   11          1509       .0779         25             0       .0000
   12          1230       .0635         26             1       .0001
   13           834       .0430         27             1       .0001
                                                   19,383      1.0005
The corresponding histogram in Figure 1.7 rises rather smoothly to a single peak and
then declines. The histogram extends a bit more on the right (toward large values)
than it does on the left—a slight “positive skew.”
Either from the tabulated information or from the histogram itself, we can determine
the following:
\[
\begin{aligned}
\text{proportion of games with at most two hits}
  &= (\text{relative frequency for } x = 0) + (\text{relative frequency for } x = 1) + (\text{relative frequency for } x = 2) \\
  &= .0010 + .0037 + .0108 = .0155
\end{aligned}
\]
Similarly,
\[
\text{proportion of games with between 5 and 10 hits (inclusive)} = .0752 + .1026 + \cdots + .1015 = .6361
\]
That is, roughly 64% of all these games resulted in between 5 and 10 (inclusive)
hits. ■
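The two sums in Example 1.9 can be checked with a short Python sketch. The list below holds the rounded relative frequencies exactly as printed in Table 1.1, indexed by number of hits; the helper name `proportion_between` is hypothetical.

```python
# Relative frequencies from Table 1.1, indexed by number of hits (0 through 27)
rel_freq = [.0010, .0037, .0108, .0272, .0541, .0752, .1026, .1164, .1240, .1164,
            .1015, .0779, .0635, .0430, .0294, .0203, .0131, .0088, .0050, .0027,
            .0016, .0010, .0007, .0003, .0001, .0000, .0001, .0001]

def proportion_between(rel_freq, low, high):
    """Sum the relative frequencies for x = low, low + 1, ..., high (inclusive)."""
    return sum(rel_freq[low:high + 1])

at_most_two = proportion_between(rel_freq, 0, 2)
five_to_ten = proportion_between(rel_freq, 5, 10)

print(f"{at_most_two:.4f}")        # 0.0155
print(f"{five_to_ten:.4f}")        # 0.6361
print(f"{sum(rel_freq):.4f}")      # 1.0005, the rounding effect noted in the table
```

Working from the raw counts instead (12,327 of the 19,383 games had between 5 and 10 hits) gives about .6360, so the last digit of the tabulated sum reflects rounding in the individual relative frequencies.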
Figure 1.7 Histogram of number of hits per nine-inning game (horizontal axis: hits/game; vertical axis: relative frequency)
Constructing a histogram for continuous data (measurements) entails subdividing
the measurement axis into a suitable number of class intervals or classes, such that
each observation is contained in exactly one class. Suppose, for example, that we
have 50 observations on x = fuel efficiency of an automobile (mpg), the smallest of
which is 27.8 and the largest of which is 31.4. Then we could use the class boundaries
27.5, 28.0, 28.5, … , and 31.5 as shown here:
27.5 28.0 28.5 29.0 29.5 30.0 30.5 31.0 31.5
One potential difficulty is that occasionally an observation lies on a class boundary and
therefore does not fall in exactly one interval, for example, 29.0. One way to deal with
this problem is to use boundaries like 27.55, 28.05, …, 31.55. Adding a hundredths digit
to the class boundaries prevents observations from falling on the resulting boundaries.
Another approach is to use the classes 27.5–<28.0, 28.0–<28.5, …, 31.0–<31.5.
Then 29.0 falls in the class 29.0–<29.5 rather than in the class 28.5–<29.0. In
other words, with this convention, an observation on a boundary is placed in the interval
to the right of the boundary. This is how Minitab constructs a histogram.
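The "boundary goes to the right" convention amounts to treating each class as closed on the left and open on the right. A minimal Python sketch, assuming the fuel-efficiency boundaries above and a hypothetical helper named `class_index`:

```python
from bisect import bisect_right

def class_index(x, boundaries):
    """Index i of the class [boundaries[i], boundaries[i + 1]) that contains x."""
    if not boundaries[0] <= x < boundaries[-1]:
        raise ValueError(f"{x} is outside the class boundaries")
    # bisect_right counts the boundaries that are <= x, so a value equal to a
    # boundary is assigned to the class that starts at that boundary
    return bisect_right(boundaries, x) - 1

boundaries = [27.5 + 0.5 * k for k in range(9)]   # 27.5, 28.0, ..., 31.5

i = class_index(29.0, boundaries)
print(boundaries[i], boundaries[i + 1])   # 29.0 29.5
```

Using `bisect_left` instead would send a boundary value to the class on its left, which is the opposite convention; the choice only needs to be applied consistently.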
Constructing a Histogram for Continuous Data: Equal Class Widths
Determine the frequency and relative frequency for each class. Mark the
class boundaries on a horizontal measurement axis. Above each class interval,
draw a rectangle whose height is the corresponding relative frequency
(or frequency).
Example 1.10
Power companies need information about customer usage to obtain accurate forecasts
of demands. Investigators from Wisconsin Power and Light determined energy
consumption (BTUs) during a particular period for a sample of 90 gas-heated
homes. An adjusted consumption value was calculated as follows:
\[
\text{adjusted consumption} = \frac{\text{consumption}}{(\text{weather, in degree days})(\text{house area})}
\]
This resulted in the accompanying data (part of the stored data set FURNACE.MTW
available in Minitab), which we have ordered from smallest to largest.
2.97 4.00 5.20 5.56 5.94 5.98 6.35 6.62 6.72 6.78
6.80 6.85 6.94 7.15 7.16 7.23 7.29 7.62 7.62 7.69
7.73 7.87 7.93 8.00 8.26 8.29 8.37 8.47 8.54 8.58
8.61 8.67 8.69 8.81 9.07 9.27 9.37 9.43 9.52 9.58
9.60 9.76 9.82 9.83 9.83 9.84 9.96 10.04 10.21 10.28
10.28 10.30 10.35 10.36 10.40 10.49 10.50 10.64 10.95 11.09
11.12 11.21 11.29 11.43 11.62 11.70 11.70 12.16 12.19 12.28
12.31 12.62 12.69 12.71 12.91 12.92 13.11 13.38 13.42 13.43
13.47 13.60 13.96 14.24 14.35 15.12 15.24 16.06 16.90 18.26
The most striking feature of the histogram in Figure 1.8 is its resemblance to a
bell-shaped curve, with the point of symmetry roughly at 10.
Class                1–<3   3–<5   5–<7   7–<9   9–<11  11–<13  13–<15  15–<17  17–<19
Frequency              1      1     11     21     25      17       9       4       1
Relative frequency   .011   .011   .122   .233   .278    .189    .100    .044    .011
Figure 1.8 Histogram of the energy consumption data from Example 1.10 (horizontal axis: BTU, 1 to 19; vertical axis: percent, 0 to 30)
From the histogram,
\[
\text{proportion of observations less than 9} \approx .01 + .01 + .12 + .23 = .37 \quad \left(\text{exact value} = \tfrac{34}{90} = .378\right)
\]
The relative frequency for the 9–<11 class is about .27, so we estimate that roughly
half of this, or .135, is between 9 and 10. Thus
\[
\text{proportion of observations less than 10} \approx .37 + .135 = .505 \quad (\text{slightly more than 50\%})
\]
The exact value of this proportion is 47/90 = .522. ■
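The class frequencies and the two exact proportions quoted in Example 1.10 can be reproduced from the 90 listed observations. The Python sketch below is one way to do it; the helper name `class_frequencies` is hypothetical, and it assumes equal-width classes that are closed on the left, starting at 1 with width 2.

```python
consumption = [
    2.97, 4.00, 5.20, 5.56, 5.94, 5.98, 6.35, 6.62, 6.72, 6.78,
    6.80, 6.85, 6.94, 7.15, 7.16, 7.23, 7.29, 7.62, 7.62, 7.69,
    7.73, 7.87, 7.93, 8.00, 8.26, 8.29, 8.37, 8.47, 8.54, 8.58,
    8.61, 8.67, 8.69, 8.81, 9.07, 9.27, 9.37, 9.43, 9.52, 9.58,
    9.60, 9.76, 9.82, 9.83, 9.83, 9.84, 9.96, 10.04, 10.21, 10.28,
    10.28, 10.30, 10.35, 10.36, 10.40, 10.49, 10.50, 10.64, 10.95, 11.09,
    11.12, 11.21, 11.29, 11.43, 11.62, 11.70, 11.70, 12.16, 12.19, 12.28,
    12.31, 12.62, 12.69, 12.71, 12.91, 12.92, 13.11, 13.38, 13.42, 13.43,
    13.47, 13.60, 13.96, 14.24, 14.35, 15.12, 15.24, 16.06, 16.90, 18.26,
]

def class_frequencies(data, start, width, n_classes):
    """Count observations in the classes [start + k*width, start + (k+1)*width)."""
    freq = [0] * n_classes
    for x in data:
        k = int((x - start) // width)
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} falls outside the classes")
        freq[k] += 1
    return freq

n = len(consumption)
freq = class_frequencies(consumption, start=1, width=2, n_classes=9)
rel_freq = [f / n for f in freq]

# Exact proportions, for comparison with the estimates read off the histogram
below_9 = sum(x < 9 for x in consumption) / n
below_10 = sum(x < 10 for x in consumption) / n

print(freq)                                   # [1, 1, 11, 21, 25, 17, 9, 4, 1]
print([round(r, 3) for r in rel_freq])
print(round(below_9, 3), round(below_10, 3))  # 0.378 0.522
```

The estimate for "less than 10" relies on assuming observations are spread evenly within the 9–<11 class, which is why it differs slightly from the exact count.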
There are no hard-and-fast rules concerning either the number of classes or the
choice of classes themselves. Between 5 and 20 classes will be satisfactory for most
data sets. Generally, the larger the number of observations in a data set, the more
classes should be used. A reasonable rule of thumb is
\[
\text{number of classes} \approx \sqrt{\text{number of observations}}
\]
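As a quick illustration of the rule of thumb, the sketch below rounds the square root to the nearest whole number. The function name `suggested_class_count` is hypothetical, and rounding to the nearest integer is an assumption made here; the rule itself is only approximate.

```python
import math

def suggested_class_count(n_observations):
    """Rule of thumb: the number of classes is roughly sqrt(number of observations)."""
    return round(math.sqrt(n_observations))

print(suggested_class_count(90))    # 9
print(suggested_class_count(253))   # 16
```

For the 90 observations of Example 1.10 the rule suggests 9 classes, which is the number used there.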
Equal-width classes may not be a sensible choice if there are some regions of the
measurement scale that have a high concentration of data values and other parts
where data is quite sparse. Figure 1.9 shows a dotplot of such a data set; there is
high concentration in the middle, and relatively few observations stretched out to
either side. Using a small number of equal-width classes results in almost all obser-
vations falling in just one or two of the classes. If a large number of equal-width
classes are used, many classes will have zero frequency. A sound choice is to use a
few wider intervals near extreme observations and narrower intervals in the region
of high concentration.
Figure 1.9 Selecting class intervals for “varying density” data: (a) many short equal-width
intervals; (b) a few wide equal-width intervals; (c) unequal-width intervals
Constructing a Histogram for Continuous Data: Unequal Class Widths
After determining frequencies and relative frequencies, calculate the height of
each rectangle using the formula
\[
\text{rectangle height} = \frac{\text{relative frequency of the class}}{\text{class width}}
\]
The resulting rectangle heights are usually called densities, and the vertical
scale is the density scale. This prescription will also work when class widths
are equal.
Example 1.11
Corrosion of reinforcing steel is a serious problem in concrete structures located
in environments affected by severe weather conditions. For this reason, researchers
have been investigating the use of reinforcing bars made of composite material.
One study was carried out to develop guidelines for bonding glass-fiber-reinforced
plastic rebars to concrete (“Design Recommendations for Bond of GFRP Rebars
to Concrete,” J. of Structural Engr., 1996: 247–254). Consider the following 48
observations on measured bond strength:
11.5 12.1 9.9 9.3 7.8 6.2 6.6 7.0 13.4 17.1 9.3 5.6
5.7 5.4 5.2 5.1 4.9 10.7 15.2 8.5 4.2 4.0 3.9 3.8
3.6 3.4 20.6 25.5 13.8 12.6 13.1 8.9 8.2 10.7 14.2 7.6
5.2 5.5 5.1 5.0 5.2 4.8 4.1 3.8 3.7 3.6 3.6 3.6
Class                2–<4    4–<6    6–<8    8–<12   12–<20   20–<30
Frequency              9      15       5       9        8        2
Relative frequency   .1875   .3125   .1042   .1875    .1667    .0417
Density              .094    .156    .052    .047     .021     .004
The resulting histogram appears in Figure 1.10. The right or upper tail stretches
out much farther than does the left or lower tail, a substantial departure from
symmetry.
Figure 1.10 A Minitab density histogram for the bond strength data of Example 1.11 (horizontal axis: bond strength, with class boundaries 2, 4, 6, 8, 12, 20, 30; vertical axis: density, 0.00 to 0.15) ■
When class widths are unequal, not using a density scale will give a picture
with distorted areas. For equal class widths, the divisor is the same in each
density calculation, and the extra arithmetic simply results in a rescaling of the
vertical axis (i.e., the histogram using relative frequency and the one using density
will have exactly the same appearance). A density histogram does have one
interesting property. Multiplying both sides of the formula for density by the class
width gives
\[
\text{relative frequency} = (\text{class width})(\text{density}) = (\text{rectangle width})(\text{rectangle height}) = \text{rectangle area}
\]
That is, the area of each rectangle is the relative frequency of the corresponding
class. Furthermore, since the sum of relative frequencies should be 1, the total area
of all rectangles in a density histogram is 1. It is always possible to draw a histogram
so that the area equals the relative frequency (this is true also for a histogram of dis-
crete data)—just use the density scale. This property will play an important role in
motivating models for distributions in Chapter 4.
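The density calculation and the area property can be checked numerically. The Python sketch below uses the class boundaries and frequencies of Example 1.11; the variable names are illustrative, not taken from any particular package.

```python
# Class boundaries and frequencies from Example 1.11 (bond strength data)
boundaries = [2, 4, 6, 8, 12, 20, 30]
frequencies = [9, 15, 5, 9, 8, 2]
n = sum(frequencies)

rows = []
for lo, hi, f in zip(boundaries, boundaries[1:], frequencies):
    rel = f / n                  # relative frequency of the class
    density = rel / (hi - lo)    # rectangle height = relative frequency / class width
    rows.append((lo, hi, rel, density))

for lo, hi, rel, density in rows:
    print(f"{lo:2d} to <{hi:2d}   rel. freq = {rel:.4f}   density = {density:.3f}")

# Each rectangle's area is width * height, so the areas are the relative frequencies
total_area = sum((hi - lo) * density for lo, hi, _, density in rows)
print(f"total area = {total_area:.4f}")
```

The printed densities agree with the Density row of the example's table, and the total area comes out to 1 up to floating-point error, regardless of how unequal the class widths are.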
Histogram Shapes
Histograms come in a variety of shapes. A unimodal histogram is one that rises to
a single peak and then declines. A bimodal histogram has two different peaks.
Bimodality can occur when the data set consists of observations on two quite
different kinds of individuals or objects. For example, consider a large data set
consisting of driving times for automobiles traveling between San Luis Obispo,
California, and Monterey, California (exclusive of stopping time for sightseeing,
eating, etc.). This histogram would show two peaks: one for those cars that took the
inland route (roughly 2.5 hours) and another for those cars traveling up the coast
(3.5–4 hours). However, bimodality does not automatically follow in such situa-
tions. Only if the two separate histograms are “far apart” relative to their spreads
will bimodality occur in the histogram of combined data. Thus a large data set
consisting of heights of college students should not result in a bimodal histogram
because the typical male height of about 69 inches is not far enough above the typi-
cal female height of about 64–65 inches. A histogram with more than two peaks
is said to be multimodal. Of course, the number of peaks may well depend on the
choice of class intervals, particularly with a small number of observations. The
larger the number of classes, the more likely it is that bimodality or multimodality
will manifest itself.
Example 1.12
Figure 1.11(a) shows a Minitab histogram of the weights (lb) of the 124 players
listed on the rosters of the San Francisco 49ers and the New England Patriots
(teams the author would like to see meet in the Super Bowl) as of Nov. 20, 2009.
Figure 1.11(b) is a smoothed histogram (actually what is called a density estimate)
of the data from the R software package. Both the histogram and the smoothed
histogram show three distinct peaks; the one on the right is for linemen, the middle
peak corresponds to linebacker weights, and the peak on the left is for all other
players (wide receivers, quarterbacks, etc.).
Figure 1.11 NFL player weights: (a) histogram (horizontal axis: weight, 180 to 340; vertical axis: percent, 0 to 14); (b) smoothed histogram (horizontal axis: player weight, 150 to 350; vertical axis: density estimate, 0.000 to 0.012) ■
A histogram is symmetric if the left half is a mirror image of the right half. A
unimodal histogram is positively skewed if the right or upper tail is stretched out
compared with the left or lower tail and negatively skewed if the stretching is to
the left. Figure 1.12 shows “smoothed” histograms, obtained by superimposing a
smooth curve on the rectangles, that illustrate the various possibilities.
Figure 1.12 Smoothed histograms: (a) symmetric unimodal; (b) bimodal; (c) positively skewed;
and (d) negatively skewed
Qualitative Data
Both a frequency distribution and a histogram can be constructed when the data
set is qualitative (categorical) in nature. In some cases, there will be a natural
ordering of classes—for example, freshmen, sophomores, juniors, seniors, graduate
students—whereas in other cases the order will be arbitrary—for example, Catholic,
Jewish, Protestant, and the like. With such categorical data, the intervals above which
rectangles are constructed should have equal width.
Example 1.13
The Public Policy Institute of California carried out a telephone survey of 2501
California adult residents during April 2006 to ascertain how they felt about various
aspects of K–12 public education. One question asked was “Overall, how would you
rate the quality of public schools in your neighborhood today?” Table 1.2 displays
the frequencies and relative frequencies, and Figure 1.13 shows the corresponding
histogram (bar chart).
Table 1.2 Frequency Distribution for the School Rating Data
Rating        Frequency   Relative Frequency
A                478            .191
B                893            .357
C                680            .272
D                178            .071
F                100            .040
Don’t know       172            .069
                2501           1.000
Figure 1.13 Histogram of the school rating data from Minitab (chart of relative frequency vs. rating; horizontal axis: rating, A through Don’t know; vertical axis: relative frequency, 0.0 to 0.4)
More than half the respondents gave an A or B rating, and only slightly more than
10% gave a D or F rating. The percentages for parents of public school children were
somewhat more favorable to schools: 24%, 40%, 24%, 6%, 4%, and 2%. ■
Multivariate Data
Multivariate data is generally rather difficult to describe visually. Several methods for
doing so appear later in the book, notably scatterplots for bivariate numerical data.
EXERCISES Section 1.2 (10–32)
10. Consider the strength data for beams given in Example
1.2.
a. Construct a stem-and-leaf display of the data.
What appears to be a representative strength
value? Do the observations appear to be highly
concentrated about the representative value or
rather spread out?
b. Does the display appear to be reasonably symmetric
about a representative value, or would you describe
its shape in some other way?
c. Do there appear to be any outlying strength values?
d. What proportion of strength observations in this
sample exceed 10 MPa?
11. The accompanying specific gravity values for various
wood types used in construction appeared in the article
“Bolted Connection Design Values Based on European
Yield Model” (J. of Structural Engr., 1993: 2169–2186):
.31 .35 .36 .36 .37 .38 .40 .40 .40
.41 .41 .42 .42 .42 .42 .42 .43 .44
.45 .46 .46 .47 .48 .48 .48 .51 .54
.54 .55 .58 .62 .66 .66 .67 .68 .75
Other documents randomly have
different content
Before we leave you we wish to make one request. That is: If you
follow our advice and fit up your box as we have directed, you will
add nothing else thereto. No, nothing whatever—not even to fill up
an awkward corner.
VARIETIES.
All Will Come Right.
Things can never go badly wrong
If the heart be true and the love be strong,
For the mist, if it comes, and the weeping rain
Will be changed by the love into sunshine again.—
George MacDonald.
In Praise of Work.—Work drives away depression, whets the appetite
for food, invites sleep, promotes digestion, strengthens the muscles
and sinews, gives free circulation to the blood, stimulates the
intellectual faculties, provides the comforts of life, develops all the
powers which it brings into exercise, transforms stupid ignorance
into brilliant genius, fills the world with works of art and literature,
and develops the resources of nature. Nothing can stand before
work.
Well Named.
Bainbridge: “I know why they are called fugitive poems.”
Goldsborough: “Why?”
Bainbridge: “Because the author had to run for his life.”
Be Wise To-day.—“Dear young friends,” says a popular preacher of
the present day, “begin right. You will never find it so easy to make
any decisive step as just now. You will get less and less flexible as
you grow older. You will get set in your ways. Habits will twine their
tendrils round you and hinder your free movement. The truths of the
Gospel will become commonplace by familiarity. Associations and
companions will get more and more powerful, and you will stiffen as
a tree trunk becomes stiffened with the growth of years. Be wise to-
day.”
Printers’ Errors.
In the early half of the present century it was announced in a
London newspaper that “Sir Robert Peel, with a party of fiends, was
shooting peasants in Ireland,” whereas the Minister and his friends
were only indulging in the comparatively harmless amusement of
pheasant-shooting.
Shortly after the battle of Inkerman one of the morning papers
informed its readers that “after a desperate struggle the enemy was
repulsed with great laughter.” The omission of a single letter has
rarely perhaps played more havoc with a subject which was certainly
no laughing matter.
No more fault-finding.—The business of fault-finding would soon come
to an end if every fault-finder could be only introduced to herself.
“THE GUIDING LIGHT.”
(From the painting by Marcella Walker.)
“THE SONG THE RAINDROPS SING.”
By AUGUSTA BRYERS.
I stood within my garden,
While fell the gentle rain
On thirsty leaf and blossom,
To bid them smile again.
But my heart mourn’d sore the sunbeams
And all the joys they bring,
Till I heard, like sweet bells chiming,
The song the raindrops sing.
They told of fading flow’rets,
Of little leaves that die,
While the lark soars on unceasing,
And the sunlight floods the sky.
And they said, “We are not teardrops,
But tend’rest show’rs of love,
To cheer the heart that’s weary,
And lift sad eyes above.”
And I thought, as fell the raindrops
On fainting buds and leaves,
So may sweet words and hopeful
Wake for the heart that grieves.
That to the world’s great garden
Each may his harvest bring,
Thro’ the sunshine or the storm-cloud,
Or the song the raindrops sing.
THE HOUSE WITH THE VERANDAH.
By ISABELLA FYVIE MAYO, Author of “Other
People’s Stairs,” “Her Object in Life,” etc.
CHAPTER III.
AS A LITTLE CHILD.
The day after the episode with Pollie Captain Grant’s letter duly
arrived.
He was only too delighted to think that his suggestion had borne
fruit with his old friend.
“It’s just settled that the Slains Castle will leave Peterhead on the
twenty-eighth of this month,” he wrote. “That will give you plenty of
time. But as we don’t touch anywhere in Great Britain, you will have
to join me here. Don’t take the long railway journey. Like a wise
man, come as far as Aberdeen in the steamer, and then you have
not much further to travel. As for the hundred pounds, I tell you, my
dear fellow, that we don’t intend to be away longer than one year,
and that is the precise figure I should name. But I’ll go on to add
that if we should happen to be a little longer, you shall not be
charged a shilling more. Persuade Mrs. Challoner to come north with
you and to bring the boy, and then she’ll see you fairly aboard, and
will note what snug quarters you’ll have, and be able to see you with
her mind’s eye all the time you are away. My wife hopes she will
come.”
“You see there is no doubt that you are to go, Charlie,” said Mrs.
Challoner. “Everything has worked to that end without one hitch. You
are to go, because you are to come back strong and well. It is
clearly the will of God that you go. I am so glad that my plans have
been carried out beyond my own power. If it had all been my
planning, I might have doubted afterwards.”
“But, Lucy,” said the young husband, his pleasant frank face shining
with the mysterious light which often illumines the countenances
which have just been bravely turned to confront the darkness of the
Valley of the Shadow of Death, “I own, with you, that it does seem
to be God’s will, but we must not think so now, unless we can
continue to think so whatever be the result. Let us say together
solemnly, ‘God’s will be done: not our will, but His.’”
And there was a little silence.
Lucy resolutely refused to consider the Grants’ invitation to
Peterhead. She warmly seconded their suggestion that Charlie
should travel in the snug, well-attended saloon of the magnificent
coasting steamer rather than in the train. She refused to listen to his
plea, that unless she would come with him this mode of travel would
part them a day or two sooner than if he went by rail. She even
nerved herself to say that when a matter of a year’s separation was
in hand, what could a day more or less signify? It would be best that
they should part in their own home, where life could go straight on,
and she could set to work at once. It would be dreadful to come
back to the house alone. (“Pollie would have a welcome for you,”
interposed young Challoner.) Besides, who knew whether she and
little Hugh might not prove to be the very worst of sailors, and then
Charlie would go off quite unhappy, thinking of the misery of their
return journey.
In all these arguments Lucy knew there was force and good sense,
but she knew, too, that but for the secret knowledge that her whole
household life was crumbling about her, they would not for one
moment have sufficed to withhold her from clinging to her husband’s
presence till the latest possible moment.
“Now, Pollie,” she said to her servant, “I am able to tell you why I
wish silence about your departure. It is finally arranged that my
husband is going for a long sea-voyage. He will be away for about a
year. When we first began to think of this it was a great
consideration that I and the boy should have you—our household
friend of seven years’ standing—to be with us. That thought was a
great comfort. Now as we find this cannot be, I think we may save
Mr. Challoner the distress of knowing about it before he goes. It
might make him wish to postpone his going. And he ought to be off
before the winter.”
Pollie was sniffing.
“I wish I’d known. I was huffed a bit thinking I wasn’t to be much
consideration in any changes you was making.”
“But what made you think of changes?” asked Mrs. Challoner. “This
is no change. All will go on the same, though with your master away
for a year, and then we shall hope that all will go on the same when
he returns. It is only since this very morning, Pollie, that we have
been quite sure this voyage can be arranged. You are the very first
person I have told. We think those in a household, whose interests
are naturally bound up together, should be the first to hear such
news.”
“Didn’t you mention it to Mrs. Brand, m’m?” asked Pollie.
“Certainly not,” Lucy answered. “There was no use mentioning it to
anybody when it might have come to nothing.”
Lucy rather wondered at this question. It did not occur to her that
Mrs. Brand had had the girl to herself on the afternoon when Lucy
had gone to see Dr. Ivery, nor that it could have been she who had
put the idea of “changes” into the head of her sister’s servant. And
Pollie kept the secret, as servants often do, or it would go hard with
many a gadding and gossipping “lady.”
“You may trust me not to speak, m’m,” said poor Pollie, still sniffing.
But she put such emphasis on “me” that it reminded her mistress
that Mrs. Brand also knew of Pollie’s imminent departure and would
be almost sure to blurt it out before Charlie. Lucy hated to ask her
sister to keep a secret from him. It would be as useless, too, as
painful, for Florence would be sure to “forget” or “not to think.” As
such “forgetfulness” or “want of thought” always pleads “meaning
no harm,” and resents indignation as harsh and uncharitable, there
is nothing to be done but to prevent their harmful doings by keeping
them out of the way.
And while Lucy was pondering what steps she could take in this
direction, she got warning that Charlie might get an impression of
trouble of some sort and question the facts out of poor Pollie
however loyally she might try to keep them back. For Pollie seemed
unable to keep from “sniffing,” and when she went into the drawing-
room the sight of Mr. Challoner brought on a very bad attack.
“Why, what’s the matter, Pollie?” asked he, quite innocently. “No bad
news from home, I hope?”
“Oh, no, sir! Yes—oh, it’s a-thinking of you a-going away, sir!” said
Pollie desperately, and rushed from the room.
“I had no idea that Pollie cared a bit for me,” observed her master. “I
thought she regarded me as a wage-paying machine, and that you
and the boy were the idols. It is quite flattering to find that I came
in for a bit of the adoration.”
“Of course Pollie cares for you after living in the house with us for
seven years, and you always so considerate and polite,” said Lucy.
“Considerate and polite!” echoed Charlie. “Well, I do hope I’m not
quite a brute in my own home, and I don’t know how many other
fellows I’ve rebuked for calling their landladies’ servants ‘the slavey.’
And they’ve often said to me, ‘Well, but it’s true,’ and I say, ‘Then it
oughtn’t to be true.’”
“Of course it should not be true,” Lucy responded.
“It is quite touching to think Pollie cares for my going away,” Charlie
went on. “But I tell you, Lucy, it occurs to me that it is not my going
that has grieved her, but the thought of your being left alone.” He
paused for a moment. “She thinks you’ll be so dull,” he said, fearing
lest his words might have brought to Lucy’s mind the idea they had
wakened in his own—to wit, that probably Pollie regarded this
temporary separation as likely to be for the earthly forever. “Well, I
can only say again,” he went on, “that my greatest comfort is that
she is with you. What a blessing we have not changed our servant
perpetually as the Brands do! How could I go off and leave you with
an utter stranger, who might desert you the next week?”
“We never know what changes may come,” said Lucy, to whom
silence began to seem criminal. “But we must trust God to provide
for emergencies. They never are so bad as they look beforehand.”
“That is quite true,” answered Charlie, “and that’s just how I feel—a
special trial has come to us, and a special blessing is prepared for it
in the shape of Pollie.”
Lucy could endure no more. She jumped up and went out of the
room so hurriedly that Charlie thought she must have heard a ring at
the door-bell. She really went to little Hugh’s bedroom, and sat down
in the darkness beside the cot where he was already asleep. She
began to revolve schemes. She would get Charlie to go with her and
the boy to spend the interval before his departure at the seaside.
That would take him away in safety from Florence’s chatter and
Pollie’s tears. It had other substantial recommendations, too, such as
she could urge. It was highly desirable that before his great journey
Charlie should shake off the little ways and weaknesses of invalidism
as a “change” helps a convalescent to do. Then she would add what
she knew would be a supreme argument with him—that her
teaching duties at the Institute would begin at the Christmas quarter,
and that she ought not to take up these labours when below par in
nerve and health after her anxious nursing. She would plead, too,
the charm of the little family of three being together quite by
themselves in a strange place, where they would be safe from any
calls or condolences or curiosity, and could wander about or rest,
just at their own sweet will. Of course, this trip would cost a little
money, but not very much, and apart from all its other charms, Lucy
felt that it would soothe her own heart in the pain of having been
forced to refuse to accompany her husband to his port of departure.
“You are a funny little woman,” said her husband, when she went
downstairs again and made these suggestions. “What else will you
think out so cleverly? I shall like this of all things; and all the while I
am away, it will be so much cheerier to have last thoughts of each
other taking quiet holiday by the sea, than of each of us mewed up
in a sick-room, coddling and being coddled.”
“And I’ll be able to do two or three sketches,” Lucy went on. “I
should like to do them with you looking on, to know if you think my
hand has lost any of its cunning. It will get me up to the mark, too. I
daresay I may do something that will more than pay for our trip.”
“Never say that women are not practical!” laughed Mr. Challoner.
“While I am only thinking of sentiment, the wifie has gone on to the
shillings! But ah, Lucy dear, don’t think I don’t know that you want
the shillings only for the sake of the sentiment!”
They sat together hand in hand. They had been married seven
years, and they were on the eve of separation. Both hearts were full
of feelings to which they dared not give utterance. One must not stir
a brimming cup lest it overflow.
“I vote we go to Deal!” cried Charlie at last.
“Isn’t it rather an east windy place for an invalid?” asked Lucy.
“But I’m not an invalid, and am not going to pass as one,” he said
gaily. “I’m a fellow starting on a sea-voyage! No, no, Lucy, don’t
doom me to some sheltered cubby hole of a ‘resort,’ where half the
population are in bath-chairs and the other half in respirators. It
would give us the blues! If you’ll let us go to Deal, I’ll promise to be
very good,” he went on with his indomitable boyishness. “I’ll only go
out when you say I may, and I’ll come in the minute you say I must.
Only let us go there!”
In the depths of his heart lay the secret thought that to go to any
place where consumptives are wont to congregate, would inevitably
fill Lucy’s mind with dire forebodings, besides exposing her to the
depressing influences of the conventional “sympathy” or forced
“hopefulness” which emanate from well-meaning landladies and
others trained by experience to regard their habitat as one of the
last stages on life’s journey.
All the next day Lucy hurriedly made her little preparations for the
trip. She said to Charlie that, if Deal suited him, and if they got snug
apartments, they might stay on till the very end, so that he need
only use their own house to rest and sleep in on his way through
London to the north.
“As for any sea-going things you want—lockers, waterproofs, and so
on—we can get them at Deal,” she said.
Only when all was in readiness for their start, while the cab which
was to take them to the station actually stood at the door, did she
post a letter to Mrs. Brand, giving the first intimation of their present
move and of Charlie’s future journey.
“We have had it in view for some time,” she wrote, for it was
impossible for Lucy Challoner to be inferentially untruthful, “but it
was only decided the day before yesterday.”
After the railway journey, whose slight fatigue the convalescent bore
capitally, they went straight to an hotel and had lunch, and there
Lucy left her husband and little Hugh, while she went in quest of
“apartments.” She wanted cleanliness, economy, and a sea view.
Like all people who know what they want she was not long in
getting suited. She decided on the second set of rooms at which she
looked, preferring them to the first, because being upstairs, they
commanded a wider horizon. Also she felt attracted to the second
landlady, a quiet, grave, middle-aged woman of few words, whose
chambers, with their well-kept old-fashioned furniture had—what is
the greatest charm of hired rooms—no suggestion of previous
temporary occupancy.
The landlady had everything made snug before their arrival; the
curtains were drawn, a cheerful fire was ablaze, and the lit lamp
stood in the centre of the table spread with pretty blue crockery and
provided with ham, eggs and toast. There are few who can wholly
resist the genial influence of such surroundings. Charlie and Lucy
Challoner yielded themselves up to them, and little Hugh danced and
clapped his hands. Lucy felt as if she was happier than she had ever
thought to be again. Safe from the impending worries of the last few
days, it seemed as if the great anxiety which hung like a Damocles’
sword over her life was for the time held off.
“I believe this is really doing you good, Lucy,” said her husband. “For
me, I feel a different man already.”
The bed-chamber opened from the parlour, and Hugh was not
allowed to be long in seeking the little cot which the landlady had
fixed up for him in his parents’ room. But while Lucy passed to and
fro unpacking and preparing for the night, Mr. Challoner and Hugh
got behind the window curtains and shut themselves away from the
cheery room and out with the misty sea view. Lucy could hear them
talking behind the drapery.
“There go the ships!” said the young father. “Look, Hugh, you can
see them by their lights! Look what a lot of them there are! And how
many lights they are showing!”
“How glad the sailors must be to see land again!” lisped Hugh. “They
must feel they are safe at last!”
“Glad to feel they are nearly home at last, Hugh,” corrected his
father. “For ships are in much more danger when they are near land
than when they are out in mid-ocean. What looks safest isn’t always
safe, my boy.”
“I’d like to go on a ship!” said Hugh.
“I daresay you will go in time, sonny,” returned Mr. Challoner. “By-
and-by, Hughie, I am going on a big ship—a big ship with three
masts—and I am going for a long, long voyage. And you’ll have to
take care of mamma while I am away. And then when I come back,
and you grow up, very likely you will go for some long voyage, and
then I will stay at home and take care of mamma.”
“Are you going to-morrow, papa?” said the little voice in an awed
whisper, and Lucy heard a movement as if the curly head snuggled
on papa’s shoulder. How good it was of Charlie to tell the child
himself! The thought of having to do so had haunted her, for she
measured her little lad’s love for his father by what she knew it
meant in his life rather than by that childish inadequacy for profound
emotion which makes a child such a poignantly pathetic figure when
it appears on any tragic scene.
“To-morrow!” echoed papa in his brightest tone. “No, indeed, not for
many days—two or three weeks! We are going to have such a happy
time. We’ll go out and pick up shells, and if there is a very warm
sunshiny day with only little waves on the sea, maybe we’ll go out in
a boat—that’s if mamma will come with us,” he added, remembering
his promised obedience to her discretion.
Hugh broke away from his father and ran back into the room.
“Oh, mamma,” he cried, “you will let us go out in a boat, won’t you?
If the day’s sunshiny—an’ it’s sure to be—and if the waves are ever
such little teeny weenies! Oh, mamma, yes!”
“We shall see, Hugh. We will do what seems wise. It is time you
went to bed.”
In commanding her voice to be steady, it sounded sharp and hard. It
checked Hugh’s ecstasy, and brought his father out from behind the
curtain. She felt that Charlie’s expression was surprised, and that she
would break down utterly if she had to meet his eyes. Without
looking back, she caught the hand of the silenced and awed Hugh,
and hurried him away to the other room.
Neither of them spoke while she helped the child undress. Even her
eyes did not answer his, though she saw his blue orbs raised
wistfully. He knelt down and said his little prayer, the “Our Father,”
and the little verse of godly nursery tradition—
“Now I lay me down to sleep,
I pray the Lord my soul to keep.
If I die before I wake,
I pray the Lord my soul to take.”
Then he passed on to the extempore petitions, in which he was
always allowed free expression. To-night they came with unwonted
faltering and hesitation. The child-soul was aware of a disturbed
atmosphere around it—of groping somehow in darkness uncertain of
itself.
“Please, God, bless dear papa—and dear mamma—and dear Pollie—
and dear Aunt Florence—and—and dear Mr. Brand—and dear Mrs.
May—and make me a good boy. Amen.”
“Dear Mrs. May?” Who was that? Lucy had to pause for a moment
ere she remembered that this was the name of the landlady, whom
the child had seen for the first time two hours before, and who had
won his heart by bringing in for him a special tea plate painted with
a picture of Walmer Castle!
It was only as Hugh stood in his little night-gown, half stepping into
the cot, that he said, almost with a whimper—
“I never kissed good night to papa.”
“Then run away and kiss him now,” said Lucy in her natural tone.
Hugh was himself again in a second, scampering away, kicking aside
his flowing white robes with his little pink feet, and bestowing upon
his father what was evidently an ecstasy of hugs, accompanied by a
perfect storm of hearty “smacks.” Then he gallopaded back, hopped
into bed, held out his arms to his mother, and clasped her down to
himself in a rapturous embrace, to which she responded with an
added tenderness born of a little remorse for the foolish pang he
had given her.
“But you will let us go in the boat?” he whispered before he released
her.
She kissed him again as her only answer, and went back into the
sitting-room. Her husband looked up at her with some solicitude,
and drew up a chair for her at his side.
“I’m afraid you have been very much overwrought, Lucy,” he said.
“It’s no use saying ‘No.’ I can hear it in your voice. When you went
out of the room, I thought you were actually crying. I was quite
uneasy till I saw you come in again all right.”
“I’m afraid I’m not fit to be your nurse if I frighten my patient,” she
said, forcing a smile. “I was very silly. I was not crying. I’m rather
afraid I was cross for a minute.”
“Cross?” questioned her husband incredulously.
“Yes,” she answered. “It vexed me to think how soon Hugh forgot
about your going away and thought only of getting a sail in a boat.”
Charlie pondered for a second, for the whole thing had escaped him.
“I know I was foolish,” she said. “Hughie is only a little child, and
cannot realise things. I’m sure he would have begun to cry if you’d
said you were going away to-morrow. But when you said ‘not for two
or three weeks’ he could put it right out of his head. It’s only
childlike, after all.”
“But we ought all to be childlike, ought we not, Lucy?” answered
Charlie thoughtfully. “And we are, more or less, even the worst of us.
All who love know they will have to part; but they don’t go on
thinking about it all the years they are together. And days are as
long as years to poor little Hughie.”
“And then he was so taken up about going in the boat!” said Lucy,
with a half-reluctant smile.
“Well, and why not?” asked Charlie undauntedly. “We ought to be
like that, too—taken up with what is our present business—this is a
great business for Hughie—and especially with what we may bring
about by our own efforts, as he felt he might by his coaxings. That is
our affair—not something that is going to happen some time or
other, without any help of ours.”
“I know I’m an idiot,” said Lucy humbly; “but so much seemed to
come at once! He actually prayed in the same breath for you and for
Mrs. May—the strange landlady downstairs,” she explained. And she
reminded him of the little incident of the picture plate.
Charlie leaned back in his chair and enjoyed a quiet deep laugh.
“And there he is altogether right,” he said, “for love is all off the
same piece whether it’s in a great fold that ties two lives together—
like yours and mine, Lucy—or but some little scrap that just binds up
a pricked finger. For God is Love, and therefore Love is God, and any
affection that gets taken outside that unity is—just—an idol!”
“‘And the idols He shall utterly abolish,’” he added after a moment’s
pause. As he spoke, he drew back the curtain. The moon was up,
silvering the mist that hung low over the Channel. They sat side by
side in silence. Lucy was trying to gather from her husband’s words
some cheer for the one trial she could not feel it right to confide to
him—the only secret she had ever withheld from him. After a fashion
of which most of us have had pathetic experience, she strove to get
an oracle at a venture.
“It comforts me so to talk to you,” she said. “You smooth things out.
Worries will come, and jarrings. What shall I do when you are not
here to say good words to me?”
“You will say them to yourself,” he answered. “You will hear them in
your own heart. Sometimes, indeed, it seems to me as if I merely
hear your thoughts and put them into words for you.”
(To be continued.)
OUR PUZZLE REPORT: A WELL-BRED GIRL (No. 2).
SOLUTION.
A Well-Bred Girl (No. 2).
1. A well-bred girl always makes herself pleasant to those about her,
especially to the lonely and unhappy.
2. A well-bred girl always dresses without extravagance, and yet
avoids severity.
3. A well-bred girl always eats and drinks noiselessly, not even
excepting soup.
4. A well-bred girl always refrains from discussing articles of diet
during meals.
5. A well-bred girl always talks quietly.
6. A well-bred girl always upholds her own dignity without apparent
effort.
7. A well-bred girl always remembers that striking manners are bad
manners.
56, Paternoster Row, Dec., 1898.
Prize Winners.
Six Shillings and Eightpence Each.
Eliza Acworth, Blenheim Mount, Bradford.
Dora Mary Barling, Farnham, Surrey.
M. A. C. Crabb, Hemel Hempstead, Herts.
Louie Drury, Edith Road, West Kensington.
Alfred J. Knight, Edith Road, West Kensington.
F. Miller, Brecknock Road, N.
Mrs. Nicholls, Parlors Hall, Bridgnorth.
Lilla Patterson, Kilmore, Holywood.
Gertrude M. Stott, Yarburgh Street, Alexandra Park, Manchester.
Norah M. Sullivan, Otranto Place, Sandycove, Co. Dublin.
Emily M. Tattam, Green Lanes, N.
Bettie Temple, Brockley, S.E.
Ethel Tomlinson, Burton-on-Trent.
Ada G. Waide, Methley, Leeds.
Jeanie Walker, Esk Terrace, Whitby.
Eleanor Whitcher, The Drive, West Brighton.
Very Highly Commended.
Ethel B. Angear, Lily Belling, E. Blunt, Margaret E. Bourne, Annie J.
Cather, M. J. Champneys, Maggie Coombes, Minnie Cornwell,
Leonard Duncan, Mrs. Fleming, A. and F. Fooks, Miss Fryer, Margaret
S. Hall, L. A. E. Hartshorn, Helen Jones, Alice M. Kellett, B. M.
Linington, E. E. Lockyear, Annie Manderson, Rev. H. Milnes, S. H.
Phillips, Robina Potts, Mrs. Prestige, Ada Rickards, John Rodway,
Emma M. Sanderson, Helen Simpson, Mildred M. Skrine, Helen
Smith, Annie Stanser, Ellen C. Tarrant, Agnes M. Vincent, Anna
Walker, N. H. White, Emily M. P. Wood, Elizabeth Yarwood, Edith M.
Younge, Helen B. Younger, Euphemia T. Yule.
Highly Commended.
Mrs. Allen, Margaret M. Anthony, Emily Bergin, Alice Mary Blake, Ada
K. Bullough, Martha Cairns, Robert H. Carmichael, Mrs. F. Chettle,
Dora Clarke, Alice M. Cooper, Lillian Clews, Mrs. Crossman, E. M.
Dickson, Mrs. F. Farrar, Florence Graves, Marie E. Hancock, Ellie
Hanlon, Lizzie J. Hetherington, Edith L. Howse, M. A. J. Hunter, E.
Marian Jupe, Annie G. Luck, Alice Luckhurst, Jennie M. McCall, Ethel
C. McMaster, Helen A. Manning, Geo. H. Manning, S. Mason, Mrs. A.
Motum, E. K. Palmer, Hilda Petley, Hannah E. Powell, Ellen M. Price,
Helen J. Ransom, H. F. Richards, Henzell G. Robson, Chas. Severs, A.
A. L. Shave, Agnes A. M. Shearer, Merriott T. Smiley, Gertrude
Stirling, M. Stuart, Theodore J. Tasker, Constance Taylor, Marie
Threlfall, Violet C. Todd, Queenie Tyssen, Mary Watts, Alice
Woodhead, H. F. Yeoman.
EXAMINERS’ REPORT.
The thirst for information continues, and again we have been
inundated with solutions. Happily for our peace of mind, many of
them were far from perfect, and the more subtle methods of our Art
have not been called into requisition. In short, the prize solutions
were perfect; the rest were not.
There were some points about the puzzle which deserved rather
more attention than the casual solver was inclined to bestow, and it
seems to be necessary to refer to them in detail. First let us deal
with the supposed mistakes. One solver, with admirable conciseness,
thus calls attention to them—
“Three mistakes, line 1 an h too many; line 2 an h too few; line 6 an
s too many.”
Now, taking lines 1 and 2 together we find h × 2 ÷ erself + pleasant
tot. That worked out (on somewhat doubtful mathematical lines)
yields “h|erself pleasant to t|h”; then we find ose about h, and all is
as straightforward as possible. So much for “mistakes” one and two.
And the third “mistake” is no worse, for the correct reading of the
part referred to is not “es minus ss” but “seven es minus ss,” which
introduces the word “even” into the solution. The number of expert
solvers who failed at this point was quite astonishing.
In very many solutions the word around was substituted for about in
the first sentence. It could hardly be regarded as a bad mistake; but,
inasmuch as the letters o s e were on only three sides of the h,
about had to be regarded as the more perfect reading.
The next point to be noticed is the omission in many solutions of the
word always in the second admonition. But the more careful solvers
noticed that in this instance five stars were employed, and rightly
divined the meaning of the fifth. The fact that the word always
appeared in every other sentence ought to have opened the eyes of
those who fell into the trap.
The “noise less ly” in the third admonition also gave much trouble,
and various quaint solutions were suggested, as, for instance,
“leisurely” and “sparingly.”
These two solutions, it should be noted, were generally associated
with the particular kind of soup indicated in the puzzle:—
“A well-bred girl always eats and drinks sparingly (or, leisurely) not
even excepting mock-turtle soup.”
It would not have occurred to us that active greediness in the
presence of that particular delicacy was sufficiently usual to call for
rebuke. Most unhappily, the fault we do indicate is quite as common
as it is unpleasant.
The die in the fourth admonition was generally identified; but a few
solvers could make nothing of it excepting weight. A weight naturally
suggests a balance, and accordingly we learn that “A well-bred girl
always refrains from balancing articles of weight during meals.” We
believe she does, though experiments of a kindred nature with the
lighter articles at hand are not wholly unknown to us. They seldom
prove much, excepting the clumsiness of the experimenter, and they
do not conduce to that repose which is the essence of refinement.
One would have thought that the fifth sentence was simplicity itself;
but many solvers wrote “speaks” instead of “talks.” It must have
required a violent effort of imagination to convert the depicted stalks
into “sspeaks”!
In the solutions of the last sentence “rough” continually appeared
instead of “striking.” It could not be regarded as a satisfactory
interpretation of the picture which is obviously T striking M. One
most interesting reading of the sentence deserves to be recorded—
“A well-bred girl always remembers that boys’ manners are bad
manners”!
Truly there are many girls whose conduct towards their brothers
seems to be based upon some such theory as this. They are not the
most pleasing type of maidens, and as to our opinion of their “good
breeding,” let us add an eighth admonition—
“A well-bred girl never nags.”
SHEILA.
A STORY FOR GIRLS.
By EVELYN EVERETT-GREEN, Author of
“Greyfriars,” “Half-a-dozen Sisters,” etc.
CHAPTER II.
UNCLE TOM.
He came from the darkness without into the warmth
and brightness of the hall, and threw back his heavy
Inverness cape, revealing a square, bearded face, a
broad, well-knit figure, and a pair of shrewd and not
unkindly brown eyes.
“You are our Uncle Tom,” said Oscar, going forward to
meet him. “We are very glad to see you. It is kind of
you to come.”
“Well, well, boy, duty is duty all the world over. I
would have come a fortnight ago, but it was
impossible. No disrespect meant to your father, you
understand. So you are poor Maud’s children, are you? We always
called her ‘poor Maud’ at home, though I scarcely know why. She
was happy enough, I know, but she seemed like one dead to us
somehow. You are a bit like what she was as a girl, I can see.
Perhaps the sister favours her more,” and he looked across at Sheila,
who now came forward with outstretched hand.
“How do you do, Uncle Tom? I hope you are not very cold. It has
been quite warm till yesterday, and then the cold came back. We are

Probability and Statistics for Engineering and the Sciences 9th Edition Devore J.L.

  • 4. NINTH EDITION
    Probability and Statistics for Engineering and the Sciences
    Jay Devore
    California Polytechnic State University, San Luis Obispo
    Australia, Brazil, Mexico, Singapore, United Kingdom, United States
  • 5. © 2016, 2012, 2009, Cengage Learning ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher. Unless otherwise noted, all items © Cengage Learning Library of Congress Control Number: 2014946237 ISBN: 978-1-305-25180-9 Cengage Learning 20 Channel Center Street Boston, MA 02210 USA Cengage Learning is a leading provider of customized learning solutions with office locations around the globe, including Singapore, the United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at www.cengage.com/global. Cengage Learning products are represented in Canada by Nelson Education, Ltd. To learn more about Cengage Learning Solutions, visit www.cengage.com. Purchase any of our products at your local college store or at our preferred online store www.cengagebrain.com. Probability and Statistics for Engineering and the Sciences, Ninth Edition Jay L. 
Devore Senior Product Team Manager: Richard Stratton Senior Product Manager: Molly Taylor Senior Content Developer: Jay Campbell Product Assistant: Spencer Arritt Media Developer: Andrew Coppola Marketing Manager: Julie Schuster Content Project Manager: Cathy Brooks Art Director: Linda May Manufacturing Planner: Sandee Milewski IP Analyst: Christina Ciaramella IP Project Manager: Farah Fard Production Service and Compositor: MPS Limited Text and Cover Designer: C Miller Design For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706 For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions Further permissions questions can be emailed to permissionrequest@cengage.com Printed in the United States of America Print Number: 01 Print Year: 2014 WCN: 02-200-203
  • 6. To my beloved grandsons Philip and Elliot, who are highly statistically significant.
  • 8. Contents
    1 Overview and Descriptive Statistics: Introduction; 1.1 Populations, Samples, and Processes; 1.2 Pictorial and Tabular Methods in Descriptive Statistics; 1.3 Measures of Location; 1.4 Measures of Variability; Supplementary Exercises; Bibliography
    2 Probability: Introduction; 2.1 Sample Spaces and Events; 2.2 Axioms, Interpretations, and Properties of Probability; 2.3 Counting Techniques; 2.4 Conditional Probability; 2.5 Independence; Supplementary Exercises; Bibliography
    3 Discrete Random Variables and Probability Distributions: Introduction; 3.1 Random Variables; 3.2 Probability Distributions for Discrete Random Variables; 3.3 Expected Values; 3.4 The Binomial Probability Distribution; 3.5 Hypergeometric and Negative Binomial Distributions; 3.6 The Poisson Probability Distribution; Supplementary Exercises; Bibliography
  • 9. 4 Continuous Random Variables and Probability Distributions: Introduction; 4.1 Probability Density Functions; 4.2 Cumulative Distribution Functions and Expected Values; 4.3 The Normal Distribution; 4.4 The Exponential and Gamma Distributions; 4.5 Other Continuous Distributions; 4.6 Probability Plots; Supplementary Exercises; Bibliography
    5 Joint Probability Distributions and Random Samples: Introduction; 5.1 Jointly Distributed Random Variables; 5.2 Expected Values, Covariance, and Correlation; 5.3 Statistics and Their Distributions; 5.4 The Distribution of the Sample Mean; 5.5 The Distribution of a Linear Combination; Supplementary Exercises; Bibliography
    6 Point Estimation: Introduction; 6.1 Some General Concepts of Point Estimation; 6.2 Methods of Point Estimation; Supplementary Exercises; Bibliography
    7 Statistical Intervals Based on a Single Sample: Introduction; 7.1 Basic Properties of Confidence Intervals; 7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion
  • 10. 7.3 Intervals Based on a Normal Population Distribution; 7.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population; Supplementary Exercises; Bibliography
    8 Tests of Hypotheses Based on a Single Sample: Introduction; 8.1 Hypotheses and Test Procedures; 8.2 z Tests for Hypotheses about a Population Mean; 8.3 The One-Sample t Test; 8.4 Tests Concerning a Population Proportion; 8.5 Further Aspects of Hypothesis Testing; Supplementary Exercises; Bibliography
    9 Inferences Based on Two Samples: Introduction; 9.1 z Tests and Confidence Intervals for a Difference Between Two Population Means; 9.2 The Two-Sample t Test and Confidence Interval; 9.3 Analysis of Paired Data; 9.4 Inferences Concerning a Difference Between Population Proportions; 9.5 Inferences Concerning Two Population Variances; Supplementary Exercises; Bibliography
    10 The Analysis of Variance: Introduction; 10.1 Single-Factor ANOVA; 10.2 Multiple Comparisons in ANOVA; 10.3 More on Single-Factor ANOVA; Supplementary Exercises; Bibliography
  • 11. 11 Multifactor Analysis of Variance: Introduction; 11.1 Two-Factor ANOVA with $K_{ij} = 1$; 11.2 Two-Factor ANOVA with $K_{ij} > 1$; 11.3 Three-Factor ANOVA; 11.4 $2^p$ Factorial Experiments; Supplementary Exercises; Bibliography
    12 Simple Linear Regression and Correlation: Introduction; 12.1 The Simple Linear Regression Model; 12.2 Estimating Model Parameters; 12.3 Inferences About the Slope Parameter $\beta_1$; 12.4 Inferences Concerning $\mu_{Y \cdot x^*}$ and the Prediction of Future Y Values; 12.5 Correlation; Supplementary Exercises; Bibliography
    13 Nonlinear and Multiple Regression: Introduction; 13.1 Assessing Model Adequacy; 13.2 Regression with Transformed Variables; 13.3 Polynomial Regression; 13.4 Multiple Regression Analysis; 13.5 Other Issues in Multiple Regression; Supplementary Exercises; Bibliography
    14 Goodness-of-Fit Tests and Categorical Data Analysis: Introduction; 14.1 Goodness-of-Fit Tests When Category Probabilities Are Completely Specified; 14.2 Goodness-of-Fit Tests for Composite Hypotheses; 14.3 Two-Way Contingency Tables
  • 12. Supplementary Exercises; Bibliography
    15 Distribution-Free Procedures: Introduction; 15.1 The Wilcoxon Signed-Rank Test; 15.2 The Wilcoxon Rank-Sum Test; 15.3 Distribution-Free Confidence Intervals; 15.4 Distribution-Free ANOVA; Supplementary Exercises; Bibliography
    16 Quality Control Methods: Introduction; 16.1 General Comments on Control Charts; 16.2 Control Charts for Process Location; 16.3 Control Charts for Process Variation; 16.4 Control Charts for Attributes; 16.5 CUSUM Procedures; 16.6 Acceptance Sampling; Supplementary Exercises; Bibliography
    Appendix Tables: A.1 Cumulative Binomial Probabilities; A.2 Cumulative Poisson Probabilities; A.3 Standard Normal Curve Areas; A.4 The Incomplete Gamma Function; A.5 Critical Values for t Distributions; A.6 Tolerance Critical Values for Normal Population Distributions; A.7 Critical Values for Chi-Squared Distributions; A.8 t Curve Tail Areas; A.9 Critical Values for F Distributions; A.10 Critical Values for Studentized Range Distributions; A.11 Chi-Squared Curve Tail Areas; A.12 Approximate Critical Values for the Ryan-Joiner Test of Normality; A.13 Critical Values for the Wilcoxon Signed-Rank Test
  • 13. A.14 Critical Values for the Wilcoxon Rank-Sum Test; A.15 Critical Values for the Wilcoxon Signed-Rank Interval; A.16 Critical Values for the Wilcoxon Rank-Sum Interval; A.17 $\beta$ Curves for t Tests
    Answers to Selected Odd-Numbered Exercises
    Glossary of Symbols/Abbreviations
    Index
  • 14. Preface
    Purpose
    The use of probability models and statistical methods for analyzing data has become common practice in virtually all scientific disciplines. This book attempts to provide a comprehensive introduction to those models and methods most likely to be encountered and used by students in their careers in engineering and the natural sciences. Although the examples and exercises have been designed with scientists and engineers in mind, most of the methods covered are basic to statistical analyses in many other disciplines, so that students of business and the social sciences will also profit from reading the book.
    Approach
    Students in a statistics course designed to serve other majors may be initially skeptical of the value and relevance of the subject matter, but my experience is that students can be turned on to statistics by the use of good examples and exercises that blend their everyday experiences with their scientific interests. Consequently, I have worked hard to find examples of real, rather than artificial, data: data that someone thought was worth collecting and analyzing. Many of the methods presented, especially in the later chapters on statistical inference, are illustrated by analyzing data taken from published sources, and many of the exercises also involve working with such data. Sometimes the reader may be unfamiliar with the context of a particular problem (as indeed I often was), but I have found that students are more attracted by real problems with a somewhat strange context than by patently artificial problems in a familiar setting.
    Mathematical Level
    The exposition is relatively modest in terms of mathematical development. Substantial use of the calculus is made only in Chapter 4 and parts of Chapters 5 and 6. In particular, with the exception of an occasional remark or aside, calculus appears in the inference part of the book only in the second section of Chapter 6. Matrix algebra is not used at all. Thus almost all the exposition should be accessible to those whose mathematical background includes one semester or two quarters of differential and integral calculus.
    Content
    Chapter 1 begins with some basic concepts and terminology (population, sample, descriptive and inferential statistics, enumerative versus analytic studies, and so on) and continues with a survey of important graphical and numerical descriptive methods. A rather traditional development of probability is given in Chapter 2, followed by probability distributions of discrete and continuous random variables in Chapters 3 and 4, respectively. Joint distributions and their properties are discussed in the first part of Chapter 5. The latter part of this chapter introduces statistics and their sampling distributions, which form the bridge between probability and inference. The next three
  • 15. chapters cover point estimation, statistical intervals, and hypothesis testing based on a single sample. Methods of inference involving two independent samples and paired data are presented in Chapter 9. The analysis of variance is the subject of Chapters 10 and 11 (single-factor and multifactor, respectively). Regression makes its initial appearance in Chapter 12 (the simple linear regression model and correlation) and returns for an extensive encore in Chapter 13. The last three chapters develop chi-squared methods, distribution-free (nonparametric) procedures, and techniques from statistical quality control.
    Helping Students Learn
    Although the book’s mathematical level should give most science and engineering students little difficulty, working toward an understanding of the concepts and gaining an appreciation for the logical development of the methodology may sometimes require substantial effort. To help students gain such an understanding and appreciation, I have provided numerous exercises ranging in difficulty from many that involve routine application of text material to some that ask the reader to extend concepts discussed in the text to somewhat new situations. There are many more exercises than most instructors would want to assign during any particular course, but I recommend that students be required to work a substantial number of them. In a problem-solving discipline, active involvement of this sort is the surest way to identify and close the gaps in understanding that inevitably arise. Answers to most odd-numbered exercises appear in the answer section at the back of the text. In addition, a Student Solutions Manual, consisting of worked-out solutions to virtually all the odd-numbered exercises, is available.
    To access additional course materials and companion resources, please visit www.cengagebrain.com. At the CengageBrain.com home page, search for the ISBN of your title (from the back cover of your book) using the search box at the top of the page. This will take you to the product page where free companion resources can be found.
    New for This Edition
      • The major change for this edition is the elimination of the rejection region approach to hypothesis testing. Conclusions from a hypothesis-testing analysis are now based entirely on P-values. This has necessitated completely rewriting Section 8.1, which now introduces hypotheses and then test procedures based on P-values. Substantial revision of the remaining sections of Chapter 8 was then required, and this in turn has been propagated through the hypothesis-testing sections and subsections of Chapters 9–15.
      • Many new examples and exercises, almost all based on real data or actual problems. Some of these scenarios are less technical or broader in scope than what has been included in previous editions; for example, investigating the nocebo effect (the inclination of those told about a drug’s side effects to experience them), comparing sodium contents of cereals produced by three different manufacturers, predicting patient height from an easy-to-measure anatomical characteristic, modeling the relationship between an adolescent mother’s age and the birth weight of her baby, assessing the effect of smokers’ short-term abstinence on the accurate perception of elapsed time, and exploring the impact of phrasing in a quantitative literacy test.
      • More examples and exercises in the probability material (Chapters 2–5) are based on information from published sources.
  • 16. The exposition has been polished whenever possible to help students gain a better intuitive understanding of various concepts.
    Acknowledgments
    My colleagues at Cal Poly have provided me with invaluable support and feedback over the years. I am also grateful to the many users of previous editions who have made suggestions for improvement (and on occasion identified errors). A special note of thanks goes to Jimmy Doi for his accuracy checking and to Matt Carlton for his work on the two solutions manuals, one for instructors and the other for students.
    The generous feedback provided by the following reviewers of this and previous editions has been of great benefit in improving the book: Robert L. Armacost, University of Central Florida; Bill Bade, Lincoln Land Community College; Douglas M. Bates, University of Wisconsin–Madison; Michael Berry, West Virginia Wesleyan College; Brian Bowman, Auburn University; Linda Boyle, University of Iowa; Ralph Bravaco, Stonehill College; Linfield C. Brown, Tufts University; Karen M. Bursic, University of Pittsburgh; Lynne Butler, Haverford College; Troy Butler, Colorado State University; Barrett Caldwell, Purdue University; Kyle Caudle, South Dakota School of Mines & Technology; Raj S. Chhikara, University of Houston–Clear Lake; Edwin Chong, Colorado State University; David Clark, California State Polytechnic University at Pomona; Ken Constantine, Taylor University; Bradford Crain, Portland State University; David M. Cresap, University of Portland; Savas Dayanik, Princeton University; Don E. Deal, University of Houston; Annjanette M. Dodd, Humboldt State University; Jimmy Doi, California Polytechnic State University–San Luis Obispo; Charles E. Donaghey, University of Houston; Patrick J. Driscoll, U.S. Military Academy; Mark Duva, University of Virginia; Nassir Eltinay, Lincoln Land Community College; Thomas English, College of the Mainland; Nasser S. Fard, Northeastern University; Ronald Fricker, Naval Postgraduate School; Steven T. Garren, James Madison University; Mark Gebert, University of Kentucky; Harland Glaz, University of Maryland; Ken Grace, Anoka-Ramsey Community College; Celso Grebogi, University of Maryland; Veronica Webster Griffis, Michigan Technological University; Jose Guardiola, Texas A&M University–Corpus Christi; K. L. D. Gunawardena, University of Wisconsin–Oshkosh; James J. Halavin, Rochester Institute of Technology; James Hartman, Marymount University; Tyler Haynes, Saginaw Valley State University; Jennifer Hoeting, Colorado State University; Wei-Min Huang, Lehigh University; Aridaman Jain, New Jersey Institute of Technology; Roger W. Johnson, South Dakota School of Mines & Technology; Chihwa Kao, Syracuse University; Saleem A. Kassam, University of Pennsylvania; Mohammad T. Khasawneh, State University of New York–Binghamton; Kyungduk Ko, Boise State University; Stephen Kokoska, Colgate University; Hillel J. Kumin, University of Oklahoma; Sarah Lam, Binghamton University; M. Louise Lawson, Kennesaw State University; Jialiang Li, University of Wisconsin–Madison; Wooi K. Lim, William Paterson University; Aquila Lipscomb, The Citadel; Manuel Lladser, University of Colorado at Boulder; Graham Lord, University of California–Los Angeles; Joseph L. Macaluso, DeSales University; Ranjan Maitra, Iowa State University; David Mathiason, Rochester Institute of Technology; Arnold R. Miller, University of Denver; John J. Millson, University of Maryland; Pamela Kay Miltenberger, West Virginia Wesleyan College; Monica Molsee, Portland State University; Thomas Moore, Naval Postgraduate School; Robert M. Norton, College of Charleston; Steven Pilnick, Naval Postgraduate School; Robi Polikar, Rowan University; Justin Post, North Carolina State University; Ernest Pyle, Houston Baptist University;
  • 17. Xianggui Qu, Oakland University; Kingsley Reeves, University of South Florida; Steve Rein, California Polytechnic State University–San Luis Obispo; Tony Richardson, University of Evansville; Don Ridgeway, North Carolina State University; Larry J. Ringer, Texas A&M University; Nabin Sapkota, University of Central Florida; Robert M. Schumacher, Cedarville University; Ron Schwartz, Florida Atlantic University; Kevan Shafizadeh, California State University–Sacramento; Mohammed Shayib, Prairie View A&M; Alice E. Smith, Auburn University; James MacGregor Smith, University of Massachusetts; Paul J. Smith, University of Maryland; Richard M. Soland, The George Washington University; Clifford Spiegelman, Texas A&M University; Jery Stedinger, Cornell University; David Steinberg, Tel Aviv University; William Thistleton, State University of New York Institute of Technology; J A Stephen Viggiano, Rochester Institute of Technology; G. Geoffrey Vining, University of Florida; Bhutan Wadhwa, Cleveland State University; Gary Wasserman, Wayne State University; Elaine Wenderholm, State University of New York–Oswego; Samuel P. Wilcock, Messiah College; Michael G. Zabetakis, University of Pittsburgh; and Maria Zack, Point Loma Nazarene University.
    Preeti Longia Sinha of MPS Limited has done a terrific job of supervising the book’s production. Once again I am compelled to express my gratitude to all those people at Cengage who have made important contributions over the course of my textbook writing career. For this most recent edition, special thanks go to Jay Campbell (for his timely and informed feedback throughout the project), Molly Taylor, Ryan Ahern, Spencer Arritt, Cathy Brooks, and Andrew Coppola. I also greatly appreciate the stellar work of all those Cengage Learning sales representatives who have labored to make my books more visible to the statistical community.
    Last but by no means least, a heartfelt thanks to my wife Carol for her decades of support, and to my daughters for providing inspiration through their own achievements.
    Jay Devore
  • 18. 1 Overview and Descriptive Statistics
    “I took statistics at business school, and it was a transformative experience. Analytical training gives you a skill set that differentiates you from most people in the labor market.”
        Laszlo Bock, Senior Vice President of People Operations (in charge of all hiring) at Google. April 20, 2014, The New York Times, interview with columnist Thomas Friedman
    “I am not much given to regret, so I puzzled over this one a while. Should have taken much more statistics in college, I think.”
        Max Levchin, PayPal co-founder, Slide founder. Quote of the week from the Web site of the American Statistical Association on November 23, 2010
    “I keep saying that the sexy job in the next 10 years will be statisticians, and I’m not kidding.”
        Hal Varian, Chief Economist at Google. August 6, 2009, The New York Times
    Introduction
    Statistical concepts and methods are not only useful but indeed often indispensable in understanding the world around us. They provide ways of gaining new insights into the behavior of many phenomena that you will encounter in your chosen field of specialization in engineering or science.
    The discipline of statistics teaches us how to make intelligent judgments and informed decisions in the presence of uncertainty and variation. Without uncertainty or variation, there would be little need for statistical methods or statisticians. If every component of a particular type had exactly the same lifetime, if all resistors produced by a certain manufacturer had the same resistance value,
  • 19. if pH determinations for soil specimens from a particular locale gave identical results, and so on, then a single observation would reveal all desired information.
    An interesting manifestation of variation appeared in connection with determining the “greenest” way to travel. The article “Carbon Conundrum” (Consumer Reports, 2008: 9) identified organizations that help consumers calculate carbon output. The following results on output for a flight from New York to Los Angeles were reported:
        Carbon Calculator                            CO2 (lb)
        Terra Pass                                       1924
        Conservation International                       3000
        Cool It                                          3049
        World Resources Institute/Safe Climate           3163
        National Wildlife Federation                     3465
        Sustainable Travel International                 3577
        Native Energy                                    3960
        Environmental Defense                            4000
        Carbonfund.org                                   4820
        The Climate Trust/CarbonCounter.org              5860
        Bonneville Environmental Foundation              6732
    There is clearly rather substantial disagreement among these calculators as to exactly how much carbon is emitted, characterized in the article as “from a ballerina’s to Bigfoot’s.” A website address was provided where readers could learn more about how the various calculators work.
    How can statistical techniques be used to gather information and draw conclusions? Suppose, for example, that a materials engineer has developed a coating for retarding corrosion in metal pipe under specified circumstances. If this coating is applied to different segments of pipe, variation in environmental conditions and in the segments themselves will result in more substantial corrosion on some segments than on others. Methods of statistical analysis could be used on data from such an experiment to decide whether the average amount of corrosion exceeds an upper specification limit of some sort or to predict how much corrosion will occur on a single piece of pipe.
    Alternatively, suppose the engineer has developed the coating in the belief that it will be superior to the currently used coating. A comparative experiment could be carried out to investigate this issue by applying the current coating to some segments of pipe and the new coating to other segments. This must be done with care lest the wrong conclusion emerge. For example, perhaps the average amount of corrosion is identical for the two coatings. However, the new coating may be applied to segments that have superior ability to resist corrosion and under less stressful environmental conditions compared to the segments and conditions for the current coating. The investigator would then likely observe a difference
  • 20. between the two coatings attributable not to the coatings themselves, but just to extraneous variation. Statistics offers not only methods for analyzing the results of experiments once they have been carried out but also suggestions for how experiments can be performed in an efficient manner to mitigate the effects of variation and have a better chance of producing correct conclusions.
    1.1 Populations, Samples, and Processes
    Engineers and scientists are constantly exposed to collections of facts, or data, both in their professional capacities and in everyday activities. The discipline of statistics provides methods for organizing and summarizing data and for drawing conclusions based on information contained in the data.
    An investigation will typically focus on a well-defined collection of objects constituting a population of interest. In one study, the population might consist of all gelatin capsules of a particular type produced during a specified period. Another investigation might involve the population consisting of all individuals who received a B.S. in engineering during the most recent academic year. When desired information is available for all objects in the population, we have what is called a census. Constraints on time, money, and other scarce resources usually make a census impractical or infeasible. Instead, a subset of the population, a sample, is selected in some prescribed manner. Thus we might obtain a sample of bearings from a particular production run as a basis for investigating whether bearings are conforming to manufacturing specifications, or we might select a sample of last year’s engineering graduates to obtain feedback about the quality of the engineering curricula.
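One way to make "selected in some prescribed manner" concrete is a simple random sample, in which every subset of the chosen size is equally likely. The Python sketch below illustrates that idea only; the population of 5000 numbered bearings, the sample size of 50, the seed, and the function name are hypothetical choices made for illustration, not values from the text, and simple random sampling is just one of several possible selection schemes.

```python
import random


def draw_simple_random_sample(population, sample_size, seed=None):
    """Return sample_size distinct objects chosen at random, without replacement."""
    rng = random.Random(seed)  # a seeded generator makes the draw reproducible
    return rng.sample(population, sample_size)


# Hypothetical population: serial numbers of bearings from one production run.
bearing_ids = list(range(1, 5001))

# Hypothetical sample size; a census would inspect all 5000 bearings instead.
sample_ids = draw_simple_random_sample(bearing_ids, 50, seed=1)
print(len(sample_ids), "bearings selected for inspection")
```

Because the sample is drawn without replacement, no bearing can be selected twice, which matches the idea of a sample as a subset of the population.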
    We are usually interested only in certain characteristics of the objects in a population: the number of flaws on the surface of each casing, the thickness of each capsule wall, the gender of an engineering graduate, the age at which the individual graduated, and so on. A characteristic may be categorical, such as gender or type of malfunction, or it may be numerical in nature. In the former case, the value of the characteristic is a category (e.g., female or insufficient solder), whereas in the latter case, the value is a number (e.g., age = 23 years or diameter = .502 cm). A variable is any characteristic whose value may change from one object to another in the population. We shall initially denote variables by lowercase letters from the end of our alphabet. Examples include
        x = brand of calculator owned by a student
        y = number of visits to a particular Web site during a specified period
        z = braking distance of an automobile under specified conditions
    Data results from making observations either on a single variable or simultaneously on two or more variables. A univariate data set consists of observations on a single variable. For example, we might determine the type of transmission, automatic (A) or manual (M), on each of ten automobiles recently purchased at a certain dealership, resulting in the categorical data set
        M A A A M A A M A A
    The following sample of pulse rates (beats per minute) for patients recently admitted to an adult intensive care unit is a numerical univariate data set:
        88 80 71 103 154 132 67 110 60 105
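The distinction between categorical and numerical univariate data shows up directly in how each kind is summarized. The Python sketch below uses the two data sets just given; the variable names and the particular summaries chosen (category counts for the transmissions; smallest value, largest value, mean, and median for the pulse rates) are illustrative assumptions rather than the text's own procedure.

```python
from collections import Counter
from statistics import mean, median

# Categorical univariate data: transmission type for ten automobiles.
transmissions = ["M", "A", "A", "A", "M", "A", "A", "M", "A", "A"]

# Numerical univariate data: pulse rates (beats per minute).
pulse_rates = [88, 80, 71, 103, 154, 132, 67, 110, 60, 105]

# A categorical variable is summarized by how often each category occurs.
transmission_counts = Counter(transmissions)
print("Transmission counts:", dict(transmission_counts))

# A numerical variable supports arithmetic summaries.
print("Smallest and largest pulse rate:", min(pulse_rates), max(pulse_rates))
print("Mean pulse rate:", mean(pulse_rates))
print("Median pulse rate:", median(pulse_rates))
```

Averaging the transmission codes would be meaningless, which is why the categorical data set is summarized by counts only.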
  • 21. 4 Chapter 1 Overview and Descriptive Statistics We have bivariate data when observations are made on each of two variables. Our data set might consist of a (height, weight) pair for each basketball player on a team, with the first observation as (72, 168), the second as (75, 212), and so on. If an engineer determines the value of both x 5 component lifetime and y 5 reason for component failure, the resulting data set is bivariate with one variable numeri- cal and the other categorical. Multivariate data arises when observations are made on more than one variable (so bivariate is a special case of multivariate). For exam- ple, a research physician might determine the systolic blood pressure, diastolic blood pressure, and serum cholesterol level for each patient participating in a study. Each observation would be a triple of numbers, such as (120, 80, 146). In many multivariate data sets, some variables are numerical and others are categorical. Thus the annual automobile issue of Consumer Reports gives values of such variables as type of vehicle (small, sporty, compact, mid-size, large), city fuel efficiency (mpg), highway fuel efficiency (mpg), drivetrain type (rear wheel, front wheel, four wheel), and so on. Branches of Statistics An investigator who has collected data may wish simply to summarize and describe important features of the data. This entails using methods from descriptive statistics. Some of these methods are graphical in nature; the construction of histograms, boxplots, and scatter plots are primary examples. Other descriptive methods involve calculation of numerical summary measures, such as means, standard deviations, and correlation coef- ficients. The wide availability of statistical computer software packages has made these tasks much easier to carry out than they used to be. Computers are much more efficient than human beings at calculation and the creation of pictures (once they have received appropriate instructions from the user!). 
This means that the investigator doesn’t have to expend much effort on “grunt work” and will have more time to study the data and extract important messages. Throughout this book, we will present output from various packages such as Minitab, SAS, JMP, and R. The R software can be downloaded without charge from the site http://www.r-project.org. It has achieved great popularity in the statistical community, and many books describing its various uses are available (it does entail programming as opposed to the pull-down menus of Minitab and JMP).

Example 1.1
Charity is a big business in the United States. The Web site charitynavigator.com gives information on roughly 6000 charitable organizations, and there are many smaller charities that fly below the navigator’s radar screen. Some charities operate very efficiently, with fundraising and administrative expenses that are only a small percentage of total expenses, whereas others spend a high percentage of what they take in on such activities. Here is data on fundraising expenses as a percentage of total expenditures for a random sample of 60 charities:

 6.1 12.6 34.7  1.6 18.8  2.2  3.0  2.2  5.6  3.8
 2.2  3.1  1.3  1.1 14.1  4.0 21.0  6.1  1.3 20.4
 7.5  3.9 10.1  8.1 19.5  5.2 12.0 15.8 10.4  5.2
 6.4 10.8 83.1  3.6  6.2  6.3 16.3 12.7  1.3  0.8
 8.8  5.1  3.7 26.3  6.0 48.0  8.2 11.7  7.2  3.9
15.3 16.6  8.8 12.0  4.7 14.7  6.4 17.0  2.5 16.2

Without any organization, it is difficult to get a sense of the data’s most prominent features: what a typical (i.e., representative) value might be, whether values are highly concentrated about a typical value or quite dispersed, whether there are any gaps in the data, what fraction of the values are less than 20%, and so on. Figure 1.1 shows what is called a stem-and-leaf display as well as a histogram. In Section 1.2 we will discuss construction and interpretation of these data summaries. For the moment, we hope you see how they begin to describe how the percentages are distributed over the range of possible values from 0 to 100. Clearly a substantial majority of the charities in the sample spend less than 20% on fundraising, and only a few percentages might be viewed as beyond the bounds of sensible practice. ■

Stem-and-leaf of FundRsng  N = 60
Leaf Unit = 1.0
  0  0111112222333333344
  0  55556666666778888
  1  0001222244
  1  55666789
  2  01
  2  6
  3  4
  3
  4
  4  8
  5
  5
  6
  6
  7
  7
  8  3
[Histogram: Frequency (0 to 40) versus FundRsng (0 to 90)]
Figure 1.1 A Minitab stem-and-leaf display (tenths digit truncated) and histogram for the charity fundraising percentage data

Having obtained a sample from a population, an investigator would frequently like to use sample information to draw some type of conclusion (make an inference of some sort) about the population. That is, the sample is a means to an end rather than an end in itself. Techniques for generalizing from a sample to a population are gathered within the branch of our discipline called inferential statistics.

Example 1.2
Material strength investigations provide a rich area of application for statistical methods. The article “Effects of Aggregates and Microfillers on the Flexural Properties of Concrete” (Magazine of Concrete Research, 1997: 81–98) reported on a study of strength properties of high-performance concrete obtained by using superplasticizers and certain binders. The compressive strength of such concrete had previously been investigated, but not much was known about flexural strength (a measure of ability to resist failure in bending). The accompanying data on flexural strength (in MegaPascal, MPa, where 1 Pa (Pascal) = 1.45 × 10⁻⁴ psi) appeared in the article cited:

 5.9  7.2  7.3  6.3  8.1  6.8  7.0  7.6  6.8
 6.5  7.0  6.3  7.9  9.0  8.2  8.7  7.8  9.7
 7.4  7.7  9.7  7.8  7.7 11.6 11.3 11.8 10.7

Suppose we want an estimate of the average value of flexural strength for all beams that could be made in this way (if we conceptualize a population of all such beams, we are trying to estimate the population mean). It can be shown that, with a high degree of confidence, the population mean strength is between 7.48 MPa and 8.80 MPa; we call this a confidence interval or interval estimate. Alternatively, this data could be used to predict the flexural strength of a single beam of this type. With a high degree of confidence, the strength of a single such beam will exceed 7.35 MPa; the number 7.35 is called a lower prediction bound. ■
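The interval estimate quoted in Example 1.2 can be checked numerically. The text does not say which method or confidence level produced it, so the sketch below makes an assumption: a two-sided 95% t interval for the mean, with the critical value 2.056 (26 degrees of freedom) taken from a standard t table. Under that assumption the endpoints agree with the quoted 7.48 and 8.80 to two decimals.

```python
from math import sqrt
from statistics import mean, stdev

# Flexural strength data (MPa) from Example 1.2.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8,
    6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7,
    7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

n = len(strength)
xbar = mean(strength)
s = stdev(strength)  # sample standard deviation (divisor n - 1)

# Assumption: 95% confidence, so the t critical value for n - 1 = 26
# degrees of freedom is read from a standard t table.
T_CRIT_95_DF26 = 2.056

half_width = T_CRIT_95_DF26 * s / sqrt(n)
ci_low, ci_high = xbar - half_width, xbar + half_width

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.3f}")
print(f"95% t interval for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The critical value is hard-coded to keep the sketch within the standard library; a statistics package would normally supply it. The t interval itself is developed later in the book, so this is only a preview of the calculation.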
The main focus of this book is on presenting and illustrating methods of inferential statistics that are useful in scientific work. The most important types of inferential procedures (point estimation, hypothesis testing, and estimation by confidence intervals) are introduced in Chapters 6–8 and then used in more complicated settings in Chapters 9–16. The remainder of this chapter presents methods from descriptive statistics that are most used in the development of inference.

Chapters 2–5 present material from the discipline of probability. This material ultimately forms a bridge between the descriptive and inferential techniques. Mastery of probability leads to a better understanding of how inferential procedures are developed and used, how statistical conclusions can be translated into everyday language and interpreted, and when and where pitfalls can occur in applying the methods.

Probability and statistics both deal with questions involving populations and samples, but do so in an “inverse manner” to one another. In a probability problem, properties of the population under study are assumed known (e.g., in a numerical population, some specified distribution of the population values may be assumed), and questions regarding a sample taken from the population are posed and answered. In a statistics problem, characteristics of a sample are available to the experimenter, and this information enables the experimenter to draw conclusions about the population. The relationship between the two disciplines can be summarized by saying that probability reasons from the population to the sample (deductive reasoning), whereas inferential statistics reasons from the sample to the population (inductive reasoning). This is illustrated in Figure 1.2.
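The two directions of reasoning can be sketched with a simple binomial calculation. The numbers used here (a sample of n = 100, an assumed population proportion p = 0.5, and an observed count of 65) are illustrative assumptions, and `binomial_tail` is a hypothetical helper written for this sketch. The binomial model itself is developed later in the book.

```python
from math import comb


def binomial_tail(n: int, p: float, k: int) -> float:
    """Return P(X >= k) when X counts successes in n independent trials,
    each with success probability p (a binomial model)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Probability direction: the population is assumed known (p = 0.5),
# and questions are asked about a sample of size n = 100.
n, p = 100, 0.5
expected_count = n * p                      # expected number of successes
prob_at_least_70 = binomial_tail(n, p, 70)  # chance of 70 or more successes

# Inference direction: the sample is known (say 65 successes out of 100).
# If p really were 0.5, how unusual would a count this large be?
observed = 65
prob_at_least_observed = binomial_tail(n, p, observed)

print(f"Expected count under p = {p}: {expected_count}")
print(f"P(X >= 70 | p = {p}) = {prob_at_least_70:.2e}")
print(f"P(X >= {observed} | p = {p}) = {prob_at_least_observed:.2e}")
```

A very small value for the last probability suggests that the assumed p = 0.5 sits poorly with the observed sample. That is the informal logic behind the hypothesis tests introduced in later chapters.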
Figure 1.2 The relationship between probability and inferential statistics [diagram: Population and Sample, linked by arrows labeled Probability and Inferential statistics]

Before we can understand what a particular sample can tell us about the population, we should first understand the uncertainty associated with taking a sample from a given population. This is why we study probability before statistics.

Example 1.3
As an example of the contrasting focus of probability and inferential statistics, consider drivers’ use of manual lap belts in cars equipped with automatic shoulder belt systems. (The article “Automobile Seat Belts: Usage Patterns in Automatic Belt Systems,” Human Factors, 1998: 126–135, summarizes usage data.) In probability, we might assume that 50% of all drivers of cars equipped in this way in a certain metropolitan area regularly use their lap belt (an assumption about the population), so we might ask, “How likely is it that a sample of 100 such drivers will include at least 70 who regularly use their lap belt?” or “How many of the drivers in a sample of size 100 can we expect to regularly use their lap belt?” On the other hand, in inferential statistics, we have sample information available; for example, a sample of 100 drivers of such cars revealed that 65 regularly use their lap belt. We might then ask, “Does this provide substantial evidence for concluding that more than 50% of all such drivers in this area regularly use their lap belt?” In this latter scenario, we are attempting to use sample information to answer a question about the structure of the entire population from which the sample was selected. ■

In the foregoing lap belt example, the population is well defined and concrete: all drivers of cars equipped in a certain way in a particular metropolitan area. In Example 1.2, however, the strength measurements came from a sample of prototype beams that had not been selected from an existing population. Instead, it is convenient to think of the population as consisting of all possible strength measurements that might be made under similar experimental conditions. Such a population is referred to as a conceptual or hypothetical population. There are a number of problem situations in which we fit questions into the framework of inferential statistics by conceptualizing a population.

The Scope of Modern Statistics

These days statistical methodology is employed by investigators in virtually all disciplines, including such areas as
● molecular biology (analysis of microarray data)
● ecology (describing quantitatively how individuals in various animal and plant populations are spatially distributed)
● materials engineering (studying properties of various treatments to retard corrosion)
● marketing (developing market surveys and strategies for marketing new products)
● public health (identifying sources of diseases and ways to treat them)
● civil engineering (assessing the effects of stress on structural elements and the impacts of traffic flows on communities)

As you progress through the book, you’ll encounter a wide spectrum of different scenarios in the examples and exercises that illustrate the application of techniques from probability and statistics. Many of these scenarios involve data or other material extracted from articles in engineering and science journals. The methods presented herein have become established and trusted tools in the arsenal of those who work with data. Meanwhile, statisticians continue to develop new models for describing randomness and uncertainty and new methodology for analyzing data.
As evidence of the continuing creative efforts in the statistical community, here are titles and capsule descriptions of some articles that have recently appeared in statistics journals (Journal of the American Statistical Association is abbreviated JASA, and AAS is short for the Annals of Applied Statistics, two of the many prominent journals in the discipline):
● “How Many People Do You Know? Efficiently Estimating Personal Network Size” (JASA, 2010: 59–70): How many of the N individuals at your college do you know? You could select a random sample of students from the population and use an estimate based on the fraction of people in this sample that you know. Unfortunately this is very inefficient for large populations because the fraction of the population someone knows is typically very small. A “latent mixing model” was proposed that the authors asserted remedied deficiencies in previously used techniques. A simulation study of the method’s effectiveness based on groups consisting of first names (“How many people named Michael do you know?”) was included as well as an application of the method to actual survey data. The article concluded with some practical guidelines for the construction of future surveys designed to estimate social network size.
● “Active Learning Through Sequential Design, with Applications to the Detection of Money Laundering” (JASA, 2009: 969–981): Money laundering involves concealing the origin of funds obtained through illegal activities. The huge number of transactions occurring daily at financial institutions makes detection of money laundering difficult. The standard approach has been to extract various summary quantities from the transaction history and conduct a time-consuming investigation of suspicious activities. The article proposes a more efficient statistical method and illustrates its use in a case study.
● “Robust Internal Benchmarking and False Discovery Rates for Detecting Racial Bias in Police Stops” (JASA, 2009: 661–668): Allegations of police actions that are attributable at least in part to racial bias have become a contentious issue in many communities. This article proposes a new method that is designed to reduce the risk of flagging a substantial number of “false positives” (individuals falsely identified as manifesting bias). The method was applied to data on 500,000 pedestrian stops in New York City in 2006; of the 3000 officers regularly involved in pedestrian stops, 15 were identified as having stopped a substantially greater fraction of Black and Hispanic people than what would be predicted were bias absent.
● “Records in Athletics Through Extreme Value Theory” (JASA, 2008: 1382–1391): The focus here is on the modeling of extremes related to world records in athletics. The authors start by posing two questions: (1) What is the ultimate world record within a specific event (e.g., the high jump for women)? and (2) How “good” is the current world record, and how does the quality of current world records compare across different events? A total of 28 events (8 running, 3 throwing, and 3 jumping for both men and women) are considered. For example, one conclusion is that only about 20 seconds can be shaved off the men’s marathon record, but that the current women’s marathon record is almost 5 minutes longer than what can ultimately be achieved. The methodology also has applications to such issues as ensuring airport runways are long enough and that dikes in Holland are high enough.
● “Self-Exciting Hurdle Models for Terrorist Activity” (AAS, 2012: 106–124): The authors developed a predictive model of terrorist activity by considering the daily number of terrorist attacks in Indonesia from 1994 through 2007. The model estimates the chance of future attacks as a function of the times since past attacks. One feature of the model considers the excess of nonattack days coupled with the presence of multiple coordinated attacks on the same day. The article provides an interpretation of various model characteristics and assesses its predictive performance.
● “Prediction of Remaining Life of Power Transformers Based on Left Truncated and Right Censored Lifetime Data” (AAS, 2009: 857–879): There are roughly 150,000 high-voltage power transmission transformers in the United States. Unexpected failures can cause substantial economic losses, so it is important to have predictions for remaining lifetimes. Relevant data can be complicated because lifetimes of some transformers extend over several decades during which records were not necessarily complete. In particular, the authors of the article use data from a certain energy company that began keeping careful records in 1980. But some transformers had been installed before January 1, 1980, and were still in service after that date (“left truncated” data), whereas other units were still in service at the time of the investigation, so their complete lifetimes are not available (“right censored” data). The article describes various procedures for obtaining an interval of plausible values (a prediction interval) for a remaining lifetime and for the cumulative number of failures over a specified time period.
● “The BARISTA: A Model for Bid Arrivals in Online Auctions” (AAS, 2007: 412–441): Online auctions such as those on eBay and uBid often have characteristics that differentiate them from traditional auctions. One particularly important difference is that the number of bidders at the outset of many traditional auctions is fixed, whereas in online auctions this number and the number of resulting bids are not predetermined. The article proposes a new BARISTA (for Bid ARrivals In STAges) model for describing the way in which bids arrive online. The model allows for higher bidding intensity at the outset of the auction and also as the auction comes to a close. Various properties of the model are investigated and then validated using data from eBay.com on auctions for Palm M515 personal assistants, Microsoft Xbox games, and Cartier watches.
● “Statistical Challenges in the Analysis of Cosmic Microwave Background Radiation” (AAS, 2009: 61–95): The cosmic microwave background (CMB) is a significant source of information about the early history of the universe. Its radiation level is uniform, so extremely delicate instruments have been developed to measure fluctuations. The authors provide a review of statistical issues with CMB data analysis; they also give many examples of the application of statistical procedures to data obtained from a recent NASA satellite mission, the Wilkinson Microwave Anisotropy Probe.

Statistical information now appears with increasing frequency in the popular media, and occasionally the spotlight is even turned on statisticians. For example, the Nov. 23, 2009, New York Times reported in an article “Behind Cancer Guidelines, Quest for Data” that the new science for cancer investigations and more sophisticated methods for data analysis spurred the U.S. Preventive Services Task Force to re-examine guidelines for how frequently middle-aged and older women should have mammograms. The panel commissioned six independent groups to do statistical modeling. The result was a new set of conclusions, including an assertion that mammograms every two years are nearly as beneficial to patients as annual mammograms, but confer only half the risk of harms. Donald Berry, a very prominent biostatistician, was quoted as saying he was pleasantly surprised that the task force took the new research to heart in making its recommendations. The task force’s report has generated much controversy among cancer organizations, politicians, and women themselves.
It is our hope that you will become increasingly convinced of the importance and relevance of the discipline of statistics as you dig more deeply into the book and the subject. Hopefully you’ll be turned on enough to want to continue your statistical education beyond your current course.

Enumerative Versus Analytic Studies

W. E. Deming, a very influential American statistician who was a moving force in Japan’s quality revolution during the 1950s and 1960s, introduced the distinction between enumerative studies and analytic studies. In the former, interest is focused on a finite, identifiable, unchanging collection of individuals or objects that make up a population. A sampling frame (that is, a listing of the individuals or objects to be sampled) is either available to an investigator or else can be constructed. For example, the frame might consist of all signatures on a petition to qualify a certain initiative for the ballot in an upcoming election; a sample is usually selected to ascertain whether the number of valid signatures exceeds a specified value. As another example, the frame may contain serial numbers of all furnaces manufactured by a particular company during a certain time period; a sample may be selected to infer something about the average lifetime of these units. The use of inferential methods to be developed in this book is reasonably noncontroversial in such settings (though statisticians may still argue over which particular methods should be used).

An analytic study is broadly defined as one that is not enumerative in nature. Such studies are often carried out with the objective of improving a future product by taking action on a process of some sort (e.g., recalibrating equipment or adjusting the level of some input such as the amount of a catalyst). Data can often be obtained only on an existing process, one that may differ in important respects from the future process. There is thus no sampling frame listing the individuals or objects of interest. For example, a sample of five turbines with a new design may be experimentally manufactured and tested to investigate efficiency. These five could be viewed as a sample from the conceptual population of all prototypes that could be manufactured under similar conditions, but not necessarily as representative of the population of units manufactured once regular production gets underway. Methods for using sample information to draw conclusions about future production units may be problematic. Someone with expertise in the area of turbine design and engineering (or whatever other subject area is relevant) should be called upon to judge whether such extrapolation is sensible. A good exposition of these issues is contained in the article “Assumptions for Statistical Inference” by Gerald Hahn and William Meeker (The American Statistician, 1993: 1–11).

Collecting Data

Statistics deals not only with the organization and analysis of data once it has been collected but also with the development of techniques for collecting the data. If data is not properly collected, an investigator may not be able to answer the questions under consideration with a reasonable degree of confidence. One common problem is that the target population (the one about which conclusions are to be drawn) may be different from the population actually sampled. For example, advertisers would like various kinds of information about the television-viewing habits of potential customers. The most systematic information of this sort comes from placing monitoring devices in a small number of homes across the United States. It has been conjectured that placement of such devices in and of itself alters viewing behavior, so that characteristics of the sample may be different from those of the target population.

When data collection entails selecting individuals or objects from a frame, the simplest method for ensuring a representative selection is to take a simple random sample.
This is one for which any particular subset of the specified size (e.g., a sample of size 100) has the same chance of being selected. For example, if the frame consists of 1,000,000 serial numbers, the numbers 1, 2, …, up to 1,000,000 could be placed on identical slips of paper. After placing these slips in a box and thoroughly mixing, slips could be drawn one by one until the requisite sample size has been obtained. Alternatively (and much to be preferred), a table of random numbers or a software package’s random number generator could be employed.

Sometimes alternative sampling methods can be used to make the selection process easier, to obtain extra information, or to increase the degree of confidence in conclusions. One such method, stratified sampling, entails separating the population units into nonoverlapping groups and taking a sample from each one. For example, a study of how physicians feel about the Affordable Care Act might proceed by stratifying according to specialty: select a sample of surgeons, another sample of radiologists, yet another sample of psychiatrists, and so on. This would result in information separately from each specialty and ensure that no one specialty is over- or underrepresented in the entire sample.

Frequently a “convenience” sample is obtained by selecting individuals or objects without systematic randomization. As an example, a collection of bricks may be stacked in such a way that it is extremely difficult for those in the center to be selected. If the bricks on the top and sides of the stack were somehow different from the others, resulting sample data would not be representative of the population. Often an investigator will assume that such a convenience sample approximates a random sample, in which case a statistician’s repertoire of inferential methods can be used; however, this is a judgment call. Most of the methods discussed herein are based on a variation of simple random sampling described in Chapter 5.
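The two randomized schemes described above, simple random sampling from a frame and stratified sampling, can be sketched with a software random number generator. The frame of 1,000,000 serial numbers and the sample size of 100 follow the text. The stratum sizes, the physician identifiers, the per-stratum sample size, and the seed are made-up values for illustration only.

```python
import random

rng = random.Random(2016)  # fixed seed (arbitrary) so the sketch is reproducible

# Simple random sample: 100 of the serial numbers 1, 2, ..., 1,000,000,
# drawn without replacement so that every subset of size 100 is equally likely.
frame = range(1, 1_000_001)
srs = rng.sample(frame, k=100)

# Stratified sample: split the population into nonoverlapping groups and
# take a separate simple random sample from each group.
# The strata below are hypothetical lists of physician identifiers.
strata = {
    "surgeons": [f"S{i:03d}" for i in range(400)],
    "radiologists": [f"R{i:03d}" for i in range(250)],
    "psychiatrists": [f"P{i:03d}" for i in range(150)],
}
per_stratum = 20
stratified = {
    name: rng.sample(units, k=per_stratum) for name, units in strata.items()
}

print(sorted(srs)[:5])
for name, chosen in stratified.items():
    print(name, len(chosen))
```

Sampling without replacement from the whole frame is the software counterpart of drawing slips from a well-mixed box. Sampling within each stratum guarantees that every specialty appears with a known sample size, which the simple random sample cannot promise.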
Engineers and scientists often collect data by carrying out some sort of designed experiment. This may involve deciding how to allocate several different treatments (such as fertilizers or coatings for corrosion protection) to the various experimental units (plots of land or pieces of pipe). Alternatively, an investigator may systematically vary the levels or categories of certain factors (e.g., pressure or type of insulating material) and observe the effect on some response variable (such as yield from a production process).

Example 1.4
An article in the New York Times (Jan. 27, 1987) reported that heart attack risk could be reduced by taking aspirin. This conclusion was based on a designed experiment involving both a control group of individuals that took a placebo having the appearance of aspirin but known to be inert and a treatment group that took aspirin according to a specified regimen. Subjects were randomly assigned to the groups to protect against any biases and so that probability-based methods could be used to analyze the data. Of the 11,034 individuals in the control group, 189 subsequently experienced heart attacks, whereas only 104 of the 11,037 in the aspirin group had a heart attack. The incidence rate of heart attacks in the treatment group (104/11,037, about 0.94%) was only about half that in the control group (189/11,034, about 1.71%). One possible explanation for this result is chance variation: that aspirin really doesn’t have the desired effect and the observed difference is just typical variation in the same way that tossing two identical coins would usually produce different numbers of heads. However, in this case, inferential methods suggest that chance variation by itself cannot adequately explain the magnitude of the observed difference. ■

Example 1.5
An engineer wishes to investigate the effects of both adhesive type and conductor material on bond strength when mounting an integrated circuit (IC) on a certain substrate. Two adhesive types and two conductor materials are under consideration. Two observations are made for each adhesive-type/conductor-material combination, resulting in the accompanying data:

Adhesive Type   Conductor Material   Observed Bond Strength   Average
      1                 1                   82, 77              79.5
      1                 2                   75, 87              81.0
      2                 1                   84, 80              82.0
      2                 2                   78, 90              84.0

The resulting average bond strengths are pictured in Figure 1.3. It appears that adhesive type 2 improves bond strength as compared with type 1 by about the same amount whichever one of the conducting materials is used, with the 2, 2 combination being best. Inferential methods can again be used to judge whether these effects are real or simply due to chance variation.

[Plot: average strength (vertical axis, 80 to 85) versus conducting material (1, 2), with one line for adhesive type 1 and one for adhesive type 2]
Figure 1.3 Average bond strengths in Example 1.5

Suppose additionally that there are two cure times under consideration and also two types of IC post coating. There are then 2 · 2 · 2 · 2 = 16 combinations of these four
factors, and our engineer may not have enough resources to make even a single observation for each of these combinations. In Chapter 11, we will see how the careful selection of a fraction of these possibilities will usually yield the desired information. ■

EXERCISES Section 1.1 (1–9)

1. Give one possible sample of size 4 from each of the following populations:
   a. All daily newspapers published in the United States
   b. All companies listed on the New York Stock Exchange
   c. All students at your college or university
   d. All grade point averages of students at your college or university
2. For each of the following hypothetical populations, give a plausible sample of size 4:
   a. All distances that might result when you throw a football
   b. Page lengths of books published 5 years from now
   c. All possible earthquake-strength measurements (Richter scale) that might be recorded in California during the next year
   d. All possible yields (in grams) from a certain chemical reaction carried out in a laboratory
3. Consider the population consisting of all computers of a certain brand and model, and focus on whether a computer needs service while under warranty.
   a. Pose several probability questions based on selecting a sample of 100 such computers.
   b. What inferential statistics question might be answered by determining the number of such computers in a sample of size 100 that need warranty service?
4. a. Give three different examples of concrete populations and three different examples of hypothetical populations.
   b. For one each of your concrete and your hypothetical populations, give an example of a probability question and an example of an inferential statistics question.
5. Many universities and colleges have instituted supplemental instruction (SI) programs, in which a student facilitator meets regularly with a small group of students enrolled in the course to promote discussion of course material and enhance subject mastery. Suppose that students in a large statistics course (what else?) are randomly divided into a control group that will not participate in SI and a treatment group that will participate. At the end of the term, each student’s total score in the course is determined.
   a. Are the scores from the SI group a sample from an existing population? If so, what is it? If not, what is the relevant conceptual population?
   b. What do you think is the advantage of randomly dividing the students into the two groups rather than letting each student choose which group to join?
   c. Why didn’t the investigators put all students in the treatment group? [Note: The article “Supplemental Instruction: An Effective Component of Student Affairs Programming” (J. of College Student Devel., 1997: 577–586) discusses the analysis of data from several SI programs.]
6. The California State University (CSU) system consists of 23 campuses, from San Diego State in the south to Humboldt State near the Oregon border. A CSU administrator wishes to make an inference about the average distance between the hometowns of students and their campuses. Describe and discuss several different sampling methods that might be employed. Would this be an enumerative or an analytic study? Explain your reasoning.
7. A certain city divides naturally into ten district neighborhoods. How might a real estate appraiser select a sample of single-family homes that could be used as a basis for developing an equation to predict appraised value from characteristics such as age, size, number of bathrooms, distance to the nearest school, and so on? Is the study enumerative or analytic?
8. The amount of flow through a solenoid valve in an automobile’s pollution-control system is an important characteristic. An experiment was carried out to study how flow rate depended on three factors: armature length, spring load, and bobbin depth. Two different levels (low and high) of each factor were chosen, and a single observation on flow was made for each combination of levels.
   a. The resulting data set consisted of how many observations?
   b. Is this an enumerative or analytic study? Explain your reasoning.
9. In a famous experiment carried out in 1882, Michelson and Newcomb obtained 66 observations on the time it took for light to travel between two locations in Washington, D.C. A few of the measurements (coded in a certain manner) were 31, 23, 32, 36, 22, 26, 27, and 31.
   a. Why are these measurements not identical?
   b. Is this an enumerative study? Why or why not?
1.2 Pictorial and Tabular Methods in Descriptive Statistics

Descriptive statistics can be divided into two general subject areas. In this section, we consider representing a data set using visual displays. In Sections 1.3 and 1.4, we will develop some numerical summary measures for data sets. Many visual techniques may already be familiar to you: frequency tables, tally sheets, histograms, pie charts, bar graphs, scatter diagrams, and the like. Here we focus on a selected few of these techniques that are most useful and relevant to probability and inferential statistics.

Notation

Some general notation will make it easier to apply our methods and formulas to a wide variety of practical problems. The number of observations in a single sample, that is, the sample size, will often be denoted by n, so that n = 4 for the sample of universities {Stanford, Iowa State, Wyoming, Rochester} and also for the sample of pH measurements {6.3, 6.2, 5.9, 6.5}. If two samples are simultaneously under consideration, either m and n or n₁ and n₂ can be used to denote the numbers of observations. An experiment to compare thermal efficiencies for two different types of diesel engines might result in samples {29.7, 31.6, 30.9} and {28.7, 29.5, 29.4, 30.3}, in which case m = 3 and n = 4.

Given a data set consisting of n observations on some variable x, the individual observations will be denoted by x₁, x₂, x₃, …, xₙ. The subscript bears no relation to the magnitude of a particular observation. Thus x₁ will not in general be the smallest observation in the set, nor will xₙ typically be the largest. In many applications, x₁ will be the first observation gathered by the experimenter, x₂ the second, and so on. The ith observation in the data set will be denoted by xᵢ.

Stem-and-Leaf Displays

Consider a numerical data set x₁, x₂, …, xₙ for which each xᵢ consists of at least two digits.
A quick way to obtain an informative visual representation of the data set is to construct a stem-and-leaf display.

Constructing a Stem-and-Leaf Display
1. Select one or more leading digits for the stem values. The trailing digits become the leaves.
2. List possible stem values in a vertical column.
3. Record the leaf for each observation beside the corresponding stem value.
4. Indicate the units for stems and leaves someplace in the display.

For a data set consisting of exam scores, each between 0 and 100, the score of 83 would have a stem of 8 and a leaf of 3. If all exam scores are in the 90s, 80s, and 70s (an instructor’s dream!), use of the tens digit as the stem would give a display
  • 31. with only three rows. In this case, it is desirable to stretch the display by repeating each stem value twice (9H, 9L, 8H, . . . , 7L), once for high leaves 9, . . . , 5 and again for low leaves 4, . . . , 0. Then a score of 93 would have a stem of 9L and leaf of 3. In general, a display based on between 5 and 20 stems is recommended.

Example 1.6
A common complaint among college students is that they are getting less sleep than they need. The article “Class Start Times, Sleep, and Academic Performance in College: A Path Analysis” (Chronobiology Intl., 2012: 318–335) investigated factors that impact sleep time. The stem-and-leaf display in Figure 1.4 shows the average number of hours of sleep per day over a two-week period for a sample of 253 students.

     5L | 00
     5H | 6889
     6L | 000111123444444
     6H | 55556778899999
     7L | 000011111112222223333333344444444
     7H | 55555555666666666666777777888888888999999999999999
     8L | 00000000000011111122222222222222222333333333334444444444444
     8H | 5555555566666666677777788888888899999999999
     9L | 00001111111222223334
     9H | 666678999
    10L | 00
    10H | 56
    Stem: ones digit    Leaf: tenths digit

Figure 1.4 Stem-and-leaf display for average sleep time per day

The first observation in the top row of the display is 5.0, corresponding to a stem of 5 and leaf of 0, and the last observation at the bottom of the display is 10.6. Note that in the absence of a context, without the identification of stem and leaf digits in the display, we wouldn’t know whether the observation with stem 7 and leaf 9 was .79, 7.9, or 79. The leaves in each row are ordered from smallest to largest; this is commonly done by software packages but is not necessary if a display is created by hand. The display suggests that a typical or representative sleep time is in the stem 8L row, perhaps 8.1 or 8.2.
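The four construction steps and the repeated-stem (L/H) convention described earlier can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `stem_and_leaf` and the ten exam scores are made up for the illustration and do not come from the text, and the function handles only whole-number scores with the tens digit as stem.

```python
def stem_and_leaf(scores, split_stems=True):
    """Return the rows of a stem-and-leaf display for whole-number scores.

    Stem = tens digit(s), leaf = ones digit. With split_stems=True every
    stem appears twice: 'L' for leaves 0-4 and 'H' for leaves 5-9.
    """
    halves = ("L", "H") if split_stems else ("",)
    stems = range(min(scores) // 10, max(scores) // 10 + 1)
    # Step 2: list every possible stem value, so gaps in the data stay visible.
    rows = {(stem, half): [] for stem in stems for half in halves}
    # Step 3: record each leaf beside its stem (sorted, as software usually does).
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        half = ("H" if leaf >= 5 else "L") if split_stems else ""
        rows[(stem, half)].append(str(leaf))
    return [f"{stem}{half} | {''.join(leaves)}" for (stem, half), leaves in rows.items()]


# Hypothetical exam scores, all in the 70s, 80s, and 90s.
scores = [93, 88, 71, 79, 85, 90, 97, 84, 76, 82]
display = stem_and_leaf(scores)
for row in display:
    print(row)
# Step 4: state the units.
print("Stem: tens digit   Leaf: ones digit")
```

Calling `stem_and_leaf(scores, split_stems=False)` gives the unstretched three-row version, which makes it easy to compare how much detail each choice of stems reveals.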
The data is not highly concentrated about this typical value as would be the case if almost all students were getting between 7.5 and 9.5 hours of sleep on average. The display appears to rise rather smoothly to a peak in the 8L row and then decline smoothly (we conjecture that the minor peak in the 6L row would disappear if more data was available). The general shape of the display is rather symmetric, bearing strong resemblance to a bell-shaped curve; it does not stretch out more in one direction than the other. The two smallest and two largest values seem a bit separated from the remainder of the data; perhaps they are very mild, but certainly not extreme, “outliers.” A reference in the cited article suggests that individuals in this age group need about 8.4 hours of sleep per day. So it appears that a substantial percentage of students in the sample are sleep deprived. ■

A stem-and-leaf display conveys information about the following aspects of the data:
● identification of a typical or representative value
● extent of spread about the typical value
● presence of any gaps in the data
  • 32.
● extent of symmetry in the distribution of values
● number and locations of peaks
● presence of any outliers (values far from the rest of the data)

Example 1.7
Figure 1.5 presents stem-and-leaf displays for a random sample of lengths of golf courses (yards) that have been designated by Golf Magazine as among the most challenging in the United States. Among the sample of 40 courses, the shortest is 6433 yards long, and the longest is 7280 yards. The lengths appear to be distributed in a roughly uniform fashion over the range of values in the sample. Notice that a stem choice here of either a single digit (6 or 7) or three digits (643, … , 728) would yield an uninformative display, the first because of too few stems and the latter because of too many.

(a)
    64 | 35 64 33 70
    65 | 26 27 06 83
    66 | 05 94 14
    67 | 90 70 00 98 70 45 13
    68 | 90 70 73 50
    69 | 00 27 36 04
    70 | 51 05 11 40 50 22
    71 | 31 69 68 05 13 65
    72 | 80 09
    Stem: Thousands and hundreds digits    Leaf: Tens and ones digits

(b)
    Stem-and-leaf of yardage  N = 40
    Leaf Unit = 10
      4   64  3367
      8   65  0228
     11   66  019
     18   67  0147799
     (4)  68  5779
     18   69  0023
     14   70  012455
      8   71  013666
      2   72  08

Figure 1.5 Stem-and-leaf displays of golf course lengths: (a) two-digit leaves; (b) display from Minitab with truncated one-digit leaves

Statistical software packages do not generally produce displays with multiple-digit stems. The Minitab display in Figure 1.5(b) results from truncating each observation by deleting the ones digit. ■

Dotplots

A dotplot is an attractive summary of numerical data when the data set is reasonably small or there are relatively few distinct data values. Each observation is represented by a dot above the corresponding location on a horizontal measurement scale. When a value occurs more than once, there is a dot for each occurrence, and these dots are stacked vertically.
As with a stem-and-leaf display, a dotplot gives information about location, spread, extremes, and gaps.

Example 1.8
There is growing concern in the U.S. that not enough students are graduating from college. America used to be number 1 in the world for the percentage of adults with college degrees, but it has recently dropped to 16th. Here is data on the percentage of 25- to 34-year-olds in each state who had some type of postsecondary degree as of 2010 (listed in alphabetical order, with the District of Columbia included):

    31.5 32.9 33.0 28.6 37.9 43.3 45.9 37.2 68.8 36.2
    35.5 40.5 37.2 45.3 36.1 45.5 42.3 33.3 30.3 37.2
    45.5 54.3 37.2 49.8 32.1 39.3 40.3 44.2 28.4 46.0
    47.2 28.7 49.6 37.6 50.8 38.0 30.8 37.6 43.9 42.5
    35.2 42.2 32.8 32.2 38.5 44.5 44.6 40.9 29.5 41.3
    35.4
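A rough text-only dotplot of these 51 percentages can be produced as follows. This is a sketch under two assumptions that are not in the text: values are grouped by their whole-number part so that neighbouring values stack, and the plot is turned on its side (one row per whole percent) because that is easier to print than a horizontal axis. The names `degree_pct` and `text_dotplot` are illustrative.

```python
import math
from collections import Counter

# Percentages from Example 1.8 (50 states plus D.C., in alphabetical order).
degree_pct = [
    31.5, 32.9, 33.0, 28.6, 37.9, 43.3, 45.9, 37.2, 68.8, 36.2,
    35.5, 40.5, 37.2, 45.3, 36.1, 45.5, 42.3, 33.3, 30.3, 37.2,
    45.5, 54.3, 37.2, 49.8, 32.1, 39.3, 40.3, 44.2, 28.4, 46.0,
    47.2, 28.7, 49.6, 37.6, 50.8, 38.0, 30.8, 37.6, 43.9, 42.5,
    35.2, 42.2, 32.8, 32.2, 38.5, 44.5, 44.6, 40.9, 29.5, 41.3,
    35.4,
]


def text_dotplot(values, dot="*"):
    """Count observations per whole-number bin and build one text row per bin."""
    counts = Counter(math.floor(v) for v in values)
    rows = []
    for b in range(min(counts), max(counts) + 1):
        # A bin with no observations prints an empty row, so gaps stay visible.
        rows.append(f"{b:>3} | {dot * counts[b]}")
    return counts, rows


counts, rows = text_dotplot(degree_pct)
print("\n".join(rows))
```

The long run of empty rows between the mid-50s and the single dot at 68 is the textual counterpart of the isolated point at the far right of a conventional dotplot.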
  • 33. Figure 1.6 shows a dotplot of the data. Dots corresponding to some values close together (e.g., 28.6 and 28.7) have been vertically stacked to prevent crowding. There is clearly a great deal of state-to-state variability. The largest value, for D.C., is obviously an extreme outlier, and four other values on the upper end of the data are candidates for mild outliers (MA, MN, NY, and ND). There is also a cluster of states at the low end, primarily located in the South and Southwest. The overall percentage for the entire country is 39.3%; this is not a simple average of the 51 numbers but an average weighted by population sizes.

[Figure 1.6: A dotplot of the data from Example 1.8; horizontal axis marked from 25 to 70 in steps of 5] ■

A dotplot can be quite cumbersome to construct and look crowded when the number of observations is large. Our next technique is well suited to such situations.

Histograms

Some numerical data is obtained by counting to determine the value of a variable (the number of traffic citations a person received during the last year, the number of customers arriving for service during a particular period), whereas other data is obtained by taking measurements (weight of an individual, reaction time to a particular stimulus). The prescription for drawing a histogram is generally different for these two cases.

DEFINITION
A numerical variable is discrete if its set of possible values either is finite or else can be listed in an infinite sequence (one in which there is a first number, a second number, and so on). A numerical variable is continuous if its possible values consist of an entire interval on the number line.

A discrete variable x almost always results from counting, in which case possible values are 0, 1, 2, 3, … or some subset of these integers. Continuous variables arise from making measurements.
For example, if x is the pH of a chemical substance, then in theory x could be any number between 0 and 14: 7.0, 7.03, 7.032, and so on. Of course, in practice there are limitations on the degree of accuracy of any measuring instrument, so we may not be able to determine pH, reaction time, height, and concentration to an arbitrarily large number of decimal places. However, from the point of view of creating mathematical models for distributions of data, it is helpful to imagine an entire continuum of possible values.

Consider data consisting of observations on a discrete variable x. The frequency of any particular x value is the number of times that value occurs in the data set. The relative frequency of a value is the fraction or proportion of times the value occurs:

    relative frequency of a value = (number of times the value occurs) / (number of observations in the data set)

Suppose, for example, that our data set consists of 200 observations on x = the number of courses a college student is taking this term. If 70 of these x values are 3, then

    frequency of the x value 3: 70
    relative frequency of the x value 3: 70/200 = .35
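The frequency and relative-frequency definitions translate directly into a short tabulation. In the sketch below, only the fact that 70 of the 200 values equal 3 comes from the text; the remaining course counts are hypothetical filler so that the list has 200 entries, and `frequency_distribution` is an illustrative name.

```python
from collections import Counter


def frequency_distribution(data):
    """Map each distinct value to a (frequency, relative frequency) pair."""
    counts = Counter(data)
    n = len(data)
    return {value: (counts[value], counts[value] / n) for value in sorted(counts)}


# 200 observations on x = number of courses; only "70 threes" is from the text,
# the other counts are made up for illustration.
courses = [2] * 10 + [3] * 70 + [4] * 80 + [5] * 40
dist = frequency_distribution(courses)

frequency, relative_frequency = dist[3]
print(frequency, relative_frequency)  # 70 0.35
```

Summing the second entry of every pair should give 1, apart from floating-point rounding, which mirrors the remark that tabulated relative frequencies may not add to exactly 1.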
  • 34. Multiplying a relative frequency by 100 gives a percentage; in the college-course example, 35% of the students in the sample are taking three courses. The relative frequencies, or percentages, are usually of more interest than the frequencies themselves. In theory, the relative frequencies should sum to 1, but in practice the sum may differ slightly from 1 because of rounding. A frequency distribution is a tabulation of the frequencies and/or relative frequencies.

Constructing a Histogram for Discrete Data
First, determine the frequency and relative frequency of each x value. Then mark possible x values on a horizontal scale. Above each value, draw a rectangle whose height is the relative frequency (or alternatively, the frequency) of that value; the rectangles should have equal widths.

This construction ensures that the area of each rectangle is proportional to the relative frequency of the value. Thus if the relative frequencies of x = 1 and x = 5 are .35 and .07, respectively, then the area of the rectangle above 1 is five times the area of the rectangle above 5.

Example 1.9
How unusual is a no-hitter or a one-hitter in a major league baseball game, and how frequently does a team get more than 10, 15, or even 20 hits? Table 1.1 is a frequency distribution for the number of hits per team per game for all nine-inning games that were played between 1989 and 1993.

Table 1.1 Frequency Distribution for Hits in Nine-Inning Games

    Hits/Game  Number of Games  Relative Frequency    Hits/Game  Number of Games  Relative Frequency
        0             20             .0010               14            569             .0294
        1             72             .0037               15            393             .0203
        2            209             .0108               16            253             .0131
        3            527             .0272               17            171             .0088
        4           1048             .0541               18             97             .0050
        5           1457             .0752               19             53             .0027
        6           1988             .1026               20             31             .0016
        7           2256             .1164               21             19             .0010
        8           2403             .1240               22             13             .0007
        9           2256             .1164               23              5             .0003
       10           1967             .1015               24              1             .0001
       11           1509             .0779               25              0             .0000
       12           1230             .0635               26              1             .0001
       13            834             .0430               27              1             .0001
                                                       Total        19,383            1.0005

The corresponding histogram in Figure 1.7 rises rather smoothly to a single peak and then declines. The histogram extends a bit more on the right (toward large values) than it does on the left, a slight “positive skew.”
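For readers who want to check Table 1.1 numerically, the relative frequencies and cumulative proportions such as “at most two hits” or “between 5 and 10 hits” can be recomputed from the game counts. The sketch below copies the counts from the table; the variable and function names are illustrative. Because it works with unrounded relative frequencies, its results can differ in the fourth decimal place from sums of the rounded table entries.

```python
# Number of games for each value of hits per team per game, 0 through 27 (Table 1.1).
games = [20, 72, 209, 527, 1048, 1457, 1988, 2256, 2403, 2256, 1967, 1509, 1230, 834,
         569, 393, 253, 171, 97, 53, 31, 19, 13, 5, 1, 0, 1, 1]

total_games = sum(games)
relative_frequency = [count / total_games for count in games]


def proportion_between(low, high):
    """Proportion of games with between `low` and `high` hits, inclusive."""
    return sum(games[low:high + 1]) / total_games


at_most_two = proportion_between(0, 2)
five_to_ten = proportion_between(5, 10)
# Location of the single peak of the histogram.
peak_hits = max(range(len(games)), key=lambda hits: games[hits])

print(total_games, round(at_most_two, 4), round(five_to_ten, 4), peak_hits)
```

Keeping the counts rather than the rounded relative frequencies as the primary data avoids the small rounding drift visible in the table total of 1.0005.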
  • 35. Either from the tabulated information or from the histogram itself, we can determine the following:

    proportion of games with at most two hits
        = relative frequency for x = 0 + relative frequency for x = 1 + relative frequency for x = 2
        = .0010 + .0037 + .0108 = .0155

Similarly,

    proportion of games with between 5 and 10 hits (inclusive)
        = .0752 + .1026 + … + .1015 = .6361

That is, roughly 64% of all these games resulted in between 5 and 10 (inclusive) hits.

[Figure 1.7: Histogram of number of hits per nine-inning game; horizontal axis: Hits/game (0 to 20), vertical axis: Relative frequency (0 to .10)] ■

Constructing a histogram for continuous data (measurements) entails subdividing the measurement axis into a suitable number of class intervals or classes, such that each observation is contained in exactly one class. Suppose, for example, that we have 50 observations on x = fuel efficiency of an automobile (mpg), the smallest of which is 27.8 and the largest of which is 31.4. Then we could use the class boundaries 27.5, 28.0, 28.5, … , and 31.5 as shown here:

    27.5   28.0   28.5   29.0   29.5   30.0   30.5   31.0   31.5

One potential difficulty is that occasionally an observation lies on a class boundary so therefore does not fall in exactly one interval, for example, 29.0. One way to deal with this problem is to use boundaries like 27.55, 28.05, … , 31.55. Adding a hundredths digit to the class boundaries prevents observations from falling on the resulting boundaries. Another approach is to use the classes 27.5–<28.0, 28.0–<28.5, … , 31.0–<31.5. Then 29.0 falls in the class 29.0–<29.5 rather than in the class 28.5–<29.0. In other words, with this convention, an observation on a boundary is placed in the interval to the right of the boundary. This is how Minitab constructs a histogram.
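The “boundary value goes to the right-hand class” convention amounts to treating each class as closed on the left and open on the right. A minimal sketch, assuming the fuel-efficiency boundaries 27.5, 28.0, …, 31.5 from the text (the function name `class_interval` is illustrative, and this is not a description of how Minitab is implemented internally):

```python
import math


def class_interval(x, lower=27.5, width=0.5):
    """Return (left, right) for the class left <= x < right that contains x."""
    k = math.floor((x - lower) / width)  # number of whole class widths below x
    return (lower + k * width, lower + (k + 1) * width)


print(class_interval(29.0))   # (29.0, 29.5): a boundary value moves to the right
print(class_interval(28.99))  # (28.5, 29.0)
print(class_interval(27.8))   # (27.5, 28.0): the smallest observation
print(class_interval(31.4))   # (31.0, 31.5): the largest observation
```

The boundaries here happen to be exactly representable in binary floating point; with boundaries such as 27.55 it would be safer to work in integer hundredths to avoid rounding surprises at the class edges.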
  • 36.
Constructing a Histogram for Continuous Data: Equal Class Widths
Determine the frequency and relative frequency for each class. Mark the class boundaries on a horizontal measurement axis. Above each class interval, draw a rectangle whose height is the corresponding relative frequency (or frequency).

Example 1.10
Power companies need information about customer usage to obtain accurate forecasts of demands. Investigators from Wisconsin Power and Light determined energy consumption (BTUs) during a particular period for a sample of 90 gas-heated homes. An adjusted consumption value was calculated as follows:

    adjusted consumption = consumption / [(weather, in degree days)(house area)]

This resulted in the accompanying data (part of the stored data set FURNACE.MTW available in Minitab), which we have ordered from smallest to largest.

     2.97  4.00  5.20  5.56  5.94  5.98  6.35  6.62  6.72  6.78
     6.80  6.85  6.94  7.15  7.16  7.23  7.29  7.62  7.62  7.69
     7.73  7.87  7.93  8.00  8.26  8.29  8.37  8.47  8.54  8.58
     8.61  8.67  8.69  8.81  9.07  9.27  9.37  9.43  9.52  9.58
     9.60  9.76  9.82  9.83  9.83  9.84  9.96 10.04 10.21 10.28
    10.28 10.30 10.35 10.36 10.40 10.49 10.50 10.64 10.95 11.09
    11.12 11.21 11.29 11.43 11.62 11.70 11.70 12.16 12.19 12.28
    12.31 12.62 12.69 12.71 12.91 12.92 13.11 13.38 13.42 13.43
    13.47 13.60 13.96 14.24 14.35 15.12 15.24 16.06 16.90 18.26

The most striking feature of the histogram in Figure 1.8 is its resemblance to a bell-shaped curve, with the point of symmetry roughly at 10.

    Class               1–<3  3–<5  5–<7  7–<9  9–<11  11–<13  13–<15  15–<17  17–<19
    Frequency             1     1    11    21    25      17       9       4       1
    Relative frequency  .011  .011  .122  .233  .278    .189    .100    .044    .011

[Figure 1.8: Histogram of the energy consumption data from Example 1.10; horizontal axis: BTU (class boundaries 1 to 19), vertical axis: Percent (0 to 30)]
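The class frequencies in the table can be reproduced from the 90 ordered observations with a few lines of Python. This is a sketch that assumes the nine equal-width classes 1–<3, 3–<5, …, 17–<19 shown in the table and the left-closed convention described earlier; the variable names are illustrative.

```python
# Adjusted energy consumption for the 90 homes of Example 1.10.
consumption = [
    2.97, 4.00, 5.20, 5.56, 5.94, 5.98, 6.35, 6.62, 6.72, 6.78,
    6.80, 6.85, 6.94, 7.15, 7.16, 7.23, 7.29, 7.62, 7.62, 7.69,
    7.73, 7.87, 7.93, 8.00, 8.26, 8.29, 8.37, 8.47, 8.54, 8.58,
    8.61, 8.67, 8.69, 8.81, 9.07, 9.27, 9.37, 9.43, 9.52, 9.58,
    9.60, 9.76, 9.82, 9.83, 9.83, 9.84, 9.96, 10.04, 10.21, 10.28,
    10.28, 10.30, 10.35, 10.36, 10.40, 10.49, 10.50, 10.64, 10.95, 11.09,
    11.12, 11.21, 11.29, 11.43, 11.62, 11.70, 11.70, 12.16, 12.19, 12.28,
    12.31, 12.62, 12.69, 12.71, 12.91, 12.92, 13.11, 13.38, 13.42, 13.43,
    13.47, 13.60, 13.96, 14.24, 14.35, 15.12, 15.24, 16.06, 16.90, 18.26,
]

lower, width, n_classes = 1, 2, 9           # classes 1-<3, 3-<5, ..., 17-<19
frequency = [0] * n_classes
for x in consumption:
    frequency[int((x - lower) // width)] += 1   # index of the class containing x

n = len(consumption)
relative_frequency = [round(f / n, 3) for f in frequency]

# Exact counts below 9 and below 10, for comparison with histogram-based estimates.
below_9 = sum(x < 9 for x in consumption)
below_10 = sum(x < 10 for x in consumption)

print(frequency)
print(relative_frequency)
print(below_9, below_10)
```

None of the observations falls exactly on a class boundary, so the choice of which side a boundary value belongs to does not affect these particular counts.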
  • 37. From the histogram,

    proportion of observations less than 9 ≈ .01 + .01 + .12 + .23 = .37    (exact value = 34/90 = .378)

The relative frequency for the 9–<11 class is about .27, so we estimate that roughly half of this, or .135, is between 9 and 10. Thus

    proportion of observations less than 10 ≈ .37 + .135 = .505    (slightly more than 50%)

The exact value of this proportion is 47/90 = .522. ■

There are no hard-and-fast rules concerning either the number of classes or the choice of classes themselves. Between 5 and 20 classes will be satisfactory for most data sets. Generally, the larger the number of observations in a data set, the more classes should be used. A reasonable rule of thumb is

    number of classes ≈ √(number of observations)

Equal-width classes may not be a sensible choice if there are some regions of the measurement scale that have a high concentration of data values and other parts where data is quite sparse. Figure 1.9 shows a dotplot of such a data set; there is high concentration in the middle, and relatively few observations stretched out to either side. Using a small number of equal-width classes results in almost all observations falling in just one or two of the classes. If a large number of equal-width classes are used, many classes will have zero frequency. A sound choice is to use a few wider intervals near extreme observations and narrower intervals in the region of high concentration.
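Two pieces of arithmetic from this passage lend themselves to a quick check: the square-root rule of thumb for the number of classes, and the “take half of the 9–<11 class” estimate of the proportion below 10. The sketch below assumes observations are spread evenly within a class, which is the same assumption the text makes implicitly; the function names are illustrative. It uses the tabulated relative frequencies from Example 1.10 rather than heights read off the histogram, so its estimate (about .516) differs slightly from the .505 obtained in the text and from the exact value .522.

```python
import math


def suggested_number_of_classes(n_observations):
    """Rule of thumb: number of classes is roughly sqrt(number of observations)."""
    return round(math.sqrt(n_observations))


def estimate_proportion_below(x, boundaries, rel_freqs):
    """Estimate the proportion of data below x from class relative frequencies,
    assuming observations are evenly spread within each class."""
    total = 0.0
    for left, right, rf in zip(boundaries, boundaries[1:], rel_freqs):
        if x >= right:
            total += rf                                   # whole class lies below x
        elif x > left:
            total += rf * (x - left) / (right - left)     # only part of the class
    return total


boundaries = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
rel_freqs = [.011, .011, .122, .233, .278, .189, .100, .044, .011]

print(suggested_number_of_classes(90))                                  # 9
print(round(estimate_proportion_below(9, boundaries, rel_freqs), 3))    # 0.377
print(round(estimate_proportion_below(10, boundaries, rel_freqs), 3))   # 0.516
```

For the 90 observations of Example 1.10 the rule of thumb suggests 9 classes, which agrees with the nine classes used there.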
[Figure 1.9: Selecting class intervals for “varying density” data: (a) many short equal-width intervals; (b) a few wide equal-width intervals; (c) unequal-width intervals]

Constructing a Histogram for Continuous Data: Unequal Class Widths
After determining frequencies and relative frequencies, calculate the height of each rectangle using the formula

    rectangle height = (relative frequency of the class) / (class width)

The resulting rectangle heights are usually called densities, and the vertical scale is the density scale. This prescription will also work when class widths are equal.
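The density formula is easy to apply programmatically. The sketch below uses the class boundaries and frequencies of the bond strength data in Example 1.11, which follows; the function name `density_histogram` is illustrative. It also checks the property, discussed after that example, that the rectangle areas of a density histogram add up to 1.

```python
def density_histogram(boundaries, frequencies):
    """Return one (left, right, relative frequency, density) tuple per class."""
    n = sum(frequencies)
    rows = []
    for left, right, freq in zip(boundaries, boundaries[1:], frequencies):
        rel = freq / n
        rows.append((left, right, rel, rel / (right - left)))  # height = rel. freq. / width
    return rows


# Unequal-width classes 2-<4, 4-<6, 6-<8, 8-<12, 12-<20, 20-<30 (Example 1.11).
boundaries = [2, 4, 6, 8, 12, 20, 30]
frequencies = [9, 15, 5, 9, 8, 2]

rows = density_histogram(boundaries, frequencies)
for left, right, rel, density in rows:
    print(f"{left:>2}-<{right:<2}  rel. freq. {rel:.4f}  density {density:.3f}")

# Area of each rectangle = width * height = relative frequency, so the total is 1.
total_area = sum((right - left) * density for left, right, _, density in rows)
print(round(total_area, 10))
```

Plotting the raw relative frequencies for these classes instead of the densities would make the wide 12–<20 class look far more prominent than the data warrant.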
  • 38.
Example 1.11
Corrosion of reinforcing steel is a serious problem in concrete structures located in environments affected by severe weather conditions. For this reason, researchers have been investigating the use of reinforcing bars made of composite material. One study was carried out to develop guidelines for bonding glass-fiber-reinforced plastic rebars to concrete (“Design Recommendations for Bond of GFRP Rebars to Concrete,” J. of Structural Engr., 1996: 247–254). Consider the following 48 observations on measured bond strength:

    11.5 12.1  9.9  9.3  7.8  6.2  6.6  7.0 13.4 17.1  9.3  5.6
     5.7  5.4  5.2  5.1  4.9 10.7 15.2  8.5  4.2  4.0  3.9  3.8
     3.6  3.4 20.6 25.5 13.8 12.6 13.1  8.9  8.2 10.7 14.2  7.6
     5.2  5.5  5.1  5.0  5.2  4.8  4.1  3.8  3.7  3.6  3.6  3.6

    Class               2–<4   4–<6   6–<8   8–<12  12–<20  20–<30
    Frequency             9     15      5      9       8       2
    Relative frequency  .1875  .3125  .1042  .1875   .1667   .0417
    Density             .094   .156   .052   .047    .021    .004

The resulting histogram appears in Figure 1.10. The right or upper tail stretches out much farther than does the left or lower tail, a substantial departure from symmetry.

[Figure 1.10: A Minitab density histogram for the bond strength data of Example 1.11; horizontal axis: Bond strength (class boundaries 2, 4, 6, 8, 12, 20, 30), vertical axis: Density (0.00 to 0.15)] ■

When class widths are unequal, not using a density scale will give a picture with distorted areas. For equal-class widths, the divisor is the same in each density calculation, and the extra arithmetic simply results in a rescaling of the vertical axis (i.e., the histogram using relative frequency and the one using density will have exactly the same appearance). A density histogram does have one interesting property. Multiplying both sides of the formula for density by the class width gives

    relative frequency = (class width)(density) = (rectangle width)(rectangle height) = rectangle area

That is, the area of each rectangle is the relative frequency of the corresponding class. Furthermore, since the sum of relative frequencies should be 1, the total area of all rectangles in a density histogram is 1. It is always possible to draw a histogram
  • 39. so that the area equals the relative frequency (this is true also for a histogram of discrete data); just use the density scale. This property will play an important role in motivating models for distributions in Chapter 4.

Histogram Shapes

Histograms come in a variety of shapes. A unimodal histogram is one that rises to a single peak and then declines. A bimodal histogram has two different peaks. Bimodality can occur when the data set consists of observations on two quite different kinds of individuals or objects. For example, consider a large data set consisting of driving times for automobiles traveling between San Luis Obispo, California, and Monterey, California (exclusive of stopping time for sightseeing, eating, etc.). This histogram would show two peaks: one for those cars that took the inland route (roughly 2.5 hours) and another for those cars traveling up the coast (3.5–4 hours). However, bimodality does not automatically follow in such situations. Only if the two separate histograms are “far apart” relative to their spreads will bimodality occur in the histogram of combined data. Thus a large data set consisting of heights of college students should not result in a bimodal histogram because the typical male height of about 69 inches is not far enough above the typical female height of about 64–65 inches. A histogram with more than two peaks is said to be multimodal. Of course, the number of peaks may well depend on the choice of class intervals, particularly with a small number of observations. The larger the number of classes, the more likely it is that bimodality or multimodality will manifest itself.

Example 1.12
Figure 1.11(a) shows a Minitab histogram of the weights (lb) of the 124 players listed on the rosters of the San Francisco 49ers and the New England Patriots (teams the author would like to see meet in the Super Bowl) as of Nov. 20, 2009. Figure 1.11(b) is a smoothed histogram (actually what is called a density estimate) of the data from the R software package. Both the histogram and the smoothed histogram show three distinct peaks; the one on the right is for linemen, the middle peak corresponds to linebacker weights, and the peak on the left is for all other players (wide receivers, quarterbacks, etc.).

[Figure 1.11: NFL player weights: (a) histogram; (b) smoothed histogram. Panel (a) axes: Weight (180 to 340) on the horizontal axis, Percent (0 to 14) on the vertical axis]
  • 40. [Figure 1.11 (continued): panel (b), smoothed histogram; horizontal axis: Player Weight (150 to 350), vertical axis: Density Estimate (0.000 to 0.012)] ■

A histogram is symmetric if the left half is a mirror image of the right half. A unimodal histogram is positively skewed if the right or upper tail is stretched out compared with the left or lower tail and negatively skewed if the stretching is to the left. Figure 1.12 shows “smoothed” histograms, obtained by superimposing a smooth curve on the rectangles, that illustrate the various possibilities.

[Figure 1.12: Smoothed histograms: (a) symmetric unimodal; (b) bimodal; (c) positively skewed; and (d) negatively skewed]

Qualitative Data

Both a frequency distribution and a histogram can be constructed when the data set is qualitative (categorical) in nature. In some cases, there will be a natural ordering of classes (for example, freshmen, sophomores, juniors, seniors, graduate students), whereas in other cases the order will be arbitrary (for example, Catholic, Jewish, Protestant, and the like). With such categorical data, the intervals above which rectangles are constructed should have equal width.

Example 1.13
The Public Policy Institute of California carried out a telephone survey of 2501 California adult residents during April 2006 to ascertain how they felt about various aspects of K–12 public education. One question asked was “Overall, how would you rate the quality of public schools in your neighborhood today?” Table 1.2 displays the frequencies and relative frequencies, and Figure 1.13 shows the corresponding histogram (bar chart).
  • 41.
Table 1.2 Frequency Distribution for the School Rating Data

    Rating        Frequency   Relative Frequency
    A                478            .191
    B                893            .357
    C                680            .272
    D                178            .071
    F                100            .040
    Don’t know       172            .069
    Total           2501           1.000

[Figure 1.13: Histogram of the school rating data from Minitab (chart of Relative Frequency vs. Rating); horizontal axis: Rating (A, B, C, D, F, Don’t know), vertical axis: Relative Frequency (0.0 to 0.4)]

More than half the respondents gave an A or B rating, and only slightly more than 10% gave a D or F rating. The percentages for parents of public school children were somewhat more favorable to schools: 24%, 40%, 24%, 6%, 4%, and 2%. ■

Multivariate Data

Multivariate data is generally rather difficult to describe visually. Several methods for doing so appear later in the book, notably scatterplots for bivariate numerical data.

EXERCISES Section 1.2 (10–32)

10. Consider the strength data for beams given in Example 1.2.
a. Construct a stem-and-leaf display of the data. What appears to be a representative strength value? Do the observations appear to be highly concentrated about the representative value or rather spread out?
b. Does the display appear to be reasonably symmetric about a representative value, or would you describe its shape in some other way?
c. Do there appear to be any outlying strength values?
d. What proportion of strength observations in this sample exceed 10 MPa?

11. The accompanying specific gravity values for various wood types used in construction appeared in the article “Bolted Connection Design Values Based on European Yield Model” (J. of Structural Engr., 1993: 2169–2186):

    .31 .35 .36 .36 .37 .38 .40 .40 .40
    .41 .41 .42 .42 .42 .42 .42 .43 .44
    .45 .46 .46 .47 .48 .48 .48 .51 .54
    .54 .55 .58 .62 .66 .66 .67 .68 .75
  • 43. Before we leave you we wish to make one request. That is: If you follow our advice and fit up your box as we have directed, you will add nothing else thereto. No, nothing whatever—not even to fill up an awkward corner.
  • 44. VARIETIES. All Will Come Right. Things can never go badly wrong If the heart be true and the love be strong, For the mist, if it comes, and the weeping rain Will be changed by the love into sunshine again.— George MacDonald. In Praise of Work.—Work drives away depression, whets the appetite for food, invites sleep, promotes digestion, strengthens the muscles and sinews, gives free circulation to the blood, stimulates the intellectual faculties, provides the comforts of life, develops all the powers which it brings into exercise, transforms stupid ignorance into brilliant genius, fills the world with works of art and literature, and develops the resources of nature. Nothing can stand before work. Well Named. Bainbridge: “I know why they are called fugitive poems.” Goldsborough: “Why?” Bainbridge: “Because the author had to run for his life.” Be Wise To-day.—“Dear young friends,” says a popular preacher of the present day, “begin right. You will never find it so easy to make any decisive step as just now. You will get less and less flexible as
  • 45. you grow older. You will get set in your ways. Habits will twine their tendrils round you and hinder your free movement. The truths of the Gospel will become commonplace by familiarity. Associations and companions will get more and more powerful, and you will stiffen as a tree trunk becomes stiffened with the growth of years. Be wise to- day.” Printers’ Errors. In the early half of the present century it was announced in a London newspaper that “Sir Robert Peel, with a party of fiends, was shooting peasants in Ireland,” whereas the Minister and his friends were only indulging in the comparatively harmless amusement of pheasant-shooting. Shortly after the battle of Inkerman one of the morning papers informed its readers that “after a desperate struggle the enemy was repulsed with great laughter.” The omission of a single letter has rarely perhaps played more havoc with a subject which was certainly no laughing matter. No more fault-finding.—The business of fault-finding would soon come to an end if every fault-finder could be only introduced to herself.
  • 46. “THE GUIDING LIGHT.” (From the painting by Marcella Walker.)
  • 47. “THE SONG THE RAINDROPS SING.” By AUGUSTA BRYERS. I stood within my garden, While fell the gentle rain On thirsty leaf and blossom, To bid them smile again. But my heart mourn’d sore the sunbeams And all the joys they bring, Till I heard, like sweet bells chiming, The song the raindrops sing. They told of fading flow’rets, Of little leaves that die, While the lark soars on unceasing, And the sunlight floods the sky. And they said, “We are not teardrops, But tend’rest show’rs of love, To cheer the heart that’s weary, And lift sad eyes above.” And I thought, as fell the raindrops On fainting buds and leaves, So may sweet words and hopeful Wake for the heart that grieves. That to the world’s great garden Each may his harvest bring, Thro’ the sunshine or the storm-cloud, Or the song the raindrops sing.
  • 49. THE HOUSE WITH THE VERANDAH. By ISABELLA FYVIE MAYO, Author of “Other People’s Stairs,” “Her Object in Life,” etc. CHAPTER III. AS A LITTLE CHILD. The day after the episode with Pollie Captain Grant’s letter duly arrived. He was only too delighted to think that his suggestion had borne fruit with his old friend. “It’s just settled that the Slains Castle will leave Peterhead on the twenty-eighth of this month,” he wrote. “That will give you plenty of time. But as we don’t touch anywhere in Great Britain, you will have to join me here. Don’t take the long railway journey. Like a wise man, come as far as Aberdeen in the steamer, and then you have not much further to travel. As for the hundred pounds, I tell you, my dear fellow, that we don’t intend to be away longer than one year, and that is the precise figure I should name. But I’ll go on to add that if we should happen to be a little longer, you shall not be charged a shilling more. Persuade Mrs. Challoner to come north with you and to bring the boy, and then she’ll see you fairly aboard, and will note what snug quarters you’ll have, and be able to see you with her mind’s eye all the time you are away. My wife hopes she will come.” “You see there is no doubt that you are to go, Charlie,” said Mrs. Challoner. “Everything has worked to that end without one hitch. You are to go, because you are to come back strong and well. It is
  • 50. clearly the will of God that you go. I am so glad that my plans have been carried out beyond my own power. If it had all been my planning, I might have doubted afterwards.” “But, Lucy,” said the young husband, his pleasant frank face shining with the mysterious light which often illumines the countenances which have just been bravely turned to confront the darkness of the Valley of the Shadow of Death, “I own, with you, that it does seem to be God’s will, but we must not think so now, unless we can continue to think so whatever be the result. Let us say together solemnly, ‘God’s will be done: not our will, but His.’” And there was a little silence. Lucy resolutely refused to consider the Grants’ invitation to Peterhead. She warmly seconded their suggestion that Charlie should travel in the snug, well-attended saloon of the magnificent coasting steamer rather than in the train. She refused to listen to his plea, that unless she would come with him this mode of travel would part them a day or two sooner than if he went by rail. She even nerved herself to say that when a matter of a year’s separation was in hand, what could a day more or less signify? It would be best that they should part in their own home, where life could go straight on, and she could set to work at once. It would be dreadful to come back to the house alone. (“Pollie would have a welcome for you,” interposed young Challoner.) Besides, who knew whether she and little Hugh might not prove to be the very worst of sailors, and then Charlie would go off quite unhappy, thinking of the misery of their return journey. In all these arguments Lucy knew there was force and good sense, but she knew, too, that but for the secret knowledge that her whole household life was crumbling about her, they would not for one moment have sufficed to withhold her from clinging to her husband’s presence till the latest possible moment. 
“Now, Pollie,” she said to her servant, “I am able to tell you why I wish silence about your departure. It is finally arranged that my
  • 51. husband is going for a long sea-voyage. He will be away for about a year. When we first began to think of this it was a great consideration that I and the boy should have you—our household friend of seven years’ standing—to be with us. That thought was a great comfort. Now as we find this cannot be, I think we may save Mr. Challoner the distress of knowing about it before he goes. It might make him wish to postpone his going. And he ought to be off before the winter.” Pollie was sniffing. “I wish I’d known. I was huffed a bit thinking I wasn’t to be much consideration in any changes you was making.” “But what made you think of changes?” asked Mrs. Challoner. “This is no change. All will go on the same, though with your master away for a year, and then we shall hope that all will go on the same when he returns. It is only since this very morning, Pollie, that we have been quite sure this voyage can be arranged. You are the very first person I have told. We think those in a household, whose interests are naturally bound up together, should be the first to hear such news.” “Didn’t you mention it to Mrs. Brand, m’m?” asked Pollie. “Certainly not,” Lucy answered. “There was no use mentioning it to anybody when it might have come to nothing.” Lucy rather wondered at this question. It did not occur to her that Mrs. Brand had had the girl to herself on the afternoon when Lucy had gone to see Dr. Ivery, nor that it could have been she who had put the idea of “changes” into the head of her sister’s servant. And Pollie kept the secret, as servants often do, or it would go hard with many a gadding and gossipping “lady.” “You may trust me not to speak, m’m,” said poor Pollie, still sniffing. But she put such emphasis on “me” that it reminded her mistress that Mrs. Brand also knew of Pollie’s imminent departure and would be almost sure to blurt it out before Charlie. Lucy hated to ask her sister to keep a secret from him. It would be as useless, too, as
  • 52. painful, for Florence would be sure to “forget” or “not to think.” As such “forgetfulness” or “want of thought” always pleads “meaning no harm,” and resents indignation as harsh and uncharitable, there is nothing to be done but to prevent their harmful doings by keeping them out of the way. And while Lucy was pondering what steps she could take in this direction, she got warning that Charlie might get an impression of trouble of some sort and question the facts out of poor Pollie however loyally she might try to keep them back. For Pollie seemed unable to keep from “sniffing,” and when she went into the drawing-room the sight of Mr. Challoner brought on a very bad attack. “Why, what’s the matter, Pollie?” asked he, quite innocently. “No bad news from home, I hope?” “Oh, no, sir! Yes—oh, it’s a-thinking of you a-going away, sir!” said Pollie desperately, and rushed from the room. “I had no idea that Pollie cared a bit for me,” observed her master. “I thought she regarded me as a wage-paying machine, and that you and the boy were the idols. It is quite flattering to find that I came in for a bit of the adoration.” “Of course Pollie cares for you after living in the house with us for seven years, and you always so considerate and polite,” said Lucy. “Considerate and polite!” echoed Charlie. “Well, I do hope I’m not quite a brute in my own home, and I don’t know how many other fellows I’ve rebuked for calling their landladies’ servants ‘the slavey.’ And they’ve often said to me, ‘Well, but it’s true,’ and I say, ‘Then it oughtn’t to be true.’” “Of course it should not be true,” Lucy responded. “It is quite touching to think Pollie cares for my going away,” Charlie went on. “But I tell you, Lucy, it occurs to me that it is not my going that has grieved her, but the thought of your being left alone.” He paused for a moment. “She thinks you’ll be so dull,” he said, fearing lest his words might have brought to Lucy’s mind the idea they had
  • 53. wakened in his own—to wit, that probably Pollie regarded this temporary separation as likely to be for the earthly forever. “Well, I can only say again,” he went on, “that my greatest comfort is that she is with you. What a blessing we have not changed our servant perpetually as the Brands do! How could I go off and leave you with an utter stranger, who might desert you the next week?” “We never know what changes may come,” said Lucy, to whom silence began to seem criminal. “But we must trust God to provide for emergencies. They never are so bad as they look beforehand.” “That is quite true,” answered Charlie, “and that’s just how I feel—a special trial has come to us, and a special blessing is prepared for it in the shape of Pollie.” Lucy could endure no more. She jumped up and went out of the room so hurriedly that Charlie thought she must have heard a ring at the door-bell. She really went to little Hugh’s bedroom, and sat down in the darkness beside the cot where he was already asleep. She began to revolve schemes. She would get Charlie to go with her and the boy to spend the interval before his departure at the seaside. That would take him away in safety from Florence’s chatter and Pollie’s tears. It had other substantial recommendations, too, such as she could urge. It was highly desirable that before his great journey Charlie should shake off the little ways and weaknesses of invalidism as a “change” helps a convalescent to do. Then she would add what she knew would be a supreme argument with him—that her teaching duties at the Institute would begin at the Christmas quarter, and that she ought not to take up these labours when below par in nerve and health after her anxious nursing. She would plead, too, the charm of the little family of three being together quite by themselves in a strange place, where they would be safe from any calls or condolences or curiosity, and could wander about or rest, just at their own sweet will. 
Of course, this trip would cost a little money, but not very much, and apart from all its other charms, Lucy felt that it would soothe her own heart in the pain of having been forced to refuse to accompany her husband to his port of departure.
  • 54. “You are a funny little woman,” said her husband, when she went downstairs again and made these suggestions. “What else will you think out so cleverly? I shall like this of all things; and all the while I am away, it will be so much cheerier to have last thoughts of each other taking quiet holiday by the sea, than of each of us mewed up in a sick-room, coddling and being coddled.” “And I’ll be able to do two or three sketches,” Lucy went on. “I should like to do them with you looking on, to know if you think my hand has lost any of its cunning. It will get me up to the mark, too. I daresay I may do something that will more than pay for our trip.” “Never say that women are not practical!” laughed Mr. Challoner. “While I am only thinking of sentiment, the wifie has gone on to the shillings! But ah, Lucy dear, don’t think I don’t know that you want the shillings only for the sake of the sentiment!” They sat together hand in hand. They had been married seven years, and they were on the eve of separation. Both hearts were full of feelings to which they dared not give utterance. One must not stir a brimming cup lest it overflow. “I vote we go to Deal!” cried Charlie at last. “Isn’t it rather an east windy place for an invalid?” asked Lucy. “But I’m not an invalid, and am not going to pass as one,” he said gaily. “I’m a fellow starting on a sea-voyage! No, no, Lucy, don’t doom me to some sheltered cubby hole of a ‘resort,’ where half the population are in bath-chairs and the other half in respirators. It would give us the blues! If you’ll let us go to Deal, I’ll promise to be very good,” he went on with his indomitable boyishness. “I’ll only go out when you say I may, and I’ll come in the minute you say I must. 
Only let us go there!” In the depths of his heart lay the secret thought that to go to any place where consumptives are wont to congregate, would inevitably fill Lucy’s mind with dire forebodings, besides exposing her to the depressing influences of the conventional “sympathy” or forced “hopefulness” which emanate from well-meaning landladies and
  • 55. others trained by experience to regard their habitat as one of the last stages on life’s journey. All the next day Lucy hurriedly made her little preparations for the trip. She said to Charlie that, if Deal suited him, and if they got snug apartments, they might stay on till the very end, so that he need only use their own house to rest and sleep in on his way through London to the north. “As for any sea-going things you want—lockers, waterproofs, and so on—we can get them at Deal,” she said. Only when all was in readiness for their start, while the cab which was to take them to the station actually stood at the door, did she post a letter to Mrs. Brand, giving the first intimation of their present move and of Charlie’s future journey. “We have had it in view for some time,” she wrote, for it was impossible for Lucy Challoner to be inferentially untruthful, “but it was only decided the day before yesterday.” After the railway journey, whose slight fatigue the convalescent bore capitally, they went straight to an hotel and had lunch, and there Lucy left her husband and little Hugh, while she went in quest of “apartments.” She wanted cleanliness, economy, and a sea view. Like all people who know what they want she was not long in getting suited. She decided on the second set of rooms at which she looked, preferring them to the first, because being upstairs, they commanded a wider horizon. Also she felt attracted to the second landlady, a quiet, grave, middle-aged woman of few words, whose chambers, with their well-kept old-fashioned furniture had—what is the greatest charm of hired rooms—no suggestion of previous temporary occupancy. The landlady had everything made snug before their arrival; the curtains were drawn, a cheerful fire was ablaze, and the lit lamp stood in the centre of the table spread with pretty blue crockery and provided with ham, eggs and toast. There are few who can wholly resist the genial influence of such surroundings. Charlie and Lucy
  • 56. Challoner yielded themselves up to them, and little Hugh danced and clapped his hands. Lucy felt as if she was happier than she had ever thought to be again. Safe from the impending worries of the last few days, it seemed as if the great anxiety which hung like a Damocles’ sword over her life was for the time held off. “I believe this is really doing you good, Lucy,” said her husband. “For me, I feel a different man already.” The bed-chamber opened from the parlour, and Hugh was not allowed to be long in seeking the little cot which the landlady had fixed up for him in his parents’ room. But while Lucy passed to and fro unpacking and preparing for the night, Mr. Challoner and Hugh got behind the window curtains and shut themselves away from the cheery room and out with the misty sea view. Lucy could hear them talking behind the drapery. “There go the ships!” said the young father. “Look, Hugh, you can see them by their lights! Look what a lot of them there are! And how many lights they are showing!” “How glad the sailors must be to see land again!” lisped Hugh. “They must feel they are safe at last!” “Glad to feel they are nearly home at last, Hugh,” corrected his father. “For ships are in much more danger when they are near land than when they are out in mid-ocean. What looks safest isn’t always safe, my boy.” “I’d like to go on a ship!” said Hugh. “I daresay you will go in time, sonny,” returned Mr. Challoner. “By-and-by, Hughie, I am going on a big ship—a big ship with three masts—and I am going for a long, long voyage. And you’ll have to take care of mamma while I am away. And then when I come back, and you grow up, very likely you will go for some long voyage, and then I will stay at home and take care of mamma.” “Are you going to-morrow, papa?” said the little voice in an awed whisper, and Lucy heard a movement as if the curly head snuggled
  • 57. on papa’s shoulder. How good it was of Charlie to tell the child himself! The thought of having to do so had haunted her, for she measured her little lad’s love for his father by what she knew it meant in his life rather than by that childish inadequacy for profound emotion which makes a child such a poignantly pathetic figure when it appears on any tragic scene. “To-morrow!” echoed papa in his brightest tone. “No, indeed, not for many days—two or three weeks! We are going to have such a happy time. We’ll go out and pick up shells, and if there is a very warm sunshiny day with only little waves on the sea, maybe we’ll go out in a boat—that’s if mamma will come with us,” he added, remembering his promised obedience to her discretion. Hugh broke away from his father and ran back into the room. “Oh, mamma,” he cried, “you will let us go out in a boat, won’t you? If the day’s sunshiny—an’ it’s sure to be—and if the waves are ever such little teeny weenies! Oh, mamma, yes!” “We shall see, Hugh. We will do what seems wise. It is time you went to bed.” In commanding her voice to be steady, it sounded sharp and hard. It checked Hugh’s ecstasy, and brought his father out from behind the curtain. She felt that Charlie’s expression was surprised, and that she would break down utterly if she had to meet his eyes. Without looking back, she caught the hand of the silenced and awed Hugh, and hurried him away to the other room. Neither of them spoke while she helped the child undress. Even her eyes did not answer his, though she saw his blue orbs raised wistfully. He knelt down and said his little prayer, the “Our Father,” and the little verse of godly nursery tradition—
  • 58. “Now I lay me down to sleep, I pray the Lord my soul to keep. If I die before I wake, I pray the Lord my soul to take.” Then he passed on to the extempore petitions, in which he was always allowed free expression. To-night they came with unwonted faltering and hesitation. The child-soul was aware of a disturbed atmosphere around it—of groping somehow in darkness uncertain of itself. “Please, God, bless dear papa—and dear mamma—and dear Pollie—and dear Aunt Florence—and—and dear Mr. Brand—and dear Mrs. May—and make me a good boy. Amen.” “Dear Mrs. May?” Who was that? Lucy had to pause for a moment ere she remembered that this was the name of the landlady, whom the child had seen for the first time two hours before, and who had won his heart by bringing in for him a special tea plate painted with a picture of Walmer Castle! It was only as Hugh stood in his little night-gown, half stepping into the cot, that he said, almost with a whimper— “I never kissed good night to papa.” “Then run away and kiss him now,” said Lucy in her natural tone. Hugh was himself again in a second, scampering away, kicking aside his flowing white robes with his little pink feet, and bestowing upon his father what was evidently an ecstasy of hugs, accompanied by a perfect storm of hearty “smacks.” Then he gallopaded back, hopped into bed, held out his arms to his mother, and clasped her down to himself in a rapturous embrace, to which she responded with an added tenderness born of a little remorse for the foolish pang he had given her.
  • 59. “But you will let us go in the boat?” he whispered before he released her. She kissed him again as her only answer, and went back into the sitting-room. Her husband looked up at her with some solicitude, and drew up a chair for her at his side. “I’m afraid you have been very much overwrought, Lucy,” he said. “It’s no use saying ‘No.’ I can hear it in your voice. When you went out of the room, I thought you were actually crying. I was quite uneasy till I saw you come in again all right.” “I’m afraid I’m not fit to be your nurse if I frighten my patient,” she said, forcing a smile. “I was very silly. I was not crying. I’m rather afraid I was cross for a minute.” “Cross?” questioned her husband incredulously. “Yes,” she answered. “It vexed me to think how soon Hugh forgot about your going away and thought only of getting a sail in a boat.” Charlie pondered for a second, for the whole thing had escaped him. “I know I was foolish,” she said. “Hughie is only a little child, and cannot realise things. I’m sure he would have begun to cry if you’d said you were going away to-morrow. But when you said ‘not for two or three weeks’ he could put it right out of his head. It’s only childlike, after all.” “But we ought all to be childlike, ought we not, Lucy?” answered Charlie thoughtfully. “And we are, more or less, even the worst of us. All who love know they will have to part; but they don’t go on thinking about it all the years they are together. And days are as long as years to poor little Hughie.” “And then he was so taken up about going in the boat!” said Lucy, with a half-reluctant smile. “Well, and why not?” asked Charlie undauntedly, “We ought to be like that, too—taken up with what is our present business—this is a great business for Hughie—and especially with what we may bring
  • 60. about by our own efforts, as he felt he might by his coaxings. That is our affair—not something that is going to happen some time or other, without any help of ours.” “I know I’m an idiot,” said Lucy humbly; “but so much seemed to come at once! He actually prayed in the same breath for you and for Mrs. May—the strange landlady downstairs,” she explained. And she reminded him of the little incident of the picture plate. Charlie leaned back in his chair and enjoyed a quiet deep laugh. “And there he is altogether right,” he said, “for love is all off the same piece whether it’s in a great fold that ties two lives together— like yours and mine, Lucy—or but some little scrap that just binds up a pricked finger. For God is Love, and therefore Love is God, and any affection that gets taken outside that unity is—just—an idol!” “‘And the idols He shall utterly abolish,’” he added after a moment’s pause. As he spoke, he drew back the curtain. The moon was up, silvering the mist that hung low over the Channel. They sat side by side in silence. Lucy was trying to gather from her husband’s words some cheer for the one trial she could not feel it right to confide to him—the only secret she had ever withheld from him. After a fashion of which most of us have had pathetic experience, she strove to get an oracle at a venture. “It comforts me so to talk to you,” she said. “You smooth things out. Worries will come, and jarrings. What shall I do when you are not here to say good words to me?” “You will say them to yourself,” he answered. “You will hear them in your own heart. Sometimes, indeed, it seems to me as if I merely hear your thoughts and put them into words for you.” (To be continued.)
  • 61. OUR PUZZLE REPORT: A WELL-BRED GIRL (No. 2).
SOLUTION.
A Well-Bred Girl (No. 2).
1. A well-bred girl always makes herself pleasant to those about her, especially to the lonely and unhappy.
2. A well-bred girl always dresses without extravagance, and yet avoids severity.
3. A well-bred girl always eats and drinks noiselessly, not even excepting soup.
4. A well-bred girl always refrains from discussing articles of diet during meals.
5. A well-bred girl always talks quietly.
6. A well-bred girl always upholds her own dignity without apparent effort.
7. A well-bred girl always remembers that striking manners are bad manners.
56, Paternoster Row, Dec., 1898.
Prize Winners.
Six Shillings and Eightpence Each.
  • 62. Eliza Acworth, Blenheim Mount, Bradford. Dora Mary Barling, Farnham, Surrey. M. A. C. Crabb, Hemel Hempstead, Herts. Louie Drury, Edith Road, West Kensington. Alfred J. Knight, Edith Road, West Kensington. F. Miller, Brecknock Road, N. Mrs. Nicholls, Parlors Hall, Bridgnorth. Lilla Patterson, Kilmore, Holywood. Gertrude M. Stott, Yarburgh Street, Alexandra Park, Manchester. Norah M. Sullivan, Otranto Place, Sandycove, Co. Dublin. Emily M. Tattam, Green Lanes, N. Bettie Temple, Brockley, S.E. Ethel Tomlinson, Burton-on-Trent. Ada G. Waide, Methley, Leeds. Jeanie Walker, Esk Terrace, Whitby. Eleanor Whitcher, The Drive, West Brighton. Very Highly Commended. Ethel B. Angear, Lily Belling, E. Blunt, Margaret E. Bourne, Annie J. Cather, M. J. Champneys, Maggie Coombes, Minnie Cornwell, Leonard Duncan, Mrs. Fleming, A. and F. Fooks, Miss Fryer, Margaret S. Hall, L. A. E. Hartshorn, Helen Jones, Alice M. Kellett, B. M. Linington, E. E. Lockyear, Annie Manderson, Rev. H. Milnes, S. H. Phillips, Robina Potts, Mrs. Prestige, Ada Rickards, John Rodway, Emma M. Sanderson, Helen Simpson, Mildred M. Skrine, Helen Smith, Annie Stanser, Ellen C. Tarrant, Agnes M. Vincent, Anna Walker, N. H. White, Emily M. P. Wood, Elizabeth Yarwood, Edith M. Younge, Helen B. Younger, Euphemia T. Yule. Highly Commended. Mrs. Allen, Margaret M. Anthony, Emily Bergin, Alice Mary Blake, Ada K. Bullough, Martha Cairns, Robert H. Carmichael, Mrs. F. Chettle, Dora Clarke, Alice M. Cooper, Lillian Clews, Mrs. Crossman, E. M. Dickson, Mrs. F. Farrar, Florence Graves, Marie E. Hancock, Ellie Hanlon, Lizzie J. Hetherington, Edith L. Howse, M. A. J. Hunter, E.
  • 63. Marian Jupe, Annie G. Luck, Alice Luckhurst, Jennie M. McCall, Ethel C. McMaster, Helen A. Manning, Geo. H. Manning, S. Mason, Mrs. A. Motum, E. K. Palmer, Hilda Petley, Hannah E. Powell, Ellen M. Price, Helen J. Ransom, H. F. Richards, Henzell G. Robson, Chas. Severs, A. A. L. Shave, Agnes A. M. Shearer, Merriott T. Smiley, Gertrude Stirling, M. Stuart, Theodore J. Tasker, Constance Taylor, Marie Threlfall, Violet C. Todd, Queenie Tyssen, Mary Watts, Alice Woodhead, H. F. Yeoman. EXAMINERS’ REPORT. The thirst for information continues, and again we have been inundated with solutions. Happily for our peace of mind, many of them were far from perfect, and the more subtle methods of our Art have not been called into requisition. In short, the prize solutions were perfect; the rest were not. There were some points about the puzzle which deserved rather more attention than the casual solver was inclined to bestow, and it seems to be necessary to refer to them in detail. First let us deal with the supposed mistakes. One solver, with admirable conciseness, thus calls attention to them— “Three mistakes, line 1 an h too many; line 2 an h too few; line 6 an s too many.” Now, taking lines 1 and 2 together we find h × 2 ÷ erself + pleasant tot. That worked out (on somewhat doubtful mathematical lines) yields “h|erself pleasant to t|h”; then we find ose about h, and all is as straightforward as possible. So much for “mistakes” one and two. And the third “mistake” is no worse, for the correct reading of the part referred to is not “es minus ss” but “seven es minus ss,” which introduces the word “even” into the solution. The number of expert solvers who failed at this point was quite astonishing.
  • 64. In very many solutions the word around was substituted for about in the first sentence. It could hardly be regarded as a bad mistake; but, inasmuch as the letters o s e were on only three sides of the h, about had to be regarded as the more perfect reading. The next point to be noticed is the omission in many solutions of the word always in the second admonition. But the more careful solvers noticed that in this instance five stars were employed, and rightly divined the meaning of the fifth. The fact that the word always appeared in every other sentence ought to have opened the eyes of those who fell into the trap. The “noise less ly” in the third admonition also gave much trouble, and various quaint solutions were suggested, as, for instance, “leisurely” and “sparingly.” These two solutions, it should be noted, were generally associated with the particular kind of soup indicated in the puzzle:— “A well-bred girl always eats and drinks sparingly (or, leisurely) not even excepting mock-turtle soup.” It would not have occurred to us that active greediness in the presence of that particular delicacy was sufficiently usual to call for rebuke. Most unhappily, the fault we do indicate is quite as common as it is unpleasant. The die in the fourth admonition was generally identified; but a few solvers could make nothing of it excepting weight. A weight naturally suggests a balance, and accordingly we learn that “A well-bred girl always refrains from balancing articles of weight during meals.” We believe she does, though experiments of a kindred nature with the lighter articles at hand are not wholly unknown to us. They seldom prove much, excepting the clumsiness of the experimenter, and they do not conduce to that repose which is the essence of refinement. 
One would have thought that the fifth sentence was simplicity itself; but many solvers wrote “speaks” instead of “talks.” It must have required a violent effort of imagination to convert the depicted stalks into “sspeaks”!
  • 65. In the solutions of the last sentence “rough” continually appeared instead of “striking.” It could not be regarded as a satisfactory interpretation of the picture which is obviously T striking M. One most interesting reading of the sentence deserves to be recorded— “A well-bred girl always remembers that boys’ manners are bad manners”! Truly there are many girls whose conduct towards their brothers seems to be based upon some such theory as this. They are not the most pleasing type of maidens, and as to our opinion of their “good breeding,” let us add an eighth admonition— “A well-bred girl never nags.”
  • 66. SHEILA. A STORY FOR GIRLS. By EVELYN EVERETT-GREEN, Author of “Greyfriars,” “Half-a-dozen Sisters,” etc. CHAPTER II. UNCLE TOM. He came from the darkness without into the warmth and brightness of the hall, and threw back his heavy Inverness cape, revealing a square, bearded face, a broad, well-knit figure, and a pair of shrewd and not unkindly brown eyes. “You are our Uncle Tom,” said Oscar, going forward to meet him. “We are very glad to see you. It is kind of you to come.” “Well, well, boy, duty is duty all the world over. I would have come a fortnight ago, but it was impossible. No disrespect meant to your father, you understand. So you are poor Maud’s children, are you? We always called her ‘poor Maud’ at home, though I scarcely know why. She was happy enough, I know, but she seemed like one dead to us somehow. You are a bit like what she was as a girl, I can see. Perhaps the sister favours her more,” and he looked across at Sheila, who now came forward with outstretched hand. “How do you do, Uncle Tom? I hope you are not very cold. It has been quite warm till yesterday, and then the cold came back. We are