STAT 714 HOMEWORK 5
1. Consider the general linear model Y = Xβ + ε, where E(ε) = 0 and cov(ε) = σ²I. Suppose that X is n × p with rank r ≤ p. Let β̂ = (X′X)⁻X′Y denote a least squares estimator of β.
(a) Find E(β̂) and cov(β̂).
(b) Do your results in part (a) change when cov(ε) = σ²V, where V ≠ I?
(c) Do your answers in part (a) change when r = p?
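Optional numerical check (not part of what you must turn in): the sketch below simulates from an arbitrary rank-deficient design, uses the Moore-Penrose inverse as one choice of generalized inverse (X′X)⁻, and compares the Monte Carlo mean and covariance of β̂ with (X′X)⁻X′Xβ and σ²(X′X)⁻X′X[(X′X)⁻]′. The design matrix, β, and σ² are illustrative only.

import numpy as np

rng = np.random.default_rng(0)

# Rank-deficient design: one-way ANOVA style, n = 6, p = 3, rank 2 (illustrative)
X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.],
              [1., 0., 1.]])
beta = np.array([1.0, 2.0, -1.0])     # arbitrary "true" beta
sigma2 = 0.5

G = np.linalg.pinv(X.T @ X)           # one generalized inverse (X'X)^-
H = G @ X.T @ X                       # E(beta_hat) = H @ beta

B = 20000
est = np.empty((B, 3))
for b in range(B):
    Y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=6)
    est[b] = G @ X.T @ Y              # beta_hat = (X'X)^- X'Y

print("Monte Carlo mean: ", est.mean(axis=0))
print("Theoretical mean: ", H @ beta)
print("Monte Carlo cov:\n", np.cov(est, rowvar=False))
print("Theoretical cov:\n", sigma2 * G @ (X.T @ X) @ G.T)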
2. Consider the regression model Yi = β0 + β1xi + β2(3xi² − 2) + εi, for i = 1, 2, 3, where x1 = −1, x2 = 0, and x3 = 1.
(a) Put this model into Y = Xβ + ε form.
(b) Find the least-squares estimates of β0, β1, and β2.
(c) Show that the least-squares estimates of β0 and β1 are unchanged if β2 = 0. Why do you think this happens?
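Optional check of parts (b) and (c), as a sketch only: the response values y below are made up for illustration; only the design, built from x1 = −1, x2 = 0, x3 = 1 and the regressor 3x² − 2, comes from the problem.

import numpy as np

x = np.array([-1.0, 0.0, 1.0])
y = np.array([2.0, 1.0, 3.0])                    # illustrative data, not from the problem

# Full design: columns 1, x, 3x^2 - 2
X = np.column_stack([np.ones(3), x, 3 * x**2 - 2])
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Reduced design with beta2 = 0: columns 1, x
X0 = X[:, :2]
beta_red, *_ = np.linalg.lstsq(X0, y, rcond=None)

print("full fit    (b0, b1, b2):", beta_full)
print("reduced fit (b0, b1):    ", beta_red)
print("X'X =\n", X.T @ X)                        # diagonal, so the estimates decouple

The diagonal X′X is the hint for part (c): the three columns of the design are mutually orthogonal.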
3. Consider an experiment to study the effect of baking time, x, on the breaking strength of a ceramic, Y. The following eight data values were obtained:

x    2            6            8
y    15, 20, 25   21, 25, 29   33, 37

(a) Consider the cell means model Yij = µi + εij, for i = 1, 2, 3 and j = 1, 2, ..., ni, where n1 = n2 = 3 and n3 = 2. Put this model into Y = Xβ + ε form, where β = (µ1, µ2, µ3)′.
(b) Consider the model Yij = β0 + β1xi + β2xi² + εij, where x1 = 2, x2 = 6, and x3 = 8. Write this model as Y = Wγ + ε, where γ = (β0, β1, β2)′.
(c) Show that C(X) = C(W). What does this imply about the fitted values and residuals for these two models?
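Optional numerical companion to part (c) (a sketch, not the requested algebraic argument): build X and W for the eight observations and confirm that the two perpendicular projection matrices coincide, which forces identical fitted values and residuals.

import numpy as np

y = np.array([15., 20., 25., 21., 25., 29., 33., 37.])
x = np.array([2., 2., 2., 6., 6., 6., 8., 8.])

# Cell means design: three columns of group indicators
X = np.column_stack([(x == 2).astype(float),
                     (x == 6).astype(float),
                     (x == 8).astype(float)])

# Quadratic regression design
W = np.column_stack([np.ones(8), x, x**2])

PX = X @ np.linalg.pinv(X)      # ppm onto C(X)
PW = W @ np.linalg.pinv(W)      # ppm onto C(W)

print("projection matrices equal?", np.allclose(PX, PW))
print("fitted values:", PX @ y)
print("residuals:    ", y - PX @ y)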
4. Consider the general linear model Y = Xβ + ε, where E(ε) = 0 and cov(ε) = σ²I. Let Ŷ denote the vector of least squares fitted values and ê denote the vector of least squares residuals. Compute
(a) E(Ŷ)
(b) cov(Ŷ)
(c) E(ê)
(d) cov(ê)
(e) cov(Ŷ, ê).
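Optional simulation check of (b), (d), and (e), with an arbitrary full-rank design and σ²: the empirical covariances should be close to σ²PX, σ²(I − PX), and the zero matrix.

import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 5, 2.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])   # illustrative design
beta = np.array([1.0, 0.5])
PX = X @ np.linalg.pinv(X)

B = 20000
fits, resids = np.empty((B, n)), np.empty((B, n))
for b in range(B):
    Y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    fits[b] = PX @ Y
    resids[b] = Y - fits[b]

print("cov(Yhat)  ~ sigma^2 PX?      ",
      np.allclose(np.cov(fits, rowvar=False), sigma2 * PX, atol=0.1))
print("cov(ehat)  ~ sigma^2 (I - PX)?",
      np.allclose(np.cov(resids, rowvar=False), sigma2 * (np.eye(n) - PX), atol=0.1))
# Cross-covariance between fitted values and residuals (should be ~ 0)
cross = (fits - fits.mean(0)).T @ (resids - resids.mean(0)) / (B - 1)
print("cov(Yhat, ehat) ~ 0?          ", np.allclose(cross, 0, atol=0.1))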
5. The observed tension, Y, in a nonextensible string required to maintain a body of unknown weight, w, in equilibrium on a smooth inclined plane of angle θ, 0 < θ < π/2, is a random variable with mean E(Y) = w sin θ. For n known values θ1, θ2, ..., θn, set by the experimenter and a given body, the observed data are Y1, Y2, ..., Yn.
(a) Find ŵ, the least squares estimator of w, the weight of this body.
(b) Compute E(ŵ) and var(ŵ). You may assume that Y1, Y2, ..., Yn are independent.
(c) Let Ŷ1, Ŷ2, ..., Ŷn denote the least squares fitted values. Is it necessarily true that Σ_{i=1}^{n} (Yi − Ŷi) = 0? Explain.
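Optional sketch: E(Yi) = w sin θi is a regression through the origin on the single regressor sin θi, so one candidate estimator is ŵ = Σ Yi sin θi / Σ sin²θi (compare with your part (a) answer). The angles, w, and noise level below are made up; the final line speaks to part (c).

import numpy as np

rng = np.random.default_rng(2)
theta = np.array([0.2, 0.5, 0.9, 1.2])          # illustrative angles in (0, pi/2)
w_true = 3.0
Y = w_true * np.sin(theta) + rng.normal(scale=0.1, size=theta.size)

s = np.sin(theta)                               # the single "design column"
w_hat = (Y @ s) / (s @ s)                       # least squares through the origin

fitted = w_hat * s
print("w_hat:", w_hat)
print("sum of residuals:", np.sum(Y - fitted))  # generally nonzero: no intercept column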
6. Define the matrices

X = [1 1 0 0        W = [1 1
     1 1 0 0             1 1
     1 0 1 0             1 0
     1 0 0 1]            1 0]

Take Y = (1, 0, 1, 2)′.
(a) Show that C(W) ⊂ C(X).
(b) Express Y as the sum of two vectors: one in C(X) and one in N(X′).
(c) Compute the ppm onto C(W)⊥_{C(X)}, the orthogonal complement of C(W) with respect to C(X), and then project Y onto this space.
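Optional numeric sanity check for parts (b) and (c): since C(W) ⊂ C(X), the ppm onto the orthogonal complement of C(W) with respect to C(X) is PX − PW. This is a verification sketch, not a substitute for the requested derivations.

import numpy as np

X = np.array([[1., 1., 0., 0.],
              [1., 1., 0., 0.],
              [1., 0., 1., 0.],
              [1., 0., 0., 1.]])
W = np.array([[1., 1.],
              [1., 1.],
              [1., 0.],
              [1., 0.]])
Y = np.array([1., 0., 1., 2.])

PX = X @ np.linalg.pinv(X)
PW = W @ np.linalg.pinv(W)

# (b): Y = PX Y + (I - PX) Y, with the pieces in C(X) and N(X')
print("piece in C(X):  ", PX @ Y)
print("piece in N(X'): ", (np.eye(4) - PX) @ Y)

# (c): ppm onto the complement of C(W) within C(X) is PX - PW (since C(W) is inside C(X))
M = PX - PW
print("M = PX - PW:\n", np.round(M, 4))
print("projection of Y onto this space:", M @ Y)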
7. Consider the simple linear regression model in Section 2.3 (notes).
(a) Show algebraically that PX and PW are both equal to the n × n matrix whose (i, j)th entry is

1/n + (xi − x̄)(xj − x̄)/Σ_k(xk − x̄)².

In regression analysis, this matrix is called the hat matrix.
(b) Compute the trace of this matrix.
(c) Use the ceramic data from Problem 3 and fit both the centered and uncentered simple linear regression models. Report least squares estimates for both models. Also, show that the fitted values and residuals are the same for both fits.
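Optional numerical companion to parts (a) and (b), using the ceramic x-values: build the matrix entrywise from the displayed formula, compare it with X(X′X)⁻¹X′ for the uncentered design, and check that its trace is 2. A sketch only; the problem asks for the algebra.

import numpy as np

x = np.array([2., 2., 2., 6., 6., 6., 8., 8.])
n = x.size
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)

# Entrywise formula: H[i, j] = 1/n + (x_i - xbar)(x_j - xbar)/Sxx
H = 1.0 / n + np.outer(x - xbar, x - xbar) / Sxx

# Hat matrix from the uncentered design (1, x)
X = np.column_stack([np.ones(n), x])
PX = X @ np.linalg.inv(X.T @ X) @ X.T

print("entrywise formula matches X(X'X)^{-1}X'?", np.allclose(H, PX))
print("trace:", np.trace(H))     # equals 2, the number of columns of X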
8. Consider two linear models for the same data:

Model 1: Y = Xβ + ε
Model 2: Y = Wγ + ε.

Here, X is n × p, W is n × q, β is p × 1, and γ is q × 1. Suppose that C(W) ⊂ C(X). For Model 1, let ŶX, êX, and PX denote the vector of (least-squares) fitted values, the vector of (least-squares) residuals, and the perpendicular projection matrix onto C(X). The quantities ŶW, êW, and PW are defined analogously.
(a) Show that W = XC, for some p × q matrix C.
(b) Show that PXPW = PW.
(c) Show that (ŶX − ŶW)′ŶW = 0.
(d) Show that Y′Y = ŶW′ŶW + (ŶX − ŶW)′(ŶX − ŶW) + êX′êX.
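These are proof exercises, but part (d) can be spot-checked numerically. The sketch below reuses the X, W, and Y of Problem 6 (which satisfy C(W) ⊂ C(X)) and confirms that Y′Y equals the three-term sum.

import numpy as np

X = np.array([[1., 1., 0., 0.],
              [1., 1., 0., 0.],
              [1., 0., 1., 0.],
              [1., 0., 0., 1.]])
W = np.array([[1., 1.],
              [1., 1.],
              [1., 0.],
              [1., 0.]])
Y = np.array([1., 0., 1., 2.])

PX = X @ np.linalg.pinv(X)
PW = W @ np.linalg.pinv(W)

YX, YW = PX @ Y, PW @ Y           # fitted values under each model
eX = Y - YX                       # Model 1 residuals

lhs = Y @ Y
rhs = YW @ YW + (YX - YW) @ (YX - YW) + eX @ eX
print("Y'Y =", lhs, "  sum of three pieces =", rhs)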
9. The effectiveness of three skin creams was studied in an experiment on s subjects. On the forearm of each subject, three locations were specified. The three creams were randomly allocated to locations on each subject; that is, each subject received a complete set of three treatments. It is assumed that the three observations on the same individual are correlated and that observations on different subjects are uncorrelated. A statistical model for this experiment is

Yij = µi + εij,

where Yij denotes the ith measurement on subject j and µi denotes the mean response for the ith skin cream. Assume that εij, for i = 1, 2, 3 and j = 1, 2, ..., s, are random variables with E(εij) = 0, var(εij) = σ², and corr(εij, εi′j) = ρ, for i ≠ i′. Note that corr(εij, εij′) = 0 when j ≠ j′ (regardless of i) because εij and εij′ correspond to different subjects.
(a) Assuming that µi is fixed (not random), express this model as Y = Xβ + ε. Define all vectors and matrices. Your design matrix X should be full rank.
(b) Compute cov(Y).
(c) Find β̂, the least-squares estimator of β. Your answer should be a vector (I don’t want to see matrices in your final answer).
(d) Compute cov(β̂).
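Optional sketch for parts (b)–(d). It assumes Y is stacked subject by subject, i.e., Y = (Y11, Y21, Y31, Y12, ..., Y3s)′, so cov(Y) is block diagonal with s identical 3 × 3 compound-symmetric blocks; the values of s, σ², and ρ are arbitrary. Compare the printed cov(β̂) with your part (d) answer.

import numpy as np

s, sigma2, rho = 4, 2.0, 0.3                      # illustrative values
Sigma_subj = sigma2 * ((1 - rho) * np.eye(3) + rho * np.ones((3, 3)))
V = np.kron(np.eye(s), Sigma_subj)                # cov(Y) for subject-by-subject stacking

# Full-rank cell means design: the row for (cream i, subject j) picks out mu_i
X = np.kron(np.ones((s, 1)), np.eye(3))           # 3s x 3

# Ordinary least squares: beta_hat = (X'X)^{-1} X'Y = A Y
A = np.linalg.inv(X.T @ X) @ X.T
print("cov(beta_hat) =\n", A @ V @ A.T)
print("Sigma_subj / s =\n", Sigma_subj / s)       # same matrix: (sigma2/s)[(1-rho)I + rho J]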
10. Let PX denote the perpendicular projection matrix onto C(X).
(a) Give a detailed argument showing that I − PX is the perpendicular projection matrix onto N(X′).
(b) Let

X = [1 1 0
     1 1 0
     1 0 1
     1 0 1
     1 0 1].

Compute PX and I − PX.
(c) Express Y = (1, 2, 3, 4, 5)′ as the sum of two vectors, one in C(X) and one in N(X′).
(d) For the X matrix in part (b), describe in words what C(X) and N(X′) are.
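Optional numeric check for parts (b) and (c): compute PX via a generalized inverse, then split Y = (1, 2, 3, 4, 5)′ into PXY ∈ C(X) and (I − PX)Y ∈ N(X′).

import numpy as np

X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.],
              [1., 0., 1.]])
Y = np.array([1., 2., 3., 4., 5.])

PX = X @ np.linalg.pinv(X)            # ppm onto C(X): averages within the two groups
print("PX =\n", np.round(PX, 4))
print("I - PX =\n", np.round(np.eye(5) - PX, 4))
print("piece in C(X):  ", PX @ Y)     # (1.5, 1.5, 4, 4, 4)
print("piece in N(X'): ", Y - PX @ Y)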
11. Consider the general linear model Y = Xβ + ε, where E(ε) = 0. Let M = X(X′X)⁻X′ denote the perpendicular projection matrix onto C(X) and denote by ê the vector of residuals obtained from the least squares fit. Prove that β̂ is a least squares estimate of β if and only if ê ⊥ C(X).
12. Consider the linear regression model Y = Xβ + ε, where X is n × p with p = k + 1, and k is the number of independent variables in the model (the model also includes an intercept term). Assume that E(ε) = 0 and cov(ε) = σ²I. Let M denote the perpendicular projection matrix onto C(X), let J denote an n × n matrix of ones, and let β̂ denote the (unique) OLS estimator.
(a) Show that Y′(M − n⁻¹J)Y = β̂′X′Y − nȲ², where Ȳ is the sample mean of Y1, Y2, ..., Yn.
(b) Prove that r(M − n⁻¹J) = k.
(c) Let ê denote the vector of residuals from the OLS fit. Find E(ê) and cov(ê). Do the least squares residuals have constant variance?
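Optional numerical spot check of (a) and (b), with a small made-up regression dataset (n = 6, k = 2):

import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # intercept plus k regressors
Y = rng.normal(size=n)

M = X @ np.linalg.inv(X.T @ X) @ X.T
J = np.ones((n, n))
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

lhs = Y @ (M - J / n) @ Y
rhs = beta_hat @ X.T @ Y - n * Y.mean() ** 2
print("Y'(M - J/n)Y:", lhs, "  beta'X'Y - n*Ybar^2:", rhs)
print("rank(M - J/n):", np.linalg.matrix_rank(M - J / n), " (should equal k =", k, ")")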
TERMINOLOGY: The Gram-Schmidt procedure is a method for orthonormalizing a set of basis vectors. Let V be a vector space with basis {u1, u2, ..., ur}. For s = 1, 2, ..., r, define inductively

v1 = u1/||u1||
ws = us − Σ_{i=1}^{s−1} (us′vi)vi
vs = ws/||ws||.

Then {v1, v2, ..., vr} is an orthonormal basis for V, where vs ∈ span{u1, u2, ..., us}.
13. Consider the vector space V = R³ and the vectors u1 = (1, 1, 1)′, u2 = (0, 1, 1)′, and u3 = (0, 0, 1)′. Show that {u1, u2, u3} is a basis for V. Then use the Gram-Schmidt procedure to orthonormalize the basis.
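Optional sketch: a minimal implementation of the Gram-Schmidt recursion from the terminology box, applied to u1, u2, u3 from this problem (the problem of course wants the hand computation).

import numpy as np

def gram_schmidt(U):
    """Columns of U form a basis; return an orthonormal basis as columns."""
    V = []
    for s in range(U.shape[1]):
        w = U[:, s].astype(float)
        for v in V:                          # w_s = u_s - sum (u_s' v_i) v_i
            w = w - (U[:, s] @ v) * v
        V.append(w / np.linalg.norm(w))      # v_s = w_s / ||w_s||
    return np.column_stack(V)

U = np.array([[1., 0., 0.],
              [1., 1., 0.],
              [1., 1., 1.]])                 # columns are u1, u2, u3
V = gram_schmidt(U)
print("orthonormal basis (columns):\n", np.round(V, 4))
print("V'V = I?", np.allclose(V.T @ V, np.eye(3)))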
14. Let o1, o2, ..., or be an orthonormal basis for C(X) and O = (o1 o2 · · · or). Prove that OO′ = Σ_{i=1}^{r} oi oi′ is the perpendicular projection matrix onto C(X).
15. Consider the one-way ANOVA model Yij = µ + αi + εij, for i = 1, 2, ..., a and j = 1, 2, ..., ni, so that the design matrix is

X_{n×p} = [1_{n1}  1_{n1}  0_{n1}  · · ·  0_{n1}
           1_{n2}  0_{n2}  1_{n2}  · · ·  0_{n2}
             ⋮       ⋮       ⋮      ⋱      ⋮
           1_{na}  0_{na}  0_{na}  · · ·  1_{na}],

where p = a + 1 and n = Σ_i ni.
(a) Show that the perpendicular projection matrix onto C(X) is given by the n × n matrix

PX = Blk Diag(n_i⁻¹ J_{ni×ni}),
where J_{ni×ni} is the ni × ni matrix of ones and “Blk Diag” stands for “block diagonal.” For example, if a = 3, n1 = n2 = 2, and n3 = 3, then n = 7 and

PX = [1/2  1/2   0    0    0    0    0
      1/2  1/2   0    0    0    0    0
       0    0   1/2  1/2   0    0    0
       0    0   1/2  1/2   0    0    0
       0    0    0    0   1/3  1/3  1/3
       0    0    0    0   1/3  1/3  1/3
       0    0    0    0   1/3  1/3  1/3].
(b) We have seen that P1 = n⁻¹J_{n×n} is the perpendicular projection matrix responsible for removing the effects of the intercept term; in this model, the intercept term is µ. We have also seen that PX − P1 is the perpendicular projection matrix which projects Y onto the orthogonal complement of C(1) with respect to C(X), a subspace of dimension r(PX − P1) = a − 1. Show that

Y′(PX − P1)Y = Σ_{i=1}^{a} ni(Ȳ_{i+} − Ȳ_{++})².

This is called the corrected treatment (model) sum of squares.
(c) The quantity Y′(PX − P1)Y is useful. An uninteresting use of this quantity involves testing the hypothesis that all the αi’s are equal; i.e., testing H0 : α1 = α2 = · · · = αa. Informally, this is done by comparing the size of Y′(PX − P1)Y to the size of Y′(I − PX)Y, the residual sum of squares, while adjusting for the ranks of PX − P1 and I − PX; i.e., a − 1 and n − a. It is more useful (and more interesting) to break this quantity up into smaller pieces and test more refined hypotheses that correspond to the pieces. One way to do this is to break up Y′(PX − P1)Y into a − 1 components

Y′(PX − P1)Y = Y′M1Y + Y′M2Y + · · · + Y′M_{a−1}Y,

where MiMj = 0, for all i ≠ j, and M1, M2, ..., M_{a−1} are perpendicular projection matrices onto a − 1 orthogonal subspaces of C(PX − P1). The sums of squares Y′MiY, i = 1, 2, ..., a − 1, each have 1 degree of freedom and can be used to test orthogonal contrasts. Breaking up C(PX − P1) in this fashion can be done using the Gram-Schmidt orthonormalization procedure. Set M∗ = PX − P1. We now break up C(M∗) into a − 1 orthogonal subspaces. Let o1, o2, ..., o_{a−1} be an orthonormal basis for C(M∗). Note that, using Gram-Schmidt, o1 can be any normalized vector in C(M∗), o2 can be any normalized vector in C(M∗) orthogonal to o1, and so on. Set O = (o1 o2 · · · o_{a−1}). From Problem 14, we have

M∗ = OO′ = Σ_{i=1}^{a−1} oi oi′.

Take Mi = oi oi′. Then Mi is a perpendicular projection matrix in its own right and MiMj = 0, for i ≠ j, because of orthogonality. Finally, note that

Y′M∗Y = Y′(M1 + M2 + · · · + M_{a−1})Y = Y′M1Y + Y′M2Y + · · · + Y′M_{a−1}Y.
This demonstrates that the corrected model sum of squares Y′M∗Y can be written as the sum of a − 1 pieces, as claimed. Now, for specificity, take a = 3 and n1 = n2 = n3 = 2. Break up C(PX − P1) into a − 1 = 2 orthogonal subspaces. With your orthogonal subspaces (and their associated ppms, say, M1 and M2), verify that

Y′(PX − P1)Y = Y′M1Y + Y′M2Y,

using the observed data Y = (1, 0, 2, 1, 3, 4)′.
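Optional sketch of the requested verification with a = 3 and n1 = n2 = n3 = 2: form PX and P1, take an orthonormal basis of C(PX − P1) (here from the eigenvalue-one eigenvectors of PX − P1, although any orthonormal basis of that space, e.g. from Gram-Schmidt applied to contrast vectors, would do), and check that the two one-degree-of-freedom sums of squares add up.

import numpy as np

Y = np.array([1., 0., 2., 1., 3., 4.])
n, a = 6, 3

# One-way ANOVA design (overparameterized: intercept plus three group indicators)
X = np.column_stack([np.ones(n),
                     np.repeat(np.eye(a), 2, axis=0)])
PX = X @ np.linalg.pinv(X)
P1 = np.ones((n, n)) / n
Mstar = PX - P1                       # ppm onto the complement of C(1) within C(X), rank 2

# Any orthonormal basis o1, o2 of C(Mstar) works; take eigenvectors with eigenvalue 1
vals, vecs = np.linalg.eigh(Mstar)
O = vecs[:, vals > 0.5]               # the two eigenvectors spanning C(Mstar)
M1 = np.outer(O[:, 0], O[:, 0])
M2 = np.outer(O[:, 1], O[:, 1])

print("Y'(PX - P1)Y  =", Y @ Mstar @ Y)
print("Y'M1Y + Y'M2Y =", Y @ M1 @ Y + Y @ M2 @ Y)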