STAT 714 HOMEWORK 5
1. Consider the general linear model Y = Xβ + ε, where E(ε) = 0 and cov(ε) = σ²I. Suppose that X is n × p with rank r ≤ p. Let β̂ = (X′X)⁻X′Y denote a least squares estimator of β.
(a) Find E(β̂) and cov(β̂).
(b) Do your results in part (a) change when cov(ε) = σ²V, where V ≠ I?
(c) Do your answers in part (a) change when r = p?
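Optional numerical check (not part of what you must turn in): the sketch below simulates from an arbitrary rank-deficient design, uses the Moore-Penrose inverse as one choice of generalized inverse (X′X)⁻, and compares the Monte Carlo mean and covariance of β̂ with (X′X)⁻X′Xβ and σ²(X′X)⁻X′X[(X′X)⁻]′. The design matrix, β, and σ² are illustrative only.

import numpy as np

rng = np.random.default_rng(0)

# Rank-deficient design: one-way ANOVA style, n = 6, p = 3, rank 2 (illustrative)
X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.],
              [1., 0., 1.]])
beta = np.array([1.0, 2.0, -1.0])     # arbitrary "true" beta
sigma2 = 0.5

G = np.linalg.pinv(X.T @ X)           # one generalized inverse (X'X)^-
H = G @ X.T @ X                       # E(beta_hat) = H @ beta

B = 20000
est = np.empty((B, 3))
for b in range(B):
    Y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=6)
    est[b] = G @ X.T @ Y              # beta_hat = (X'X)^- X'Y

print("Monte Carlo mean: ", est.mean(axis=0))
print("Theoretical mean: ", H @ beta)
print("Monte Carlo cov:\n", np.cov(est, rowvar=False))
print("Theoretical cov:\n", sigma2 * G @ (X.T @ X) @ G.T)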
2. Consider the regression model Yi = β0 + β1xi + β2(3xi² − 2) + εi, for i = 1, 2, 3, where x1 = −1, x2 = 0, and x3 = 1.
(a) Put this model into Y = Xβ + ε form.
(b) Find the least-squares estimates of β0, β1, and β2.
(c) Show that the least-squares estimates of β0 and β1 are unchanged if β2 = 0. Why do you think this happens?
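Optional check of parts (b) and (c), as a sketch only: the response values y below are made up for illustration; only the design, built from x1 = −1, x2 = 0, x3 = 1 and the regressor 3x² − 2, comes from the problem.

import numpy as np

x = np.array([-1.0, 0.0, 1.0])
y = np.array([2.0, 1.0, 3.0])                    # illustrative data, not from the problem

# Full design: columns 1, x, 3x^2 - 2
X = np.column_stack([np.ones(3), x, 3 * x**2 - 2])
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Reduced design with beta2 = 0: columns 1, x
X0 = X[:, :2]
beta_red, *_ = np.linalg.lstsq(X0, y, rcond=None)

print("full fit    (b0, b1, b2):", beta_full)
print("reduced fit (b0, b1):    ", beta_red)
print("X'X =\n", X.T @ X)                        # diagonal, so the estimates decouple

The diagonal X′X is the hint for part (c): the three columns of the design are mutually orthogonal.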
3. Consider an experiment to study the effect of baking time, x, on the breaking strength of a ceramic, Y. The following eight data values were obtained:

x    2            6            8
y    15, 20, 25   21, 25, 29   33, 37

(a) Consider the cell means model Yij = µi + εij, for i = 1, 2, 3 and j = 1, 2, ..., ni, where n1 = n2 = 3 and n3 = 2. Put this model into Y = Xβ + ε form, where β = (µ1, µ2, µ3)′.
(b) Consider the model Yij = β0 + β1xi + β2xi² + εij, where x1 = 2, x2 = 6, and x3 = 8. Write this model as Y = Wγ + ε, where γ = (β0, β1, β2)′.
(c) Show that C(X) = C(W). What does this imply about the fitted values and residuals for these two models?
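Optional numerical companion to part (c) (a sketch, not the requested algebraic argument): build X and W for the eight observations and confirm that the two perpendicular projection matrices coincide, which forces identical fitted values and residuals.

import numpy as np

y = np.array([15., 20., 25., 21., 25., 29., 33., 37.])
x = np.array([2., 2., 2., 6., 6., 6., 8., 8.])

# Cell means design: three columns of group indicators
X = np.column_stack([(x == 2).astype(float),
                     (x == 6).astype(float),
                     (x == 8).astype(float)])

# Quadratic regression design
W = np.column_stack([np.ones(8), x, x**2])

PX = X @ np.linalg.pinv(X)      # ppm onto C(X)
PW = W @ np.linalg.pinv(W)      # ppm onto C(W)

print("projection matrices equal?", np.allclose(PX, PW))
print("fitted values:", PX @ y)
print("residuals:    ", y - PX @ y)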
4. Consider the general linear model Y = Xβ + ε, where E(ε) = 0 and cov(ε) = σ²I. Let Ŷ denote the vector of least squares fitted values and ê denote the vector of least squares residuals. Compute
(a) E(Ŷ)
(b) cov(Ŷ)
(c) E(ê)
(d) cov(ê)
(e) cov(Ŷ, ê).
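Optional simulation check of (b), (d), and (e), with an arbitrary full-rank design and σ²: the empirical covariances should be close to σ²PX, σ²(I − PX), and the zero matrix.

import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 5, 2.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])   # illustrative design
beta = np.array([1.0, 0.5])
PX = X @ np.linalg.pinv(X)

B = 20000
fits, resids = np.empty((B, n)), np.empty((B, n))
for b in range(B):
    Y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    fits[b] = PX @ Y
    resids[b] = Y - fits[b]

print("cov(Yhat)  ~ sigma^2 PX?      ",
      np.allclose(np.cov(fits, rowvar=False), sigma2 * PX, atol=0.1))
print("cov(ehat)  ~ sigma^2 (I - PX)?",
      np.allclose(np.cov(resids, rowvar=False), sigma2 * (np.eye(n) - PX), atol=0.1))
# Cross-covariance between fitted values and residuals (should be ~ 0)
cross = (fits - fits.mean(0)).T @ (resids - resids.mean(0)) / (B - 1)
print("cov(Yhat, ehat) ~ 0?          ", np.allclose(cross, 0, atol=0.1))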
5. The observed tension, Y, in a nonextensible string required to maintain a body of unknown weight, w, in equilibrium on a smooth inclined plane of angle θ, 0 < θ < π/2, is a random variable with mean E(Y) = w sin θ. For n known values θ1, θ2, ..., θn, set by the experimenter and a given body, the observed data are Y1, Y2, ..., Yn.
(a) Find ŵ, the least squares estimator of w, the weight of this body.
(b) Compute E(ŵ) and var(ŵ). You may assume that Y1, Y2, ..., Yn are independent.
(c) Let Ŷ1, Ŷ2, ..., Ŷn denote the least squares fitted values. Is it necessarily true that Σ_{i=1}^{n} (Yi − Ŷi) = 0? Explain.
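Optional sketch: E(Yi) = w sin θi is a regression through the origin on the single regressor sin θi, so one candidate estimator is ŵ = Σ Yi sin θi / Σ sin²θi (compare with your part (a) answer). The angles, w, and noise level below are made up; the final line speaks to part (c).

import numpy as np

rng = np.random.default_rng(2)
theta = np.array([0.2, 0.5, 0.9, 1.2])          # illustrative angles in (0, pi/2)
w_true = 3.0
Y = w_true * np.sin(theta) + rng.normal(scale=0.1, size=theta.size)

s = np.sin(theta)                               # the single "design column"
w_hat = (Y @ s) / (s @ s)                       # least squares through the origin

fitted = w_hat * s
print("w_hat:", w_hat)
print("sum of residuals:", np.sum(Y - fitted))  # generally nonzero: no intercept column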
6. Define the matrices

X = [1 1 0 0        W = [1 1
     1 1 0 0             1 1
     1 0 1 0             1 0
     1 0 0 1]            1 0]

Take Y = (1, 0, 1, 2)′.
(a) Show that C(W) ⊂ C(X).
(b) Express Y as the sum of two vectors: one in C(X) and one in N(X′).
(c) Compute the ppm onto C(W)⊥_{C(X)}, the orthogonal complement of C(W) with respect to C(X), and then project Y onto this space.
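Optional numeric sanity check for parts (b) and (c): since C(W) ⊂ C(X), the ppm onto the orthogonal complement of C(W) with respect to C(X) is PX − PW. This is a verification sketch, not a substitute for the requested derivations.

import numpy as np

X = np.array([[1., 1., 0., 0.],
              [1., 1., 0., 0.],
              [1., 0., 1., 0.],
              [1., 0., 0., 1.]])
W = np.array([[1., 1.],
              [1., 1.],
              [1., 0.],
              [1., 0.]])
Y = np.array([1., 0., 1., 2.])

PX = X @ np.linalg.pinv(X)
PW = W @ np.linalg.pinv(W)

# (b): Y = PX Y + (I - PX) Y, with the pieces in C(X) and N(X')
print("piece in C(X):  ", PX @ Y)
print("piece in N(X'): ", (np.eye(4) - PX) @ Y)

# (c): ppm onto the complement of C(W) within C(X) is PX - PW (since C(W) is inside C(X))
M = PX - PW
print("M = PX - PW:\n", np.round(M, 4))
print("projection of Y onto this space:", M @ Y)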
7. Consider the simple linear regression model in Section 2.3 (notes).
(a) Show algebraically that PX and PW are both equal to the n × n matrix whose (i, j)th entry is

1/n + (xi − x̄)(xj − x̄)/Σ_k(xk − x̄)².

In regression analysis, this matrix is called the hat matrix.
(b) Compute the trace of this matrix.
(c) Use the ceramic data from Problem 3 and fit both the centered and uncentered simple linear regression models. Report least squares estimates for both models. Also, show that the fitted values and residuals are the same for both fits.
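Optional numerical companion to parts (a) and (b), using the ceramic x-values: build the matrix entrywise from the displayed formula, compare it with X(X′X)⁻¹X′ for the uncentered design, and check that its trace is 2. A sketch only; the problem asks for the algebra.

import numpy as np

x = np.array([2., 2., 2., 6., 6., 6., 8., 8.])
n = x.size
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)

# Entrywise formula: H[i, j] = 1/n + (x_i - xbar)(x_j - xbar)/Sxx
H = 1.0 / n + np.outer(x - xbar, x - xbar) / Sxx

# Hat matrix from the uncentered design (1, x)
X = np.column_stack([np.ones(n), x])
PX = X @ np.linalg.inv(X.T @ X) @ X.T

print("entrywise formula matches X(X'X)^{-1}X'?", np.allclose(H, PX))
print("trace:", np.trace(H))     # equals 2, the number of columns of X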
8. Consider two linear models for the same data:

Model 1: Y = Xβ + ε
Model 2: Y = Wγ + ε.

Here, X is n × p, W is n × q, β is p × 1, and γ is q × 1. Suppose that C(W) ⊂ C(X). For Model 1, let ŶX, êX, and PX denote the vector of (least-squares) fitted values, the vector of (least-squares) residuals, and the perpendicular projection matrix onto C(X). The quantities ŶW, êW, and PW are defined analogously.
(a) Show that W = XC, for some p × q matrix C.
(b) Show that PXPW = PW.
(c) Show that (ŶX − ŶW)′ŶW = 0.
(d) Show that Y′Y = ŶW′ŶW + (ŶX − ŶW)′(ŶX − ŶW) + êX′êX.
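These are proof exercises, but part (d) can be spot-checked numerically. The sketch below reuses the X, W, and Y of Problem 6 (which satisfy C(W) ⊂ C(X)) and confirms that Y′Y equals the three-term sum.

import numpy as np

X = np.array([[1., 1., 0., 0.],
              [1., 1., 0., 0.],
              [1., 0., 1., 0.],
              [1., 0., 0., 1.]])
W = np.array([[1., 1.],
              [1., 1.],
              [1., 0.],
              [1., 0.]])
Y = np.array([1., 0., 1., 2.])

PX = X @ np.linalg.pinv(X)
PW = W @ np.linalg.pinv(W)

YX, YW = PX @ Y, PW @ Y           # fitted values under each model
eX = Y - YX                       # Model 1 residuals

lhs = Y @ Y
rhs = YW @ YW + (YX - YW) @ (YX - YW) + eX @ eX
print("Y'Y =", lhs, "  sum of three pieces =", rhs)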
9. The effectiveness of three skin creams was studied in an experiment on s subjects. On the forearm of each subject, three locations were specified. The three creams were randomly allocated to locations on each subject; that is, each subject received a complete set of three treatments. It is assumed that the three observations on the same individual are correlated and that observations on different subjects are uncorrelated. A statistical model for this experiment is

Yij = µi + εij,

where Yij denotes the ith measurement on subject j and µi denotes the mean response for the ith skin cream. Assume that εij, for i = 1, 2, 3 and j = 1, 2, ..., s, are random variables with E(εij) = 0, var(εij) = σ², and corr(εij, εi′j) = ρ, for i ≠ i′. Note that corr(εij, εij′) = 0 when j ≠ j′ (regardless of i) because εij and εij′ correspond to different subjects.
(a) Assuming that µi is fixed (not random), express this model as Y = Xβ + ε. Define all vectors and matrices. Your design matrix X should be full rank.
(b) Compute cov(Y).
(c) Find β̂, the least-squares estimator of β. Your answer should be a vector (I don’t want to see matrices in your final answer).
(d) Compute cov(β̂).
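Optional sketch for parts (b)–(d). It assumes Y is stacked subject by subject, i.e., Y = (Y11, Y21, Y31, Y12, ..., Y3s)′, so cov(Y) is block diagonal with s identical 3 × 3 compound-symmetric blocks; the values of s, σ², and ρ are arbitrary. Compare the printed cov(β̂) with your part (d) answer.

import numpy as np

s, sigma2, rho = 4, 2.0, 0.3                      # illustrative values
Sigma_subj = sigma2 * ((1 - rho) * np.eye(3) + rho * np.ones((3, 3)))
V = np.kron(np.eye(s), Sigma_subj)                # cov(Y) for subject-by-subject stacking

# Full-rank cell means design: the row for (cream i, subject j) picks out mu_i
X = np.kron(np.ones((s, 1)), np.eye(3))           # 3s x 3

# Ordinary least squares: beta_hat = (X'X)^{-1} X'Y = A Y
A = np.linalg.inv(X.T @ X) @ X.T
print("cov(beta_hat) =\n", A @ V @ A.T)
print("Sigma_subj / s =\n", Sigma_subj / s)       # same matrix: (sigma2/s)[(1-rho)I + rho J]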
10. Let PX denote the perpendicular projection matrix onto C(X).
(a) Give a detailed argument showing that I − PX is the perpendicular projection matrix onto N(X′).
(b) Let

X = [1 1 0
     1 1 0
     1 0 1
     1 0 1
     1 0 1].

Compute PX and I − PX.
(c) Express Y = (1, 2, 3, 4, 5)′ as the sum of two vectors, one in C(X) and one in N(X′).
(d) For the X matrix in part (b), describe in words what C(X) and N(X′) are.
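Optional numeric check for parts (b) and (c): compute PX via a generalized inverse, then split Y = (1, 2, 3, 4, 5)′ into PXY ∈ C(X) and (I − PX)Y ∈ N(X′).

import numpy as np

X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.],
              [1., 0., 1.]])
Y = np.array([1., 2., 3., 4., 5.])

PX = X @ np.linalg.pinv(X)            # ppm onto C(X): averages within the two groups
print("PX =\n", np.round(PX, 4))
print("I - PX =\n", np.round(np.eye(5) - PX, 4))
print("piece in C(X):  ", PX @ Y)     # (1.5, 1.5, 4, 4, 4)
print("piece in N(X'): ", Y - PX @ Y)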
11. Consider the general linear model Y = Xβ + ε, where E(ε) = 0. Let M = X(X′X)⁻X′ denote the perpendicular projection matrix onto C(X) and denote by ê the vector of residuals obtained from the least squares fit. Prove that β̂ is a least squares estimate of β if and only if ê ⊥ C(X).
12. Consider the linear regression model Y = Xβ + ε, where X is n × p with p = k + 1, and k is the number of independent variables in the model (the model also includes an intercept term). Assume that E(ε) = 0 and cov(ε) = σ²I. Let M denote the perpendicular projection matrix onto C(X), let J denote an n × n matrix of ones, and let β̂ denote the (unique) OLS estimator.
(a) Show that Y′(M − n⁻¹J)Y = β̂′X′Y − nȲ², where Ȳ is the sample mean of Y1, Y2, ..., Yn.
(b) Prove that r(M − n⁻¹J) = k.
(c) Let ê denote the vector of residuals from the OLS fit. Find E(ê) and cov(ê). Do the least squares residuals have constant variance?
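Optional numerical spot check of (a) and (b), with a small made-up regression dataset (n = 6, k = 2):

import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # intercept plus k regressors
Y = rng.normal(size=n)

M = X @ np.linalg.inv(X.T @ X) @ X.T
J = np.ones((n, n))
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

lhs = Y @ (M - J / n) @ Y
rhs = beta_hat @ X.T @ Y - n * Y.mean() ** 2
print("Y'(M - J/n)Y:", lhs, "  beta'X'Y - n*Ybar^2:", rhs)
print("rank(M - J/n):", np.linalg.matrix_rank(M - J / n), " (should equal k =", k, ")")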
TERMINOLOGY: The Gram-Schmidt procedure is a method for orthonormalizing a set of basis vectors. Let V be a vector space with basis {u1, u2, ..., ur}. For s = 1, 2, ..., r, define inductively

v1 = u1/||u1||
ws = us − Σ_{i=1}^{s−1} (us′vi)vi
vs = ws/||ws||.

Then {v1, v2, ..., vr} is an orthonormal basis for V, where vs ∈ span{u1, u2, ..., us}.
13. Consider the vector space V = R³ and the vectors u1 = (1, 1, 1)′, u2 = (0, 1, 1)′, and u3 = (0, 0, 1)′. Show that {u1, u2, u3} is a basis for V. Then use the Gram-Schmidt procedure to orthonormalize the basis.
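Optional sketch: a minimal implementation of the Gram-Schmidt recursion from the terminology box, applied to u1, u2, u3 from this problem (the problem of course wants the hand computation).

import numpy as np

def gram_schmidt(U):
    """Columns of U form a basis; return an orthonormal basis as columns."""
    V = []
    for s in range(U.shape[1]):
        w = U[:, s].astype(float)
        for v in V:                          # w_s = u_s - sum (u_s' v_i) v_i
            w = w - (U[:, s] @ v) * v
        V.append(w / np.linalg.norm(w))      # v_s = w_s / ||w_s||
    return np.column_stack(V)

U = np.array([[1., 0., 0.],
              [1., 1., 0.],
              [1., 1., 1.]])                 # columns are u1, u2, u3
V = gram_schmidt(U)
print("orthonormal basis (columns):\n", np.round(V, 4))
print("V'V = I?", np.allclose(V.T @ V, np.eye(3)))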
14. Let o1, o2, ..., or be an orthonormal basis for C(X) and O = (o1 o2 · · · or). Prove that OO′ = Σ_{i=1}^{r} oi oi′ is the perpendicular projection matrix onto C(X).
15. Consider the one-way ANOVA model Yij = µ + αi + εij, for i = 1, 2, ..., a and j = 1, 2, ..., ni, so that the design matrix is

X_{n×p} = [1_{n1}  1_{n1}  0_{n1}  · · ·  0_{n1}
           1_{n2}  0_{n2}  1_{n2}  · · ·  0_{n2}
             ⋮       ⋮       ⋮      ⋱      ⋮
           1_{na}  0_{na}  0_{na}  · · ·  1_{na}],

where p = a + 1 and n = Σ_i ni.
(a) Show that the perpendicular projection matrix onto C(X) is given by the n × n matrix

PX = Blk Diag(n_i⁻¹ J_{ni×ni}),
where J_{ni×ni} is the ni × ni matrix of ones and “Blk Diag” stands for “block diagonal.” For example, if a = 3, n1 = n2 = 2, and n3 = 3, then n = 7 and

PX = [1/2  1/2   0    0    0    0    0
      1/2  1/2   0    0    0    0    0
       0    0   1/2  1/2   0    0    0
       0    0   1/2  1/2   0    0    0
       0    0    0    0   1/3  1/3  1/3
       0    0    0    0   1/3  1/3  1/3
       0    0    0    0   1/3  1/3  1/3].
(b) We have seen that P1 = n⁻¹J_{n×n} is the perpendicular projection matrix responsible for removing the effects of the intercept term; in this model, the intercept term is µ. We have also seen that PX − P1 is the perpendicular projection matrix which projects Y onto the orthogonal complement of C(1) with respect to C(X), a subspace of dimension r(PX − P1) = a − 1. Show that

Y′(PX − P1)Y = Σ_{i=1}^{a} ni(Ȳ_{i+} − Ȳ_{++})².

This is called the corrected treatment (model) sum of squares.
(c) The quantity Y′(PX − P1)Y is useful. An uninteresting use of this quantity involves testing the hypothesis that all the αi’s are equal; i.e., testing H0 : α1 = α2 = · · · = αa. Informally, this is done by comparing the size of Y′(PX − P1)Y to the size of Y′(I − PX)Y, the residual sum of squares, while adjusting for the ranks of PX − P1 and I − PX; i.e., a − 1 and n − a. It is more useful (and more interesting) to break this quantity up into smaller pieces and test more refined hypotheses that correspond to the pieces. One way to do this is to break up Y′(PX − P1)Y into a − 1 components

Y′(PX − P1)Y = Y′M1Y + Y′M2Y + · · · + Y′M_{a−1}Y,

where MiMj = 0, for all i ≠ j, and M1, M2, ..., M_{a−1} are perpendicular projection matrices onto a − 1 orthogonal subspaces of C(PX − P1). The sums of squares Y′MiY, i = 1, 2, ..., a − 1, each have 1 degree of freedom and can be used to test orthogonal contrasts. Breaking up C(PX − P1) in this fashion can be done using the Gram-Schmidt orthonormalization procedure. Set M∗ = PX − P1. We now break up C(M∗) into a − 1 orthogonal subspaces. Let o1, o2, ..., o_{a−1} be an orthonormal basis for C(M∗). Note that, using Gram-Schmidt, o1 can be any normalized vector in C(M∗), o2 can be any normalized vector in C(M∗) orthogonal to o1, and so on. Set O = (o1 o2 · · · o_{a−1}). From Problem 14, we have

M∗ = OO′ = Σ_{i=1}^{a−1} oi oi′.

Take Mi = oi oi′. Then Mi is a perpendicular projection matrix in its own right and MiMj = 0, for i ≠ j, because of orthogonality. Finally, note that

Y′M∗Y = Y′(M1 + M2 + · · · + M_{a−1})Y = Y′M1Y + Y′M2Y + · · · + Y′M_{a−1}Y.
This demonstrates that the corrected model sum of squares Y′M∗Y can be written as the sum of a − 1 pieces, as claimed. Now, for specificity, take a = 3 and n1 = n2 = n3 = 2. Break up C(PX − P1) into a − 1 = 2 orthogonal subspaces. With your orthogonal subspaces (and their associated ppms, say, M1 and M2), verify that

Y′(PX − P1)Y = Y′M1Y + Y′M2Y,

using the observed data Y = (1, 0, 2, 1, 3, 4)′.
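Optional sketch of the requested verification with a = 3 and n1 = n2 = n3 = 2: form PX and P1, take an orthonormal basis of C(PX − P1) (here from the eigenvalue-one eigenvectors of PX − P1, although any orthonormal basis of that space, e.g. from Gram-Schmidt applied to contrast vectors, would do), and check that the two one-degree-of-freedom sums of squares add up.

import numpy as np

Y = np.array([1., 0., 2., 1., 3., 4.])
n, a = 6, 3

# One-way ANOVA design (overparameterized: intercept plus three group indicators)
X = np.column_stack([np.ones(n),
                     np.repeat(np.eye(a), 2, axis=0)])
PX = X @ np.linalg.pinv(X)
P1 = np.ones((n, n)) / n
Mstar = PX - P1                       # ppm onto the complement of C(1) within C(X), rank 2

# Any orthonormal basis o1, o2 of C(Mstar) works; take eigenvectors with eigenvalue 1
vals, vecs = np.linalg.eigh(Mstar)
O = vecs[:, vals > 0.5]               # the two eigenvectors spanning C(Mstar)
M1 = np.outer(O[:, 0], O[:, 0])
M2 = np.outer(O[:, 1], O[:, 1])

print("Y'(PX - P1)Y  =", Y @ Mstar @ Y)
print("Y'M1Y + Y'M2Y =", Y @ M1 @ Y + Y @ M2 @ Y)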