# BTRY 4030 – Fall 2018 – Homework 5 Q

### BTRY 4030 – Homework 5 Q

#### Due Tuesday, December 4, 2018

You may respond to the questions below either by editing hw5_2018_q1.Rmd to include your answers
and compiling it into a PDF document, or by handwriting your answers and scanning them in.

You may discuss the homework problems and computing issues with other students in the class. However, you
must write up your homework solution on your own. In particular, do not share your homework RMarkdown
file with other students.

Here we will add one more deletion diagnostic to our arsenal. When comparing two possible models, we often
want to ask "Does one predict future data better than the other?" One way to do this is to divide your data
into two collections of observations $(X_1, y_1)$ and $(X_2, y_2)$, say. We use $(X_1, y_1)$ to obtain a linear regression
model, with parameters $\hat{\beta}$, and look at the prediction error $(y_2 - X_2 \hat{\beta})^T (y_2 - X_2 \hat{\beta})$.

This is a bit wasteful: you could use $(X_2, y_2)$ to improve your estimate of $\hat{\beta}$. However, we can assess how
well this type of model does (for these data) as follows:

For each observation $i$:

i. Remove $(x_i, y_i)$ from the data and obtain $\hat{\beta}_{(i)}$ from the remaining $n - 1$ data points.

ii. Use this to make a prediction $\hat{y}_{(i)i} = x_i^T \hat{\beta}_{(i)}$.

Return the cross validation error $CV = \sum_{i=1}^{n} (y_i - \hat{y}_{(i)i})^2$.

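
The procedure above can be sketched directly in R. This is a minimal illustration rather than the course's reference code: the function name `loo_cv` and the simulated data are assumptions made for the example, and the design matrix `X` is assumed to already contain a column of ones if an intercept is wanted.

```r
# Brute-force leave-one-out cross validation for a linear model.
# X: n x p design matrix (hypothetical example input); y: response vector.
loo_cv <- function(X, y) {
  n <- nrow(X)
  pred <- numeric(n)
  for (i in seq_len(n)) {
    X_minus <- X[-i, , drop = FALSE]
    # Step i: least squares fit without observation i
    beta_i <- solve(crossprod(X_minus), crossprod(X_minus, y[-i]))
    # Step ii: predict the held-out response from x_i
    pred[i] <- sum(X[i, ] * beta_i)
  }
  # Sum of squared out-of-sample prediction errors
  sum((y - pred)^2)
}

# Simulated data, purely for illustration
set.seed(1)
n <- 30
X <- cbind(1, rnorm(n), rnorm(n))
y <- drop(X %*% c(1, 2, -1) + rnorm(n))
cv <- loo_cv(X, y)
cv
```

Each pass through the loop refits the model, so this costs $n$ separate regressions.
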
This can be used to compare models that use different covariates, for example; particularly when the models
are not nested. We will see an example of this in Question 2.
Here, we will find a way to calculate CV without having to manually go through removing observations one
by one.
a. We will start by considering a separate test-set. As in the midterm, imagine that we have $X_2 = X_1$, but
that the errors that produce $y_2$ are independent of those that produce $y_1$. We estimate
$\hat{\beta}$ using $(X_1, y_1)$: $\hat{\beta} = (X_1^T X_1)^{-1} X_1^T y_1$. Show that the in-sample average squared error, $(y_1 - X_1 \hat{\beta})^T (y_1 - X_1 \hat{\beta})/n$, is
biased downwards as an estimate of $\sigma^2$, but the test-set average squared error, $(y_2 - X_2 \hat{\beta})^T (y_2 - X_2 \hat{\beta})/n$,
is biased upwards. (You may find the midterm solutions helpful.)

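
A small simulation can make the direction of the two biases visible before attempting the proof. The sketch below is only an illustration under assumed settings (the values of `n`, `p`, `beta`, `sigma` and the number of replications are arbitrary choices, not part of the assignment); it holds the design fixed with $X_2 = X_1$ and draws independent errors for $y_1$ and $y_2$.

```r
set.seed(4030)
n <- 20; p <- 4; sigma <- 1                           # illustrative settings
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))   # fixed design, X2 = X1
beta <- c(1, 0.5, -0.5, 2)                            # hypothetical true coefficients
H <- X %*% solve(crossprod(X), t(X))                  # hat matrix X (X'X)^{-1} X'
mu <- drop(X %*% beta)

one_rep <- function() {
  y1 <- mu + rnorm(n, sd = sigma)    # training responses
  y2 <- mu + rnorm(n, sd = sigma)    # test responses with independent errors
  fit <- drop(H %*% y1)              # fitted values X beta_hat from (X1, y1)
  c(in_sample = mean((y1 - fit)^2),
    test_set  = mean((y2 - fit)^2))
}

# Average each error measure over many simulated data sets
errs <- rowMeans(replicate(2000, one_rep()))
errs
```

Comparing the two averages with `sigma^2` shows on which side of the true variance each one falls; it does not replace the algebraic argument the question asks for.
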
b. Suppose that $\beta_p = 0$, that is, the final column of $X_1$ has no impact on prediction. Show that the test set
error is smaller if we remove the final column from each of $X_1$ and $X_2$ than if we don't. (This makes
using a test set a reasonable means of choosing what covariates to include.)

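
The same kind of simulation can be used to sanity-check this claim. The sketch below assumes a small made-up design with the last coefficient set to zero and compares the test-set error of the full model with that of the model that omits the final column, averaged over many replications. All settings here are illustrative assumptions.

```r
set.seed(2018)
n <- 10; p <- 3                                       # illustrative settings
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))   # fixed design, X2 = X1
beta <- c(1, 2, 0)                                    # beta_p = 0: last column is irrelevant
mu <- drop(X %*% beta)

hat_matrix <- function(Z) Z %*% solve(crossprod(Z), t(Z))
H_full <- hat_matrix(X)                               # uses all p columns
H_red  <- hat_matrix(X[, -p, drop = FALSE])           # final column removed

diff_one <- function() {
  y1 <- mu + rnorm(n)                                 # training responses
  y2 <- mu + rnorm(n)                                 # independent test responses
  # test-set error of the full model minus that of the reduced model
  mean((y2 - H_full %*% y1)^2) - mean((y2 - H_red %*% y1)^2)
}

mean_diff <- mean(replicate(5000, diff_one()))
mean_diff
```

A positive average difference is what the claim predicts; individual replications can go either way.
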
c. Now we will turn to cross validation. Use the identity
$$\hat{\beta}_{(i)} = \hat{\beta} - \frac{1}{1 - h_{ii}} (X^T X)^{-1} x_i e_i$$
from class to obtain an expression for the out of sample prediction $x_i^T \hat{\beta}_{(i)}$ in terms of $x_i$, $y_i$, $\hat{\beta}$
and $h_{ii}$ only.

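
Before using the identity, it can be reassuring to confirm it numerically. The sketch below assumes that $e_i$ denotes the ordinary residual $y_i - x_i^T \hat{\beta}$ and that $h_{ii}$ is the $i$th diagonal entry of the hat matrix; the simulated data and the choice `i <- 5` are arbitrary.

```r
set.seed(3)
n <- 15
X <- cbind(1, rnorm(n), rnorm(n))             # illustrative design with intercept
y <- drop(X %*% c(0, 1, 1) + rnorm(n))

XtX_inv  <- solve(crossprod(X))
beta_hat <- drop(XtX_inv %*% crossprod(X, y))
e <- y - drop(X %*% beta_hat)                 # residuals (assumed meaning of e_i)
h <- rowSums((X %*% XtX_inv) * X)             # leverages h_ii = x_i' (X'X)^{-1} x_i

i <- 5                                        # observation to delete (arbitrary)
X_minus <- X[-i, , drop = FALSE]
# Left-hand side: refit without observation i
beta_del <- drop(solve(crossprod(X_minus), crossprod(X_minus, y[-i])))
# Right-hand side: the deletion identity, no refitting
beta_id <- beta_hat - drop(XtX_inv %*% X[i, ]) * e[i] / (1 - h[i])

max(abs(beta_del - beta_id))                  # should be numerically negligible
```
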
d. Hence obtain an expression for the prediction error $y_i - x_i^T \hat{\beta}_{(i)}$ using only $y_i$, $\hat{y}_i$ and $h_{ii}$. You may want
to check this empirically using the first few entries of the data used in Question 2.

e. Show that the overall CV score can be calculated from
$$\sum_{i=1}^{n} \frac{e_i^2}{(1 - h_{ii})^2},$$
that is, without deleting observations, and only requiring the leverages $h_{ii}$.
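
The formula in part e can be compared against the brute-force calculation in R. The sketch below uses simulated data (the Question 2 data are not reproduced here, so the data frame `d` and its column names are assumptions) together with the base R functions `lm`, `residuals`, `hatvalues` and `predict`.

```r
set.seed(5)
n <- 25
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))   # hypothetical covariates
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(n)

# Shortcut: a single fit, using residuals e_i and leverages h_ii
fit <- lm(y ~ x1 + x2, data = d)
cv_shortcut <- sum((residuals(fit) / (1 - hatvalues(fit)))^2)

# Brute force: refit n times, each time predicting the deleted observation
cv_brute <- sum(sapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ x1 + x2, data = d[-i, ])
  (d$y[i] - predict(fit_i, newdata = d[i, ]))^2
}))

all.equal(cv_shortcut, cv_brute)
```

Agreement up to floating point tolerance is consistent with the result to be shown, though a numerical check on one data set is not a proof.
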