
### Computational Statistics: Project 1

(Due: Friday, 09/28/18)
1. (WaRming up) Write R functions that return:

   (a) The inverse, or the transpose of the inverse, of an upper triangular matrix. Call this function `inv.upper.tri` and provide a `transpose` argument to specify if the transpose is requested. Hint: use `backsolve`.

   (b) The $L_2$ norm of a vector $v$ with $n$ entries, defined as

      $$\mathrm{norm2}(v) = \|v\| = \sqrt{\sum_{i=1}^n v_i^2} = \sqrt{v^\top v}.$$

      Quick check: if `u <- 1e200 * rep(1, 100)`, what is `norm2(u)`?^1

   (c) The column normalization $U$ of matrix $A$, $U_{ij} = A_{ij}/\|A_{\cdot,j}\|$ (call this function `normalize.cols`, and feel free to use `norm2` above).

   (d) The projection of vector $a$ onto $u$ (called as `proj(a, u)`),

      $$\mathrm{proj}_u(a) = \frac{u^\top a}{\|u\|^2}\, u.$$

      Quick check: what is `proj(1:100, u)`, with `u` as in (b) above?

   (e) The Vandermonde matrix of vector $a = [a_i]_{i=1,\dots,n}$ and degree $d$,

      $$V(a, d) = \begin{bmatrix}
      1 & a_1 & a_1^2 & \cdots & a_1^d \\
      1 & a_2 & a_2^2 & \cdots & a_2^d \\
      \vdots & \vdots & \vdots & & \vdots \\
      1 & a_n & a_n^2 & \cdots & a_n^d
      \end{bmatrix} = [a_i^{j-1}]_{i=1,\dots,n,\ j=1,\dots,d+1}$$

      (called as `vandermonde(a, d)`).
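A minimal sketch of how these helpers might look (function names follow the statement; the scaling trick in `norm2` is one possible way to survive the quick check in (b), where a naive `sqrt(sum(u^2))` overflows to `Inf`):

```r
# (a) Inverse (or transposed inverse) of an upper triangular matrix.
inv.upper.tri <- function(R, transpose = FALSE) {
  Rinv <- backsolve(R, diag(nrow(R)))  # solves R %*% Rinv = I
  if (transpose) t(Rinv) else Rinv
}

# (b) L2 norm; factoring out the largest entry avoids overflow in v^2.
norm2 <- function(v) {
  m <- max(abs(v))
  if (m == 0) return(0)
  m * sqrt(sum((v / m)^2))
}

# (c) Divide each column of A by its L2 norm.
normalize.cols <- function(A) sweep(A, 2, apply(A, 2, norm2), "/")

# (d) Projection of a onto u: (u'a / ||u||^2) u.
proj <- function(a, u) drop(crossprod(u, a)) / norm2(u)^2 * u

# (e) Vandermonde matrix: entry (i, j) is a_i^(j-1).
vandermonde <- function(a, d) outer(a, 0:d, `^`)
```

Note that even with a safe `norm2`, the `norm2(u)^2` in `proj` still overflows to `Inf` for the `u` in (b), which is what the quick check in (d) is probing.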
``````
2. The *machine epsilon*, $\epsilon$, can be defined as the smallest floating point number (with base 2) such that $1 + \epsilon > 1$, that is, $1 + \epsilon/2 == 1$ in machine precision.

   (a) Write a function that returns this value by starting at `eps = 1` and iteratively dividing by 2 until the definition is satisfied.

   (b) Write a function that computes $f(x) = \log(1 + \exp(x))$ and then evaluate: $f(0)$, $f(-80)$, $f(80)$, and $f(800)$.

   (c) How would you specify your function to avoid computations if $x \ll 0$ ($x < 0$ and $|x|$ large)? (Hint: $\epsilon$.)

   (d) How would you implement your function to not overflow if $x \gg 0$?
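A sketch for (a), plus one stable way to implement (b)-(d); the split at `x > 0` is a common trick, not necessarily the intended solution:

```r
# (a) Machine epsilon by repeated halving: stop once 1 + eps/2 == 1.
machine.eps <- function() {
  eps <- 1
  while (1 + eps / 2 > 1) eps <- eps / 2
  eps
}

# (b)-(d) f(x) = log(1 + exp(x)), rewritten for x > 0 as
# x + log(1 + exp(-x)) so exp() never overflows; log1p keeps
# accuracy when its argument is tiny.
f <- function(x) ifelse(x > 0, x + log1p(exp(-x)), log1p(exp(x)))
```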

(^1) 1e201, of course.

3. (QR via Gram-Schmidt) If $A = [a_1\ a_2\ \cdots\ a_p]$ is an $n$-by-$p$ matrix with $n > p$, then we can obtain a thin QR decomposition of $A$ using Gram-Schmidt orthogonalization^2. To get $Q$, we start with $u_1 = a_1$, then compute iteratively, for $i = 2, \dots, p$,

   $$u_i = a_i - \sum_{j=1}^{i-1} \mathrm{proj}_{u_j}(a_i),$$

   and finally, with $q_i = u_i/\|u_i\|$, set $Q = [q_1\ q_2\ \cdots\ q_p]$ as a column-normalized version of $U = [u_i]_{i=1,\dots,p}$.

   (a) Show that $C = Q^\top A$ is upper triangular and that $C$ is the Cholesky factor of $A^\top A$.

   (b) Write an R function that computes the $Q$ orthogonal factor of a Vandermonde matrix with base vector $x$ and degree $d$ without computing the Vandermonde matrix explicitly; that is, as your function iterates to compute $u_i$, compute and use the columns of the Vandermonde matrix on the fly.

   (c) It can be shown that, with $\gamma_1 = 1$,

      $$\gamma_{i+1} = u_i^\top u_i \quad\text{and}\quad \delta_i = \frac{u_i^\top \mathrm{Diag}(x)\, u_i}{u_i^\top u_i}, \quad \text{for } i = 1, \dots, d+1,$$

      where $\mathrm{Diag}(x)$ is a diagonal matrix with diagonal entries $x$, the $U$ matrix can be computed using the recurrence relation: $u_1 = 1_n$, $u_2 = x - \delta_1 1_n$, and, for $i = 2, \dots, d$,

      $$(u_{i+1})_j = (x_j - \delta_i)(u_i)_j - \frac{\gamma_{i+1}}{\gamma_i}(u_{i-1})_j, \quad \text{for } j = 1, \dots, n.$$

      Write an R function that, given $\gamma$ and $\delta$, computes $Q$.

   (d) Now modify your function in (b) to also compute $\gamma$ and $\delta$. Quick check for the compact representation of $Q$: show that $\delta_1 = \bar{x}$, $\gamma_2 = n$, and $\gamma_3 = (n-1)s_x^2$, where $\bar{x}$ and $s_x^2$ are the sample mean and variance of $x$, respectively.
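Sketches for (b) and (c). For (b), classical Gram-Schmidt is inlined and each Vandermonde column $x^{i-1}$ is generated only when needed. For (c), one indexing convention is assumed (`gamma[i]` holds $\gamma_i$, so `gamma[1] = 1`, and `delta[i]` holds $\delta_i$); the statement does not fix this convention.

```r
# (b) Q factor of V(x, d) with columns generated on the fly.
qr.vandermonde <- function(x, d) {
  n <- length(x)
  U <- matrix(0, n, d + 1)
  U[, 1] <- 1                            # first column of V: x^0
  for (i in seq_len(d)) {
    v <- x^i                             # next Vandermonde column, on the fly
    for (j in seq_len(i)) {              # subtract projections onto earlier u_j
      u <- U[, j]
      v <- v - sum(u * v) / sum(u * u) * u
    }
    U[, i + 1] <- v
  }
  sweep(U, 2, sqrt(colSums(U^2)), "/")   # column-normalize U to get Q
}

# (c) Q from the three-term recurrence, given gamma and delta.
q.from.recurrence <- function(x, gamma, delta) {
  n <- length(x)
  d <- length(delta) - 1
  U <- matrix(0, n, d + 1)
  U[, 1] <- 1                            # u_1 = 1_n
  if (d > 0) U[, 2] <- x - delta[1]      # u_2 = x - delta_1 1_n
  if (d > 1) for (i in 2:d)
    U[, i + 1] <- (x - delta[i]) * U[, i] -
      gamma[i + 1] / gamma[i] * U[, i - 1]
  sweep(U, 2, sqrt(colSums(U^2)), "/")
}
```

For example, with `x = 1:5` and `d = 2`, working out the definitions gives $\delta = (3, 3, 3)$ and $\gamma = (1, 5, 10, 14)$, consistent with the quick check in (d) ($\delta_1 = \bar{x} = 3$, $\gamma_2 = n = 5$, $\gamma_3 = (n-1)s_x^2 = 10$); `q.from.recurrence(1:5, c(1, 5, 10, 14), c(3, 3, 3))` then reproduces `qr.vandermonde(1:5, 2)`.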

4. (Orthogonal staged regression) Suppose we observe $y \sim N(X\beta, \sigma^2 I_n)$, with $X$ an $n$-by-$p$ full rank matrix, and wish to test

   $$H_0: \beta_j = \beta_{j+1} = \cdots = \beta_p = 0$$

   for some $j \geq 1$. It can be shown that the ML estimator $\hat{\beta} = (X^\top X)^{-1} X^\top y$ has distribution $N(\beta, \sigma^2 (X^\top X)^{-1})$.

   (a) If $X$ has thin QR decomposition $X = QR$, show that $H_0$ is equivalent to testing $\gamma_j = \cdots = \gamma_p = 0$, where $\gamma = R\beta$, and so we can regress $y$ on $Q$ instead of $X$; that is, estimate $\gamma$ from $y \sim N(Q\gamma, \sigma^2 I_n)$.

(^2) This is not recommended in practice because it is numerically unstable. Orthogonalizations are usually performed using Givens rotations or Householder reflections.

   (b) Show that the ML estimator for $\gamma$ is $\hat{\gamma} = Q^\top y$ and that the components of $\hat{\gamma}$ are independent.

   (c) Using R, explain how you compute: (i) the ML estimate $\hat{\beta}$ as a function of $\hat{\gamma}$, and (ii) the correlation matrix of $\hat{\beta}$, using only `crossprod`, `normalize.cols`, and `inv.upper.tri`.

   (d) As a concrete example, let us use the `cars` dataset^3 to fit a polynomial regression of degree $d = 3$. Take `dist` to be the response $y$ and `speed` to be the base vector $x$ that defines $X = V(x, d)$, a Vandermonde design.

      i. Compute $Q$ using the routine from 3(b), obtain $\hat{\gamma} = Q^\top y$, and compare it to the estimate from `coef(lm(dist ~ Q - 1))`.

      ii. Compute $\hat{\beta}$ according to (c) and compare it to the estimate from `coef(lm(dist ~ vandermonde(speed, 3) - 1))`.

      iii. (*) Compare the $p$-values when testing $\gamma_j = 0$ and $\beta_j = 0$ for $j = 1, \dots, 4$ from the above regression fits. How would you explain the discrepancies based on the results from (b) and (c)? What degree would you recommend for a polynomial regression of `dist` on `speed`?

   (e) (*) Suppose we use another routine to obtain a QR decomposition of $X$. Under what conditions is $R$ the Cholesky factor of $X^\top X$? Write a function that returns a QR decomposition of $X$ satisfying these conditions.
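A sketch tying (c) and (d) together. It assumes the helper names from problem 1 (redefined here so the snippet is self-contained) and uses base R's `qr()` in place of the hand-rolled routine from problem 3. The identity behind `beta.corr` is that $\mathrm{Cov}(\hat\beta) \propto (X^\top X)^{-1} = R^{-1}R^{-\top} = S^\top S$ with $S = R^{-\top}$, so column-normalizing $S$ before the cross-product yields the correlation matrix:

```r
inv.upper.tri <- function(R, transpose = FALSE) {       # problem 1(a)
  Rinv <- backsolve(R, diag(nrow(R)))
  if (transpose) t(Rinv) else Rinv
}
normalize.cols <- function(A) sweep(A, 2, sqrt(colSums(A^2)), "/")  # 1(c)

# (c.i) beta-hat = R^{-1} gamma-hat.
beta.from.gamma <- function(R, gamma) drop(inv.upper.tri(R) %*% gamma)
# (c.ii) Correlation of beta-hat: crossprod of column-normalized R^{-T}.
beta.corr <- function(R)
  crossprod(normalize.cols(inv.upper.tri(R, transpose = TRUE)))

# (d) Polynomial regression of dist on speed, degree 3.
data(cars)
y <- cars$dist
X <- outer(cars$speed, 0:3, `^`)      # Vandermonde design V(speed, 3)
qrX <- qr(X)
Q <- qr.Q(qrX)
gamma.hat <- drop(crossprod(Q, y))    # i.  compare to coef(lm(y ~ Q - 1))
beta.hat <- beta.from.gamma(qr.R(qrX), gamma.hat)  # ii. vs coef(lm(y ~ X - 1))
```

The $p$-value comparison in iii. then comes from `summary(lm(y ~ Q - 1))` versus `summary(lm(y ~ X - 1))`.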

(^3) Load it in R with `data(cars)`.