
STAT 132 (Frequentist and Bayesian Inference)


Q

(a)

True. The conditional joint distribution of $Y_1, \dots, Y_n$ given that $s_n = s$ is

$$\Pr(y_1, \dots, y_n \mid s_n = s, \theta) = \frac{\Pr(y_1, \dots, y_n \mid \theta)}{\Pr(s_n = s \mid \theta)} = \frac{\prod_{i=1}^{n} \Pr(y_i \mid \theta)}{\binom{n}{s}\,\theta^s (1-\theta)^{n-s}} = \frac{\theta^s (1-\theta)^{n-s}}{\binom{n}{s}\,\theta^s (1-\theta)^{n-s}} = \frac{1}{\binom{n}{s}},$$

which does not depend on $\theta$, so $s_n$ is a sufficient statistic.
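As a quick numerical sanity check (not part of the original solution), the cancellation above can be verified in R: the conditional probability of a particular Bernoulli sequence given $s_n$ comes out the same for any value of $\theta$. The sequence used below is an arbitrary illustrative choice.

```r
# Sketch: verify that Pr(y_1,...,y_n | s_n = s, theta) = 1 / choose(n, s)
# for any theta, using a small illustrative Bernoulli sequence.
cond.prob <- function(y, theta) {
  n <- length(y)
  s <- sum(y)
  joint <- prod(theta^y * (1 - theta)^(1 - y))           # Pr(y_1,...,y_n | theta)
  marg  <- choose(n, s) * theta^s * (1 - theta)^(n - s)  # Pr(s_n = s | theta)
  joint / marg
}

y <- c(1, 0, 1, 1, 0)   # s = 3 successes out of n = 5
cond.prob(y, 0.2)       # 1 / choose(5, 3) = 0.1
cond.prob(y, 0.9)       # same value: theta cancels
```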

(b)

True. The beta distribution is a family of continuous probability distributions defined on the interval $[0, 1]$ and parameterized by two positive shape parameters, denoted alpha ($\alpha$) and beta ($\beta$).

(c)

True. Frequentist methods are well suited to calibration and model-evaluation problems, while Bayesian methods are well suited to inference, prediction, and decision-making.

(d)

True.

(e)

True. The conditional probability $\Pr(A \mid B)$ is not defined if $\Pr(B) = 0$.

(f)

True. Uncertainty is difficult to predict and to analyze quantitatively using existing theory or experience. It includes not only objective uncertainty but also subjective uncertainty: the uncertainty that arises because people cannot fully recognize and accurately describe things, owing to subjective factors and limits on cognitive levels and abilities.

(g)

True. Because of the complexity of the integrals involved, Bayesian statistics did not come into widespread use until the late 20th century (the late 1980s), when the development of simulation methods overcame the difficulty of evaluating high-dimensional integrals (Fienberg, 2006).

Reference: Fienberg, Stephen E. "When Did Bayesian Inference Become 'Bayesian'?" Bayesian Analysis, vol. 1, no. 1, 2006, https://doi.org/10.1214/06-ba101.

Q

(a)

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
stem(y, scale = 2)

  The decimal point is at the |

  0 | 0
  1 | 00000
  2 | 0000
  3 | 00
  4 | 0
  5 |
  6 | 0

The shape of the distribution is unimodal and positively skewed.

(b)

$$E(Y_i \mid \theta, \mathcal{B}) = \sum_{y=0}^{\infty} y \, P(Y_i = y \mid \theta, \mathcal{B}) = \sum_{y=0}^{\infty} y \, \frac{\theta^y e^{-\theta}}{y!} = e^{-\theta} \sum_{y=1}^{\infty} \frac{\theta^y}{(y-1)!} = \theta e^{-\theta} \sum_{y=0}^{\infty} \frac{\theta^y}{y!} = \theta e^{-\theta} e^{\theta} = \theta$$
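The identity $E(Y_i \mid \theta, \mathcal{B}) = \theta$ can also be checked numerically (an illustrative sketch, not part of the original solution; the sum is truncated at 100 and $\theta = 2.5$ is an arbitrary value):

```r
# Truncated-sum check that the Poisson mean equals theta.
theta <- 2.5                  # arbitrary illustrative value
y <- 0:100                    # truncation point; the tail mass is negligible
mean.check <- sum(y * dpois(y, theta))
mean.check                    # approximately 2.5
```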

(c)

(i) The likelihood is the joint p.f. of the observations, viewed as a function of $\theta$:

$$L(\theta) = f_n(y \mid \theta) = \Pr(y_1, \dots, y_n \mid \theta) = \prod_{i=1}^{n} \Pr(Y_i = y_i \mid \theta) = \prod_{i=1}^{n} \frac{\theta^{y_i} e^{-\theta}}{y_i!}$$

and the log-likelihood function is

$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \left[ y_i \log\theta - \theta - \sum_{k=1}^{y_i} \log k \right] = \log\theta \sum_{i=1}^{n} y_i - n\theta - \sum_{i=1}^{n} \sum_{k=1}^{y_i} \log k$$

Using this data set we can plot the likelihood and log-likelihood functions for $\theta$ from 1 to 4 as follows:

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
n <- length(y)
theta <- seq(from = 1, to = 4, length.out = 100)
log.likelihood <- log(theta) * sum(y) - n * theta - sum(log(factorial(y)))
plot(theta, log.likelihood, ylab = "log-likelihood", type = "l")


plot(theta, exp(log.likelihood), ylab = "likelihood", type = "l")


(ii)

(iii) The joint p.f. $f_n(y \mid \theta)$ is

$$\Pr(y_1, \dots, y_n \mid \theta) = \prod_{i=1}^{n} \Pr(y_i \mid \theta) = \prod_{i=1}^{n} \frac{\theta^{y_i} e^{-\theta}}{y_i!} = \left( \frac{1}{\prod_{i=1}^{n} y_i!} \right) \theta^{s_n} e^{-n\theta}.$$

By the Factorization Criterion, since $f_n(y \mid \theta) = u(y)\, v(s_n, \theta)$, $s_n$ is a sufficient statistic for $\theta$.

Taking the derivative of the log-likelihood $\ell(\theta)$ with respect to $\theta$ gives

$$\frac{d\ell}{d\theta}(\hat{\theta}_{MLE}) = \frac{\sum_{i=1}^{n} y_i}{\hat{\theta}_{MLE}} - n = 0 \;\Longrightarrow\; \hat{\theta}_{MLE} = \frac{\sum_{i=1}^{n} y_i}{n} = \bar{y}$$
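The closed-form answer $\hat{\theta}_{MLE} = \bar{y}$ can be confirmed numerically with `optimize()` (a sketch, not part of the original solution; the search interval (0.01, 10) is an arbitrary choice that contains the maximum):

```r
# Numerically maximize the log-likelihood and compare with the sample mean.
y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
loglik <- function(theta) {
  log(theta) * sum(y) - length(y) * theta - sum(log(factorial(y)))
}
opt <- optimize(loglik, interval = c(0.01, 10), maximum = TRUE)
opt$maximum                   # close to mean(y) = 2.071429
```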

(d)

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
n <- length(y)
theta <- mean(y)
emp <- sapply(0:6, function(x) sum(y == x) / n)
fit <- sapply(0:6, function(x) ppois(x + 0.5, theta))
fit <- diff(c(0, fit))
df <- data.frame(empirical = emp, fitting = fit)
df

$P(Y_i = y_i \mid \mathcal{B})$:

    y_i     Empirical   Best-Fitting Poisson
    0       0.071       0.126
    1       0.357       0.261
    2       0.286       0.270
    3       0.143       0.187
    4       0.071       0.097
    5       0           0.040
    6       0.071       0.014
    7       0           0.005
    total   1           1

The fit looks reasonable.

(e)

$$\begin{aligned} \mathrm{Var}(Y_i \mid \theta, \mathcal{B}) &= E[Y_i^2 \mid \theta, \mathcal{B}] - E^2[Y_i \mid \theta, \mathcal{B}] \\ &= \sum_{y=0}^{\infty} y^2 P(Y_i = y \mid \theta, \mathcal{B}) - \theta^2 = \sum_{y=0}^{\infty} y^2 \frac{\theta^y e^{-\theta}}{y!} - \theta^2 \\ &= e^{-\theta} \sum_{y=0}^{\infty} y(y-1) \frac{\theta^y}{y!} + e^{-\theta} \sum_{y=0}^{\infty} y \frac{\theta^y}{y!} - \theta^2 \\ &= \theta^2 e^{-\theta} e^{\theta} + \theta - \theta^2 = \theta \end{aligned}$$
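As with the mean in part (b), the identity $\mathrm{Var}(Y_i \mid \theta, \mathcal{B}) = \theta$ can be checked by a truncated sum (an illustrative sketch; $\theta = 2.5$ is an arbitrary value):

```r
# Truncated-sum check that the Poisson variance equals theta.
theta <- 2.5
y <- 0:100
m1 <- sum(y * dpois(y, theta))     # E(Y)
m2 <- sum(y^2 * dpois(y, theta))   # E(Y^2)
m2 - m1^2                          # approximately theta = 2.5
```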

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
theta <- var(y)
sapply(0:6, function(x) round(ppois(x + 0.5, theta) - ppois(x - 0.5, theta), 3))

## [1] 0.093 0.220 0.262 0.208 0.124 0.059 0.023

round(1 - ppois(6.5, theta), 3)

## [1] 0.011

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
n <- length(y)
theta <- var(y)
emp <- sapply(0:6, function(x) sum(y == x) / n)
fit <- sapply(0:6, function(x) ppois(x + 0.5, theta))
fit <- diff(c(0, fit))
df <- data.frame(empirical = emp, fitting = fit)
df


$P(Y_i = y_i \mid \mathcal{B})$:

    y_i     Empirical   Best-Fitting Poisson
    0       0.071       0.093
    1       0.357       0.220
    2       0.286       0.262
    3       0.143       0.208
    4       0.071       0.124
    5       0           0.059
    6       0.071       0.023
    7       0           0.011
    total   1           1

(f) The observed information is

$$I(\hat{\theta}_{MLE}) = -\frac{d^2 \ell}{d\theta^2}(\hat{\theta}_{MLE}) = \frac{\sum_{i=1}^{n} y_i}{\hat{\theta}_{MLE}^2} = \frac{n}{\hat{\theta}_{MLE}}$$

The estimated standard error is

$$SE(\hat{\theta}_{MLE}) = \sqrt{I^{-1}(\hat{\theta}_{MLE})} = \sqrt{\frac{\hat{\theta}_{MLE}}{n}}$$

From the CLT, the $99.9\% = 100(1-\alpha)\%$ confidence interval for $\theta$ is

$$\hat{\theta}_{MLE} \pm \Phi^{-1}\!\left(1 - \frac{\alpha}{2}\right) SE(\hat{\theta}_{MLE})$$

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
n <- length(y)
theta.MLE <- mean(y)
theta.SE <- sqrt(theta.MLE / n)
alpha <- 0.001

c(theta.MLE - qnorm(1 - alpha / 2) * theta.SE,
  theta.MLE + qnorm(1 - alpha / 2) * theta.SE)

## [1] 0.8057122 3.337145

so the CI is $[0.81, 3.34]$.

(g) Suppose the prior distribution of $\theta$ is the gamma distribution $\Gamma(\alpha, \beta)$. Then

$$f(\theta \mid y) \propto f_n(y \mid \theta)\, f(\theta) = \left( \prod_{i=1}^{n} \frac{\theta^{y_i} e^{-\theta}}{y_i!} \right) \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \theta^{\alpha - 1} e^{-\beta\theta} \propto \theta^{\sum_{i=1}^{n} y_i + \alpha - 1} e^{-(\beta + n)\theta},$$

so the posterior distribution of $\theta$ is $\Gamma(\alpha + \sum_{i=1}^{n} y_i, \beta + n)$.

(h)

(i) From equation (4), the posterior mean is

$$\frac{\alpha + s_n}{\beta + n} = \frac{\alpha}{\beta} \cdot \frac{\beta}{\beta + n} + \frac{s_n}{n} \cdot \frac{n}{\beta + n};$$

since $\alpha/\beta$ is the prior mean and $s_n/n$ is the sample mean, the posterior mean is the weighted average of the prior mean and the sample mean.
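The weighted-average identity in (h)(i) is easy to verify numerically (a sketch, not part of the original solution; the hyperparameters $\alpha = 3$, $\beta = 2$ are arbitrary illustrative choices):

```r
# Check: posterior mean = weighted average of prior mean and sample mean.
y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
n <- length(y); sn <- sum(y)
alpha <- 3; beta <- 2         # arbitrary prior hyperparameters
post.mean <- (alpha + sn) / (beta + n)
weighted  <- (alpha / beta) * beta / (beta + n) + (sn / n) * n / (beta + n)
c(post.mean, weighted)        # the two agree
```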

(j)

(j.i) Since the prior mean is $\alpha/\beta$, making the prior mean agree with the data mean gives $\alpha/\beta = \bar{y} \Longrightarrow \alpha = \beta\bar{y}$; taking $\beta = n$ gives $\alpha = n\bar{y}$.

(j.ii)

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
n <- length(y)
sn <- sum(y)

beta <- length(y)
alpha <- beta * mean(y)
theta <- seq(0, 4, length.out = 100)

prior <- beta^alpha / gamma(alpha) * theta^(alpha - 1) * exp(-beta * theta)
posterior <- (beta + n)^(alpha + sn) / gamma(alpha + sn) * theta^(alpha + sn - 1) * exp(-(beta + n) * theta)

par(mar = c(5, 5, 3, 5))
plot(prior ~ theta, pch = 19, ylab = "", ylim = c(0, 1.5), main = "", xlab = "theta", col = "blue")
points(posterior ~ theta, pch = 7, col = "red")
legend("topleft", c("prior", "posterior"), col = c("blue", "red"), pch = c(19, 7))


(j.iii) The posterior distribution is $\Gamma(\alpha + s_n, \beta + n)$, so the posterior mean is $\frac{\alpha + s_n}{\beta + n} = 2.071429$ and the posterior SD is $\sqrt{\frac{\alpha + s_n}{(\beta + n)^2}} = 0.2719919$; the MLE is 2.071429 and its standard error is 0.3846546, which is larger than the posterior SD.
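The numbers quoted in (j.iii) can be reproduced directly from the posterior $\Gamma(\alpha + s_n, \beta + n)$, using $\beta = n$ and $\alpha = n\bar{y}$ as in (j.i):

```r
# Reproduce the posterior mean and SD quoted in (j.iii).
y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)
n <- length(y); sn <- sum(y)
beta <- n
alpha <- beta * mean(y)
post.mean <- (alpha + sn) / (beta + n)       # 2.071429
post.sd   <- sqrt(alpha + sn) / (beta + n)   # 0.2719919
c(post.mean, post.sd)
```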

(j.iv) The posterior interval for $\theta$ is $[1.033671, 3.574646]$:

y <- c(1, 2, 1, 1, 4, 1, 2, 2, 0, 3, 6, 2, 1, 3)

beta <- length(y)
alpha <- beta * mean(y)
qgamma(0.001 / 2, alpha, beta)

## [1] 1.033671

qgamma(1 - 0.001 / 2, alpha, beta)