Overview

All my homework answers for the Statistical Computing class.

title: "A-21010-2021-09-16"

Question

Use knitr to produce at least three examples(texts,figures,tables).

Answer

example one: random number generation from the Poisson Distribution

set.seed(1014)
x<-rpois(108,lambda=6)
y<-matrix(rpois(12*9,lambda=6),nrow=12,ncol=9,byrow=TRUE)
x;y

\ example two: table of time series

ts(matrix(rpois(108,6),12,9),frequency=12,start=c(1958,1))

\ example three: expression

x<-1:10
y<-runif(10)
z<-rchisq(10,14)
exp1<-expression(x/(y^2+exp(z)))
exp1
eval(exp1)
D(exp1,"x");D(exp1,"y");D(exp1,"z")

\ example four: graphic split

m<-matrix(1:16,4,4)
layout(m,widths=c(1,2,3,4),heights=c(4,3,2,1))
layout.show(16)

\ example five:

x=seq(-10,10,0.01)
plot(x,exp(((-1/2)*x^2))/sqrt(2*pi),xlim=c(-15,15), ylim=c(0,1), main="Standard normal density",
xlab="x", ylab="y")

title: "A-21010-2021-09-23"

Question

Exercises 3.4,3.11,and 3.20(pages 94-96,Statistical Computing with R).

Answer

3.4: Develop an algorithm to generate random samples from a Rayleigh($\sigma$) distribution by the inverse transform method: u<-runif(n); y<--2*log(u); x<-sigma*sqrt(y).
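A short derivation of why this inverse transform works (using the fact that $1-U$ and $U$ have the same Uniform(0,1) distribution):

$$ F(x)=1-e^{-x^2/(2\sigma^2)},\ x\ge 0 \quad\Rightarrow\quad F^{-1}(u)=\sigma\sqrt{-2\ln(1-u)} \quad\Rightarrow\quad X=\sigma\sqrt{-2\ln U}\sim \mathrm{Rayleigh}(\sigma). $$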

set.seed(1014)
n<-1e5
u<-runif(n)
y<--2*log(u)
x1<-sqrt(y);x2<-3*sqrt(y);x3<-5*sqrt(y)  
##Generate Rayleigh(sigma) samples for several choices of sigma>0
hist(x1,prob=TRUE,main = expression(f(x1)==x1*exp(-x1^2/2)))
m1<-seq(0,100,0.1)
lines(m1,m1*exp(-m1^2/2),col="green")
hist(x2,prob=TRUE,main = expression(f(x2)==(x2/9)*exp(-x2^2/18)))
m2<-seq(0,100,0.1)
lines(m2,(m2/9)*exp(-m2^2/18),col="purple")
hist(x3,prob=TRUE,main = expression(f(x3)==(x3/25)*exp(-x3^2/50)))
m3<-seq(0,100,0.1)
lines(m3,(m3/25)*exp(-m3^2/50),col="darkred")
## check that the mode of the generated samples is close to the theoretical mode sigma(check the histogram)

\ 3.11:

set.seed(1014)
n<-1000
X1<-rnorm(n,mean=0,sd=1)
X2<-rnorm(n,mean=3,sd=1)
## Generate a random sample of size 1000 from a normal location mixture.
r1<-rbinom(n,1,0.75) ## mixing indicator with P(r1=1)=0.75
Z1<-r1*X1+(1-r1)*X2 ## mixture: N(0,1) with probability 0.75, N(3,1) with probability 0.25
hist(Z1,prob=TRUE,main="normal mixture, p1=0.75")
x<-seq(-5,6,length.out = 1000)
lines(x,0.75*dnorm(x,0,1)+0.25*dnorm(x,3,1),col="darkblue",lty=2)
## Graph the histogram of the sample with the mixture density superimposed, for p1=0.75.
p1<-sample(c(0,1),n,replace=TRUE) ## mixing indicator with P(p1=1)=0.5
Z2<-p1*X1+(1-p1)*X2
hist(Z2)
## The components of the mixture have N(0,1) and N(3,1) distributions with mixing probabilities p1 and p2=1-p1.
r3<-rbinom(n,1,0.5);r4<-rbinom(n,1,0.3);r5<-rbinom(n,1,0.1) ## mixing indicators for p1=0.5,0.3,0.1
Z3<-r3*X1+(1-r3)*X2;Z4<-r4*X1+(1-r4)*X2;Z5<-r5*X1+(1-r5)*X2
hist(Z3);hist(Z4);hist(Z5)
## Repeat with different values for p1 and observe whether the empirical distribution of the mixture appears to be bimodal.

Conjecture: when p1=0.5, the mixture of N(0,1) and N(3,1) appears bimodal.
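To illustrate the conjecture, a quick sketch (my own addition) overlays the theoretical mixture densities $p_1 N(0,1) + (1-p_1) N(3,1)$ for several values of $p_1$:

xx<-seq(-4,7,length.out=500)
p1.values<-c(0.75,0.5,0.3,0.1)
## theoretical density of the two-component normal mixture for each p1
plot(xx,0.75*dnorm(xx,0,1)+0.25*dnorm(xx,3,1),type="l",ylab="density",main="theoretical mixture densities")
for(p in p1.values[-1]){
  lines(xx,p*dnorm(xx,0,1)+(1-p)*dnorm(xx,3,1),lty=2)
}
legend("topright",legend=paste("p1 =",p1.values),lty=c(1,2,2,2))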

\ 3.20: Write a program to simulate a compound Poisson($\lambda$)-Gamma process (the summands Y have a Gamma distribution).

set.seed(1014)
f<-function(n,t,lambda){
  Nt<-rpois(n,lambda = lambda*t)
  return(Nt)}
g<-function(N, alpha, beta){
Xt<-numeric(length(N))
for(i in 1:length(N)){
Xt[i]<-sum(rgamma(N[i],shape = alpha, scale= 1/beta))} ## X(t) is a sum of N(t) Gamma(alpha, rate=beta) summands
return(Xt)}
lambda<-seq(1,10,by=1)
alpha<-seq(2,length.out = 10,by=2)
beta<-seq(3,30,by=3)
D<-data.frame()
E_X<-numeric()
Var_X<-numeric()
lambda.t.E_Y<-numeric()
Theoretical_Var<-numeric()
for(i in 1:10){
N<-f(1e4,10,lambda[i]) ## N(10) for 1e4 replications
X<-g(N,alpha[i],beta[i])
E_X[i]<-mean(X)
Var_X[i]<-var(X) ## sample variance of the 1e4 simulated values of X(10)
lambda.t.E_Y[i]<-lambda[i]*10*alpha[i]/beta[i] ## theoretical mean: E[X(t)]=lambda*t*E[Yi]
Theoretical_Var[i]<-(lambda[i]*10*alpha[i]*(alpha[i]+1)/(beta[i])^2)}
## theoretical variance: Var[X(t)]=lambda*t*E[Yi^2]
D<-data.frame(lambda,alpha,beta,E_X,Var_X,lambda.t.E_Y,Theoretical_Var)
## Putting different values of lambda, alpha and beta into f() and g(), we simulate the compound Poisson(lambda)-Gamma process (where the summands Y have a Gamma distribution).
D

The estimates of the mean and variance of X(10) for several choices of the parameters (lambda, alpha, beta) are close to the theoretical values.
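For reference, the theoretical moments used in the comparison follow from the standard compound Poisson identities, with $Y\sim \mathrm{Gamma}(\alpha,\mathrm{rate}=\beta)$ so that $E[Y]=\alpha/\beta$ and $E[Y^2]=\alpha(\alpha+1)/\beta^2$:

$$ E[X(t)]=\lambda t\,E[Y]=\frac{\lambda t\alpha}{\beta},\qquad \mathrm{Var}[X(t)]=\lambda t\,E[Y^2]=\frac{\lambda t\,\alpha(\alpha+1)}{\beta^2}. $$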

"A-21010-2021-09-30"

Exercises 5.4, 5.9, 5.13, and 5.14 (pages 149-151, Statistical Computing with R).

Question 5.4

Write a function to compute a Monte Carlo estimate of the Beta(3,3) cdf, and use the function to estimate F(x) for x=0.1,0.2,...,0.9. Compare the estimates with the values returned by the pbeta function in R.

Answer 5.4

\textbf{solution}: The Beta(3,3) density is $$ f(x)=\frac{1}{B(3,3)}x^{3-1}(1-x)^{3-1}=30x^2(1-x)^2,\quad 0<x<1. $$

The Beta(3,3) cumulative distribution function is $$ F(x)=\int_{0}^xf(t)dt=\int_{0}^x30t^2(1-t)^2dt $$

Substituting t=xy gives $$ F(x)=\int_{0}^{1}30(xy)^2(1-xy)^2x\,dy=E_Y[30x^3Y^2(1-xY)^2], $$ where the random variable Y has the Uniform(0,1) distribution.

Generate iid Uniform(0,1) random numbers $u_1,u_2,...,u_m$, and compute $$ \hat{\theta}=\frac{1}{m}\sum_{i=1}^{m}30x^3u_i^2(1-xu_i)^2. $$

set.seed(1014)
n<-1e6
## An auxiliary check by direct simulation: estimate F(x) as the proportion of Beta(3,3) draws that are <= x.
f<-function(x)
{
y<-rbeta(n,3,3)
n_est<-mean(y<=x)
return(c(n_est,pbeta(x,3,3)))
}
set.seed(1014)
x<-seq(0.1,0.9,length=9)
m<-1e6
u<-runif(m)
cdf<-numeric(length(x))
for(i in 1:length(x))
 { g<-30*(x[i]^5)*(u^4)-60*(x[i]^4)*(u^3)+30*(x[i]^3)*(u^2)
  cdf[i]<-mean(g)
}
p.value<-pbeta(x,3,3)
print(round(rbind(x,cdf,p.value),5))

We find that the estimates are very close to the pbeta values for m = 10^6 replications.

Question 5.9

The Rayleigh density [156, (18.76)] is $$ f(x) = \frac{x}{\sigma^2}e^{-x^2/(2\sigma^2)}, \quad x\ge 0, \sigma>0. $$

Implement a function to generate samples from a Rayleigh($\sigma$) distribution, using antithetic variables. What is the percent reduction in variance of $\frac{X+X'}{2}$ compared with $\frac{X_1+X_2}{2}$ for independent $X_1,X_2$?

Answer 5.9

\textbf{solution}:The Rayleigh density is $$ f(x) = \frac{x}{\sigma^2}e^{-x^2/(2\sigma^2)}, \quad x\ge 0, \sigma>0. $$

The Rayleigh cumulative distribution function is $$ F(x)=\int_{0}^x f(t)d t = \int_{0}^x\frac{t}{\sigma^2}e^{-t^2/(2\sigma^2)}dt $$

Substituting t=xy gives $$ F(x)=\int_{0}^{1}\frac{xy}{\sigma^2}e^{-(xy)^2/(2\sigma^2)}x\,dy=E_Y\left[\frac{x^2Y}{\sigma^2}e^{-(xY)^2/(2\sigma^2)}\right], $$ where the random variable Y has the Uniform(0,1) distribution.

Generate random numbers $u_1,u_2,...,u_{m/2}\sim$ Uniform(0,1) and compute half of the replicates using $$ Y_j=\frac{u_jx^2}{\sigma^2}e^{-\frac{(xu_j)^2}{2\sigma^2}},\qquad Y_j'=\frac{(1-u_j)x^2}{\sigma^2}e^{-\frac{[x(1-u_j)]^2}{2\sigma^2}},\qquad j=1,2,...,\frac{m}{2}. $$ The antithetic variable estimator is $$ \hat{\theta}=\frac{1}{m}\sum_{j=1}^{m/2}(Y_j+Y_j')=\frac{1}{m}\sum_{j=1}^{m/2}\left(\frac{u_jx^2}{\sigma^2}e^{-\frac{(xu_j)^2}{2\sigma^2}}+\frac{(1-u_j)x^2}{\sigma^2}e^{-\frac{[x(1-u_j)]^2}{2\sigma^2}}\right) $$

MC.Phi<-function(x,R=1e6,antithetic=TRUE)
{
  u<-runif(R/2)
  sigma<-1
  if(!antithetic)
    v<-runif(R/2)
  else
    v<-1-u
  u<-c(u,v)
  cdf<-numeric(length(x))
  for(i in 1:length(x))
{
  g<-{(u*x[i]^2)/(sigma^2)}*exp(-(u*x[i])^2/(2*sigma^2))
  cdf[i]<-mean(g)
  }
  cdf
}
x<-seq(0.1,3,length=10)
true.cdf<-1-exp(-x^2/2) ## exact Rayleigh(1) cdf for comparison
set.seed(123)
MC1<-MC.Phi(x,antithetic=FALSE)
set.seed(123)
MC2<-MC.Phi(x,antithetic=TRUE)
print(round(rbind(x,MC1,MC2,true.cdf),5))

The approximate reduction in variance for a given x can be estimated by simulating both the antithetic-variable estimator and the simple Monte Carlo estimator.

set.seed(1014)
m<-10000
MC1<-MC2<-numeric(m)
x<-4.5
for(i in 1:m)
{
  MC1[i]<-MC.Phi(x,R=1000,antithetic=FALSE)
  MC2[i]<-MC.Phi(x,R=1000,antithetic=TRUE)
}

print(sd(MC1));print(sd(MC2));print(100*(var(MC1)-var(MC2))/var(MC1)) ## percent reduction in variance

The antithetic estimator $\frac{X+X'}{2}$ achieves approximately a 77.9% reduction in variance compared with $\frac{X_1+X_2}{2}$ for independent $X_1,X_2$ at x = 4.5.

Question 5.13

Find two importance functions $f_1$ and $f_2$ that are supported on $(1,\infty)$ and are 'close' to $$ g(x)=\frac{x^2}{\sqrt{2\pi}}e^{-x^2/2},\quad x>1. $$ Which of your importance functions should produce the smaller variance in estimating $$ \int_{1}^{\infty}\frac{x^2}{\sqrt{2\pi}}e^{-x^2/2}dx $$ by importance sampling? Explain.

Answer 5.13

g<-function(x)
{
  (x^2*exp(-x^2/2))/sqrt(2*pi)*(x>1)
}
integrate(g,1,Inf)

Two importance functions supported on $(1,\infty)$ and 'close' to $g(x)$ are $f_1(x)=e^{-(x-1)}I(x>1)$ and $f_2(x)=\frac{4}{\pi(1+x^2)}I(x>1)$.

set.seed(1014)
f1<-function(x)
{
  exp(-(x-1))*(x>1)
}
f2<-function(x)
{
  ((1+x^2)^(-1)*(x>1)*4/pi)
}
m<-1e7
u<-runif(m)
x1<-1-log(1-u)
x2<-tan(pi*(1+u)/4)
fg<-cbind(g(x1)/f1(x1),g(x2)/f2(x2))
theta.hat<-se<-numeric(2)
theta.hat<-c(mean(fg[,1]),mean(fg[,2]))
se<-c(sd(fg[,1]),sd(fg[,2]))
rbind(theta.hat,se)

$se(\hat\theta_1)=0.1575414 < 0.3308799=se(\hat\theta_2)$, so $f_1(x)=e^{-(x-1)}I(x>1)$ produces the smaller variance in estimating $\int_{1}^{\infty}\frac{x^2}{\sqrt{2\pi}}e^{-x^2/2}dx$.

Question 5.14

Obtain a Monte Carlo estimate of $$ \int_{1}^{\infty}\frac{x^2}{\sqrt{2\pi}}e^{-x^2/2}dx $$ by importance sampling.

Answer 5.14

Consider the importance function $$ f(x)=e^{-x}I(x>0). $$

set.seed(1014)
m<-1e6
theta.hat<-se<-numeric(1)
g<-function(x)
{
  (x^2*exp(-x^2/2))/sqrt(2*pi)*(x>1)
}
integrate(g,1,Inf)
x<-rexp(m,1)
fg<-g(x)/exp(-x)
theta.hat<-mean(fg)
se<-sd(fg)
rbind(theta.hat,se)

So the Monte Carlo estimate of $\int_{1}^{\infty}\frac{x^2}{\sqrt{2\pi}}e^{-x^2/2}dx$ is $$ \hat\theta=\frac{1}{m}\sum_{i=1}^{m}\frac{g(X_i)}{f(X_i)}=\frac{1}{m}\sum_{i=1}^{m}\frac{X_i^2}{\sqrt{2\pi}}e^{X_i-X_i^2/2}I(X_i>1), $$ where the $X_i$ are iid Exponential(1).

title: "A-21010-2021-10-14"

Exercises 6.5 and 6.A (pages 180-181, Statistical Computing with R).

If we obtain the powers for two methods under a particular simulation setting with 10,000 experiments: say,0.651 for one method and 0.676 for another method. We want to know if the powers are different at 0.05 level.

(1)What is the corresponding hypothesis test problem?

(2)what test should we use? Z-test, two-sample t-test, paired-t test or McNemar test? Why?

(3)Please provide the least necessary information for hypothesis testing.

Question 6.5

Suppose a 95% symmetric t-interval is applied to estimate a mean, but the sample data are non-normal. Then the probability that the confidence interval covers the mean is not necessarily equal to 0.95. Use a Monte Carlo experiment to estimate the coverage probability of the t-interval for random samples of $\chi^2(2)$ data with sample size n=20. Compare your t-interval results with the simulation results in Example 6.4. (The t-interval should be more robust to departures from normality than the interval for variance.)

Answer 6.5

\textbf{solution}:\ Use a Monte Carlo experiment to estimate the coverage probability of the t-interval for random samples of $\chi^2(2)$data with sample size n=20.\

set.seed(1212)
alpha<-0.05
n<-20
m<-10000
UCL<-numeric(m)
LCL<-numeric(m)
for (i in 1:m) 
{
  x<-rchisq(n,2)
  LCL[i]<-mean(x)-qt(alpha/2,n-1,lower.tail=FALSE)*sd(x)/(sqrt(n))
  UCL[i]<-mean(x)+qt(alpha/2,n-1,lower.tail=FALSE)*sd(x)/(sqrt(n))
}
mean(LCL<2&UCL>2) ## coverage probability of the t-interval
UCL1<-numeric(m)
LCL1<-numeric(m)
for (i in 1:m) 
{
  x<-rchisq(n,2)
  LCL1[i]<-mean(x)-qnorm(alpha/2,lower.tail=FALSE)*sd(x)/(sqrt(n))
  UCL1[i]<-mean(x)+qnorm(alpha/2,lower.tail=FALSE)*sd(x)/(sqrt(n))
}
mean(LCL1<2&UCL1>2) ## coverage probability of the normality interval
n <- 20
alpha <- .05
m <- 10000
set.seed(1212)
xbar <- numeric(m)
es <- numeric(m)
for (i in 1:m) {
   x <- rchisq(n, df = 2)
   xbar[i] <- mean(x)
   es[i] <- qt((1-alpha/2),n-1)*sd(x)/sqrt(n) ## use the same sample for the mean and its standard error
}
# The symmetric t-interval for the mean is xbar +/- t(1-α/2)(n-1)*s/sqrt(n)
p <- mean((xbar-es)<2 & 2<(xbar+es))
# The coverage probability is the proportion of chi-square samples whose t-interval contains the population mean 2.
p

Conclusion: The t-interval is more robust to departures from normality than the interval for variance in Example 6.4.

Question 6.A

Use Monte Carlo simulation to investigate whether the empirical Type I error rate of the t-test is approximately equal to the nominal significance level $\alpha$ when the sampled population is non-normal. The t-test is robust to mild departures from normality. Discuss the simulation results for the cases where the sampled population is (i) $\chi^2(1)$, (ii) Uniform(0,2), and (iii) Exponential(rate=1). In each case, test $H_0:\mu=\mu_0$ vs $H_a:\mu\neq\mu_0$, where $\mu_0$ is the mean of $\chi^2(1)$, Uniform(0,2), and Exponential(1), respectively.

Answer 6.A

\textbf{solution}: (1) For the chi-square distribution with 1 degree of freedom.

set.seed(1212)
n<-1000
alpha<-0.05
m<-10000
p<-numeric(m)
for (j in 1:m) {
u1=runif(n)
u2=runif(n)
z=sqrt(-2*log(u1))*cos(2*pi*u2) ## Box-Muller transform: z ~ N(0,1)
x=z^2                           ## so x ~ chi-square(1), which has mean 1
ttest <- t.test(x, mu = 1)
p[j] <- ttest$p.value
}
p.hat <- mean(p < alpha)
print(c(p.hat,alpha))

Conclusion:\ (i) The empirical Type I error rate of the t-test is approximately equal to the nominal significance level $\alpha$.\ (ii) In most replicates the p-value exceeds $\alpha$, so $H_0:\mu=1$ is not rejected.\ (2) For the uniform distribution with parameters a=0 and b=2.

set.seed(23569)
n<-1000
alpha<-0.05
m<-10000
p<-numeric(m)
for (j in 1:m) {
u=runif(n)
x=2*u ## x ~ Uniform(0,2), which has mean 1
ttest <- t.test(x, mu = 1)
p[j] <- ttest$p.value
}
p.hat <- mean(p < alpha)
print(c(p.hat,alpha))

Conclusion:\ (i) The empirical Type I error rate of the t-test is approximately equal to the nominal significance level $\alpha$.\ (ii) In most replicates the p-value exceeds $\alpha$, so $H_0:\mu=1$ is not rejected.\ (3) For the exponential distribution with mean 1.

set.seed(17852)
n<-1000
alpha<-0.05
m<-10000
p<-numeric(m)
for (j in 1:m) {
u=runif(n)
x=-log(1-u) ## inverse transform: x ~ Exponential(1), which has mean 1
ttest <- t.test(x, mu = 1)
p[j] <- ttest$p.value
}
p.hat <- mean(p < alpha)
print(c(p.hat,alpha))

Conclusion:\ (i) The empirical Type I error rate of the t-test is approximately equal to the nominal significance level $\alpha$.\ (ii) In most replicates the p-value exceeds $\alpha$, so $H_0:\mu=1$ is not rejected.\ As can be seen from the results, the empirical Type I error rate of the t-test is approximately equal to the nominal significance level $\alpha$ when the sampled population is non-normal (such as $\chi^{2}(1)$, Uniform(0,2) and Exponential(rate=1)). The t-test is robust to these mild departures from normality, and in each case $H_0:\mu=1$ is not rejected.

Question

If we obtain the powers for two methods under a particular simulation setting with 10,000 experiments: say, 0.651 for one method and 0.676 for another method. We want to know if the powers are different at the 0.05 level. (1) What is the corresponding hypothesis test problem? (2) What test should we use: Z-test, two-sample t-test, paired-t test or McNemar test? Why? (3) Please provide the least necessary information for hypothesis testing.

Answer

\textbf{solution}:\ From the given information alone we cannot necessarily conclude that the powers are different at the 0.05 level.\

(1) The corresponding hypothesis test problem is whether the powers are different, that is, "$H_{0}: power_{1}=power_{2}$ vs $H_{1}: power_{1} \neq power_{2}$".\ (2) We should use the paired-t test.\ (3) The problem gives only one estimate for each of $power_{1}$ and $power_{2}$, so the whole simulation should be repeated several times to obtain a sample of $power_{1}$ and $power_{2}$ values and hence a sample of their differences $d = power_{1} - power_{2}$. The standardized mean of these differences approximately follows a t distribution, so the null hypothesis can be tested with a paired-t test (see the sketch below).
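A minimal sketch of how the paired comparison could be carried out, assuming the whole power simulation is rerun K times and the two power estimates are recorded each time (the vectors power1 and power2 below are hypothetical placeholders, not results from an actual study):

set.seed(1)
K <- 20
## hypothetical repeated power estimates from K independent reruns of the simulation
power1 <- rnorm(K, 0.651, 0.01) ## placeholder values, for illustration only
power2 <- rnorm(K, 0.676, 0.01) ## placeholder values, for illustration only
t.test(power1, power2, paired = TRUE) ## paired-t test of H0: power1 = power2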

title: "A-21010-2021-10-21"

Exercise 6.C (page 182, Statistical Computing with R).

Question 6.C

Repeat Examples 6.8 and 6.10 for Mardia's multivariate skewness test.Mardia[187] proposed tests of multivariate normality based on multivariate generalizations of skewness and kurtosis.If X and Y are iid,the multivariate population skewness $\beta_{1,d}$ is defined by Mardia as\ $$ \beta_{1,d}=E[(X-\mu)^T\Sigma^{-1}(Y-\mu)]^3. $$ \ under normality,$\beta_{1,d}=0$.The multivariate skewness statistic is\ $$ b_{1,d}=\frac{1}{n^2}\sum_{i,j=1}^{n}((X_i-\bar{X})^T{\hat{\Sigma}}^{-1}(X_j-\bar{X}))^3, $$ \ where $\hat\Sigma$ is the maximum likelihood estimator of covariance.Large values of $b_{1,d}$ are significant.The asymptotic distribution of $nb_{1,d}/6$ is chisquared with $d(d+1)(d+2)/6$ degrees of freedom.\

Solution: The test problem:\ $H_0:\beta_{1,d}=0$ vs $H_a:\beta_{1,d}\neq0$.\ The test statistic:\ $$ \chi^2=nb_{1,d}/6=\frac{1}{6n}\sum_{i,j=1}^{n}((X_i-\bar{X})^T{\hat{\Sigma}}^{-1}(X_j-\bar{X}))^3, $$ \ which satisfies $\chi^2=nb_{1,d}/6\sim\chi^2(\frac{d(d+1)(d+2)}{6})$ asymptotically under $H_0:\beta_{1,d}=0$, where d is the dimension of $X$.\ Critical value: $H_0:\beta_{1,d}=0$ is rejected if $\chi^2>\chi^2_{\alpha}(\frac{d(d+1)(d+2)}{6})$.\

Repeat Example 6.8: Assess the Type I error rate of the multivariate skewness test of normality at $\alpha=0.05$, based on the asymptotic distribution of $nb_{1,d}/6$, for sample sizes n=10, 20, 30, 50, 100 and 500.\ For each sample size let d=1,2,3 and repeat the experiment m=10000 times.\

set.seed(1212)
c.v=function(d){
  return(qchisq(0.95,df=d*(d+1)*(d+2)/6))
}
critical.value=c(c.v(1),c.v(2),c.v(3)) ## compute the critical values for each d
s<- function(x) {
  n=dim(x)[1]
  d=dim(x)[2]
  matrix=(scale(x))%*%t(scale(x)) ## pairwise inner products of the column-standardized observations
  statistic=sum(matrix^3)
  return( statistic/n/6 ) ## the multivariate skewness test statistic n*b_{1,d}/6
} ## computes the sample skewness statistic
n<-c(10,20,30,50,100,500) ## sample sizes
m<-1e4 ## repeat experiment times
Type.I.error=matrix(nrow=length(n),ncol=3)
for(i in 1:length(n)){
  for(d in 1:3){
    result=numeric(m)
    for(j in 1:m){
      x=array(rnorm(n[i]*d),dim=c(n[i],d))
      result[j]=as.integer(abs(s(x)) >=critical.value[d] )
    }
    Type.I.error[i,d]=mean(result)
  }
}
dimnames(Type.I.error)[[1]]=c("size=10","size=20","size=30","size=50","size=100","size=500")
dimnames(Type.I.error)[[2]]=c("d=1","d=2","d=3")
print(Type.I.error)

\ So when the sample size is large, the Type I error rate gets close to the nominal level $\alpha=0.05$.\ Repeat Example 6.10: The skewness test of normality was described above (Example 6.8). In this example, we estimate by simulation the power of the skewness test of normality against a contaminated normal (normal scale mixture) alternative described in Example 6.3. The contaminated normal distribution is denoted by\ $$ (1-\epsilon)N(\mu=0,\sigma^2=1)+\epsilon N(\mu=0,\sigma^2=100),\quad 0\leq \epsilon \leq1. $$ \ When $\epsilon=0$ or $\epsilon=1$ the distribution is normal, but the mixture is non-normal for $0<\epsilon<1$. We can estimate the power of the skewness test for a sequence of alternatives indexed by $\epsilon$ and plot a power curve of the skewness test against this type of alternative. For this experiment, the significance level is $\alpha=0.05$, the sample size is n=100, and the dimension is d=1,2,3.\

set.seed(1212)
alpha<-0.05
n<-100
m<-5000
eps<-c(seq(0, .2, .01), seq(.2, 1, .05))
N<-length(eps)
power1<-numeric(N)
c.v=function(d){
  return(qchisq(0.95,df=d*(d+1)*(d+2)/6))
}
critical.value=c(c.v(1),c.v(2),c.v(3)) ## compute the critical values for each d
s<- function(x) {
  statistic=0
  sigma=cov(x,x)
  n=dim(x)[1]
  matrix=(scale(x))%*%t(scale(x))
  statistic=sum(matrix^3)
  return( statistic/n/6 ) 
} ## computes the sample skewness
for (j in 1:N) { 
 e<-eps[j]
 sktests1<-numeric(m)
 for (i in 1:m) { 
 sig1<-sample(c(1, 10), replace = TRUE,size = n*1, prob = c(1-e, e))
 x=array(rnorm(n*1,0,sig1),dim=c(n,1))
 sktests1[i] <- as.integer(s(x) >= c.v(1))
}
power1[j]<-mean(sktests1)
}
plot(eps, power1, type = "b",
xlab = bquote(eps,splice = FALSE), ylim = c(0,1),main="Power for dimension one")
abline(a=NULL,b=NULL,h = 0.05, lty = 3)

\

set.seed(1212)
alpha<-0.05
n<-100
m<-5000
eps<-c(seq(0, .2, .01), seq(.2, 1, .05))
N<-length(eps)
power2<-numeric(N)
c.v=function(d){
  return(qchisq(0.95,df=d*(d+1)*(d+2)/6))
}
critical.value=c(c.v(1),c.v(2),c.v(3)) ## compute the critical values for each d
s<- function(x) {
  statistic=0
  sigma=cov(x,x)
  n=dim(x)[1]
  matrix=(scale(x))%*%t(scale(x))
  statistic=sum(matrix^3)
  return( statistic/n/6 ) 
} ## computes the sample skewness
for (j in 1:N) { 
 e<-eps[j]
 sktests2<-numeric(m)
 for (i in 1:m) { 
  sig2<-sample(c(1, 10), replace = TRUE,size = n*2, prob = c(1-e, e))
 x=array(rnorm(n*2,0,sig2),dim=c(n,2))
 sktests2[i] <- as.integer(s(x) >= c.v(2))
}
power2[j]<-mean(sktests2)
}
plot(eps, power2, type = "b",
xlab = bquote(eps,splice = FALSE), ylim = c(0,1),main="Power for dimension two")
abline(a=NULL,b=NULL,h = 0.05, lty = 3)

\

set.seed(1212)
alpha<-0.05
n<-100
m<-5000
eps<-c(seq(0, .2, .01), seq(.2, 1, .05))
N<-length(eps)
power3<-numeric(N)
c.v=function(d){
  return(qchisq(0.95,df=d*(d+1)*(d+2)/6))
}
critical.value=c(c.v(1),c.v(2),c.v(3)) ## compute the critical values for each d
s<- function(x) {
  statistic=0
  sigma=cov(x,x)
  n=dim(x)[1]
  matrix=(scale(x))%*%t(scale(x))
  statistic=sum(matrix^3)
  return( statistic/n/6 ) 
} ## computes the sample skewness
for (j in 1:N) { 
 e<-eps[j]
 sktests3<-numeric(m)
 for (i in 1:m) { 
  sig3<-sample(c(1, 10), replace = TRUE,size = n*3, prob = c(1-e, e))
 x=array(rnorm(n*3,0,sig3),dim=c(n,3))
 sktests3[i] <- as.integer(s(x) >= c.v(3))
}
power3[j]<-mean(sktests3)
}
plot(eps, power3, type = "b",
xlab = bquote(eps,splice = FALSE), ylim = c(0,1),main="Power for dimension three")
abline(a=NULL,b=NULL,h = 0.05, lty = 3)

Conclusion: When $\epsilon=0$ or $\epsilon=1$ the distribution is normal, but the mixture is non-normal for $0<\epsilon<1$. With significance level $\alpha=0.05$, sample size n=100 and dimensions d=1,2,3, the empirical power of the test is clearly higher than 0.05 for intermediate $\epsilon$, and it is highest when $\epsilon\approx0.1$.

title: "A-21010-2021-10-28"

Exercises 7.7, 7.8, 7.9, and 7.B (Statistical Computing with R).

Question 7.7

Refer to Exercise 7.6. Efron and Tibshirani discuss the following example [84,Ch. 7]. The five-dimensional scores data have a 5 × 5 covariance matrix $\Sigma$,with positive eigenvalues $\lambda_1>\cdots>\lambda_5$. In principal components analysis,\ $$ \theta=\frac{\lambda_1}{\sum_{j=1}^{5}\lambda_j} $$ \ measures the proportion of variance explained by the first principal component. Let $\hat\lambda_1>\cdots>\hat \lambda_5$ be the eigenvalues of $\hat\Sigma$, where $\hat\Sigma$ is the MLE of $\Sigma$.Compute the sample estimate\ $$ \hat\theta=\frac{\hat\lambda_1}{\sum_{j=1}^{5}\hat\lambda_j} $$ \ of $\theta$.Use bootstrap to estimate the bias and standard error of $\hat\theta$.\

Solution:

Refer to Exercise 7.6.Efron and Tibshirani discuss the scor (bootstrap) test score data on 88 students who took examinations in five subjects [84, Table 7.1], [188, Table 1.2.1].The first two tests (mechanics, vectors) were closed book and the last three tests (algebra, analysis, statistics) were open book. Each row of the data frame is a set of scores $(x_{i1},\cdots,x_{i5})$ for the $i^{th}$ student.\ Use bootstrap to estimate the bias and standard error of $\hat\theta$.\

library(bootstrap)
data(scor)
X<-scor
n<-nrow(X)
p<-ncol(X)
set.seed(1212)
B<-1e4
f <- function(X){
m <-cov(X)
lambda <- eigen(m)$values
return(lambda[1]/sum(lambda)) ## proportion of variance explained by the first principal component
}
sigma <- cov(scor)
s <- eigen(sigma)
v <- s$values[order(-(s$values))]
theta <- v[1]/sum(v) ## measures the proportion of variance explained by the first principal component of the original sample.
boot <- 1:B
for(i in 1:B){
index <- sample(1:n,n,replace=TRUE)
X_boot <- X[index,]
boot[i] <- f(X_boot)
}
theta_hat <- f(X)
bias.bootstrap <- mean(boot) - theta_hat ## Obtain the bootstrap estimates of bias of θ.hat
se.bootstrap <- sqrt(var(boot)) ## Obtain the bootstrap estimates of standard error of θ.hat
print(theta,digits = 8)
print(bias.bootstrap,digits = 8)
print(se.bootstrap,digits = 8)

\ So the bootstrap estimates of the bias and standard error of $\hat\theta$ (with B=10000) are\ $$ \widehat{bias}_B(\hat\theta)=0.0020850521,\qquad \widehat{se}_B(\hat\theta)=0.047149043. $$ \

Question 7.8

Refer to Exercise 7.7. Obtain the jackknife estimates of bias and standard error of $\hat\theta$.\

Solution:

Use jackknife to estimate the bias and standard error of $\hat\theta$.\

library(bootstrap)
set.seed(1212)
sigma <- cov(scor)
s <- eigen(sigma)
v <- s$values[order(-(s$values))]
theta <- v[1]/sum(v) ## measures the proportion of variance explained by the first principal component of the original sample.
n <- 88
theta.hat <- numeric(n)
for (i in 1:n) {
  scor.jack <- scor[-i, ]
  sigma.hat <- cov(scor.jack)
  ss <- eigen(sigma.hat)
  vv <- ss$values[order(-(ss$values))]
  theta.hat[i] <- vv[1]/sum(vv)
}
bias.jackknife <- (n-1)*(mean(theta.hat)-theta) ## Obtain the jackknife estimates of bias of θ.hat
se.jackknife <- sqrt((n-1)*mean((theta.hat-theta)^2)) ## Obtain the jackknife estimates of standard error of θ.hat
print(theta,digits = 8)
print(bias.jackknife,digits = 8)
print(se.jackknife,digits = 8)

\ So the jackknife estimates of the bias and standard error of $\hat\theta$ are\ $$ \widehat{bias}_J(\hat\theta)=0.0010691389,\qquad \widehat{se}_J(\hat\theta)=0.04955244. $$ \

Question 7.9

Refer to Exercise 7.7. Compute 95% percentile and BCa confidence intervals for $\hat\theta$.\

Solution:

Using the function boot.ci to compute 95% percentile and BCa confidence intervals for $\hat\theta$.\

library(boot)
set.seed(827)
data(scor, package = "bootstrap")
boot.theta<-function(X,i){
  sigma.hat <- cov(X[i,])
  ss <- eigen(sigma.hat)
  vv <- ss$values[order(-(ss$values))]
  vv[1]/sum(vv) ## return the statistic computed on the bootstrap resample X[i,]
}
boot.obj <- boot(scor, statistic = boot.theta,R = 2000)
print(boot.obj)
boot.ci(boot.obj,type=c("perc","bca"))

\ So (1) the 95% percentile confidence interval for $\hat\theta$ is (0.5229, 0.7051) and (2) the 95% BCa confidence interval for $\hat\theta$ is (0.5216, 0.7045).

Question 7.B

Repeat Project 7.A for the sample skewness statistic. Compare the coverage rates for normal populations (skewness 0) and $\chi^2(5)$ distributions (positive skewness).\

Solution(normal population):

Repeat Project 7.A for the sample skewness statistic: estimate the coverage probabilities of (1) the standard normal bootstrap confidence interval, (2) the basic bootstrap confidence interval, and (3) the percentile confidence interval. Sample from a normal population and check the empirical coverage rates for the sample skewness statistic. Find (i) the proportion of times that the confidence intervals miss on the left, and (ii) the proportion of times that the confidence intervals miss on the right.\

library(boot)
set.seed(827)
## computes the sample skewness.
skewness <- function(x,i) {
  xbar <- mean(x[i])
  m3 <- mean((x[i] - xbar)^3)
  m2 <- mean((x[i] - xbar)^2)
  return( m3 / m2^1.5 )
}
s <- 0 ## normal populations (skewness=0)
n <- 20 ## the size of sample
m <- 1000
normal.norm <- normal.basic <- normal.perc <- matrix(0, m, 2)
for (i in 1:m) {
  data.normal <- rnorm(n, 0, 5)
  normal.ske <- boot(data.normal, statistic = skewness, R=1000)
  normal <- boot.ci(normal.ske, type=c("norm","basic","perc"))
  normal.norm[i,] <- normal$norm[2:3]
  normal.basic[i,] <- normal$basic[4:5]
  normal.perc[i,] <- normal$percent[4:5]
}
## Calculate the coverage probability of a normal distribution
    norm <- mean(normal.norm[,1] <= s & normal.norm[,2] >= s)
    basic <- mean(normal.basic[,1] <= s & normal.basic[,2] >= s)
    perc <- mean(normal.perc[,1] <= s & normal.perc[,2] >= s)
## (i)Calculate the probability of the left side of the normal distribution
   norm.left <- mean(normal.norm[,1] >= s )
    basic.left <- mean(normal.basic[,1] >= s )
    perc.left <- mean(normal.perc[,1] >=s )
## (ii)Calculate the probability of the right side of the normal distribution
   norm.right <- mean(normal.norm[,2] <= s )
    basic.right <- mean(normal.basic[,2] <= s )
    perc.right <- mean(normal.perc[,2] <= s)
Distribution<-"N(0,25)"
Type <- c( "norm","basic","perc")
Left <- c(norm.left, basic.left,perc.left)
Right <- c(norm.right, basic.right, perc.right)
P.coverage <- c(norm,basic,perc)
result <- data.frame(Distribution, Type, Left, Right, P.coverage)
knitr::kable(result) 

\ After repeating Project 7.A for the sample skewness statistic, we find that with sample size 20 the coverage probabilities of the three types of confidence intervals ((1) the standard normal bootstrap confidence interval, (2) the basic bootstrap confidence interval, and (3) the percentile confidence interval) are all close to 0.9 for the normal population (skewness 0). The table also gives (i) the proportion of times the confidence intervals miss on the left and (ii) the proportion of times they miss on the right. \

Solution(chisq with 5 freedom distribution):

Repeat Project 7.A for the sample skewness statistic: estimate the coverage probabilities of (1) the standard normal bootstrap confidence interval, (2) the basic bootstrap confidence interval, and (3) the percentile confidence interval. Sample from a $\chi^2(5)$ distribution and check the empirical coverage rates for the sample skewness statistic. Find (i) the proportion of times that the confidence intervals miss on the left, and (ii) the proportion of times that the confidence intervals miss on the right.\

library(boot)
set.seed(827)
## computes the sample skewness.
skewness <- function(x,i) {
  xbar <- mean(x[i])
  m3 <- mean((x[i] - xbar)^3)
  m2 <- mean((x[i] - xbar)^2)
  return( m3 / m2^1.5 )
}
s<-sqrt(8/5) ## skewness of the chi-square(5) distribution: sqrt(8/5), approximately 1.265 (positive skewness)
n<-20 ## the size of sample
m<-1000
chisq.norm<-chisq.basic<-chisq.perc<-matrix(0, m, 2)
for (i in 1:m) {
  data.chisq<-rchisq(n,5)
  chisq.ske<-boot(data.chisq,statistic=skewness, R=1000)
  chisq<- boot.ci(chisq.ske,type=c("norm","basic","perc"))
  chisq.norm[i,]<-chisq$norm[2:3];
  chisq.basic[i,]<-chisq$basic[4:5];
  chisq.perc[i,]<-chisq$percent[4:5];
}
## Calculate the coverage probability of the chi-square distribution.
    norm <- mean(chisq.norm[,1] <= s & chisq.norm[,2] >= s)
    basic <- mean(chisq.basic[,1] <= s & chisq.basic[,2] >= s)
    perc <- mean(chisq.perc[,1] <= s & chisq.perc[,2] >= s)
## (i) Calculate the probability of the left side of the chi-square distribution.
    norm.left <- mean(chisq.norm[,1] >= s )
    basic.left <-mean(chisq.basic[,1] >= s )
    perc.left <- mean(chisq.perc[,1] >=s )
## (ii) Calculate the probability of the right side of the chi-square distribution.
    norm.right <- mean(chisq.norm[,2] <= s )
    basic.right <- mean(chisq.basic[,2] <= s )
    perc.right <- mean(chisq.perc[,2] <= s)
Distribution<-"chisq(5)"
Type <- c( "norm","basic","perc")
Left <- c(norm.left, basic.left,perc.left)
Right <- c(norm.right, basic.right, perc.right)
P.coverage <- c(norm,basic,perc)
result <- data.frame(Distribution, Type, Left, Right, P.coverage)
knitr::kable(result) 

\ After repeating Project 7.A for the sample skewness statistic, we find that with sample size 20 the coverage probabilities of the three types of confidence intervals ((1) the standard normal bootstrap confidence interval, (2) the basic bootstrap confidence interval, and (3) the percentile confidence interval) are all close to 0.7 for the $\chi^2(5)$ distribution (positive skewness). The table also gives (i) the proportion of times the confidence intervals miss on the left and (ii) the proportion of times they miss on the right.\ Comparing the coverage rates for the normal population (skewness 0) and the $\chi^2(5)$ distribution (positive skewness): with sample size 20, the coverage probabilities of the three types of confidence intervals are close to 0.9 for the normal distribution but only about 0.7 for the $\chi^2(5)$ distribution.

title: "A-21010-2021-11-04"

Exercise 8.2 (page 242,Statistical Computing with R).

Design experiments for evaluating the performance of the NN,energy, and ball methods in various situations.

Unequal variances and equal expectations.\ Unequal variances and unequal expectations.\ Non-normal distributions: t distribution with 1 df (heavy-tailed distribution), bimodal distribution (mixture of two normal distributions).\ Unbalanced samples (say, 1 case versus 10 controls).\ NOTE: The parameters should be chosen such that the powers are distinguishable (say, range from 0.3 to 0.8).

Question 8.2

Implement the bivariate Spearman rank correlation test for independence[255] as a permutation test. The Spearman rank correlation test statistic can be obtained from function cor with method = "spearman". Compare the achieved significance level of the permutation test with p-value reported by cor.test on the same samples.

Solution: The Spearman's rank correlation test statistic is:\ $$ \rho=\frac{\sum_{i=1}^{n}R(x_i)R(y_i)-n(\frac{n+1}{2})^2}{\sqrt{\sum_{i=1}^{n}R(x_i)^2-n(\frac{n+1}{2})^2}\sqrt{\sum_{i=1}^{n}R(y_i)^2-n(\frac{n+1}{2})^2}} $$ \ where $R(x)$ and $R(y)$ are the ranks of a pair of variables ($x$ and $y$) each containing n observations.

set.seed(1212)
data <- as.matrix(iris[1:50, 1:4])
x <- data[ , 3:4]
y <- data[ , 1:2]
R <- 999
z <- c(x,y)
K <- 1:200
n <- length(x)
permutation <- numeric(R)
cor0 <- cor(as.vector(x),as.vector(y),method="spearman") ## Spearman rank correlation of the original pairing
for (i in 1:R) {
  k <- sample(K,size = n,replace = FALSE)
  x1 <- z[k]
  y1 <- z[-k]
  permutation[i] <- cor(x1,y1,method="spearman") ## Spearman statistic on the permuted data
}
permutation.p <- mean(abs(c(cor0,permutation))>=abs(cor0)) ## achieved significance level
permutation.p
cor.test(as.vector(x), as.vector(y), method = "spearman", alternative = "two.sided")

\ The achieved significance level of the permutation test and the p-value reported by cor.test are both very small (well below 0.05), so x and y are correlated, not independent.\

Question: Design experiments for evaluating the performance of the NN, energy, and ball methods in various situations.

Unequal variances and equal expectations.\ Unequal variances and unequal expectations.\ Non-normal distributions: t distribution with 1 df (heavy-tailed distribution), bimodal distribution (mixture of two normal distributions).\ Unbalanced samples (say, 1 case versus 10 controls).\ NOTE: The parameters should be chosen such that the powers are distinguishable (say, range from 0.3 to 0.8).

Solution:

set.seed(1212)
library(boot)
library(RANN)
library(energy)
library(Ball)
library(ggplot2)
## write the function of the NN test.
Tn <- function(z, ix, sizes,k){
  n1 <- sizes[1]; n2 <- sizes[2]; n <- n1 + n2
  if(is.vector(z)) z <- data.frame(z,0);
  z <- z[ix, ];
  NN <- nn2(data=z, k=k+1) 
  block1 <- NN$nn.idx[1:n1,-1]
  block2 <- NN$nn.idx[(n1+1):n,-1]
  i1 <- sum(block1 < n1 + .5)
  i2 <- sum(block2 > n1+.5)
  (i1 + i2) / (k * n)
} ## the function of NN method.
NN_test <- function(x,y,NumR=999,k=3){
N <- c(length(x), length(y)) 
z <- c(x,y)
boot.obj <- boot(data = z, statistic = Tn, R = NumR, sim = "permutation", sizes = N,k=k)
ts <- c(boot.obj$t0,boot.obj$t) 
p.value <- mean(ts>=ts[1])
return(p.value)
}
## Energy test
Energy_test <- function(x,y,NumR=999){
N <- c(length(x), length(y)) 
z <- c(x,y)
boot.obs <- eqdist.etest(z, sizes=N, R=NumR) 
p.value <- boot.obs$p.value
return(p.value)
}
alpha <- 0.05
## Ball test: ball statistic test for equal distributions can be directly achieved by  Ball package.

\

Unequal variances and equal expectations

set.seed(1212)
m <- 1000; ## number of permutation tests
k<-3; ## boot parameter
p<-2; ## dimension of data
mu <- 0; 
n1 <- n2 <- 50; ## the sample size of x and y
R<-999; ## boot parameter
sigma1 <- 1 
sigma2 <- 2
## x~N(0,1) 50 y~N(0,4) 50
p.values1 <- matrix(NA,m,3) 
for(i in 1:m){ 
x <- rnorm(n1,mu,sigma1)
y <- rnorm(n2,mu,sigma2)
z <- c(x,y) 
p.values1[i,1] <- NN_test(x,y,R,k) 
p.values1[i,2] <- Energy_test(x,y,R)
p.values1[i,3] <- bd.test(x,y,R=R,seed=i*1212)$p.value 
}
pow <- colMeans(p.values1<alpha)
power <- data.frame(methods = c('NN','energy','Ball'),pow)
power
ggplot(power,aes(methods,pow))+geom_col(fill = 'lightblue')+coord_flip()

\ Conclusion (unequal variances and equal expectations): the Ball method has higher power than the NN and energy methods.\

Unequal variances and unequal expectations

set.seed(1212)
m <- 1000; ## number of permutation tests
k<-3; ## boot parameter
p<-2; ## dimension of data
n1 <- n2 <- 50; ## the sample size of x and y
R<-999; ## boot parameter
mu <- 0.5
sd <- 1.5
#x~N(0,1) 50  y~N(0.5,2.25) 50
p.values2 <- matrix(NA,m,3) 
for(i in 1:m){
  x <- matrix(rnorm(n1*p),ncol=p)
  y <- matrix(rnorm(n2*p,mean=mu,sd=sd),ncol=p)
  z <- rbind(x,y)
p.values2[i,1] <- NN_test(x,y,R,k) 
p.values2[i,2] <- Energy_test(x,y,R)
p.values2[i,3] <- bd.test(x,y,R=R,seed=i*1212)$p.value 
}
pow <- colMeans(p.values2<alpha)
power <- data.frame(methods = c('NN','energy','Ball'),pow)
power
ggplot(power,aes(methods,pow))+geom_col(fill = 'lightblue')+coord_flip()

\ Conclusion (unequal variances and unequal expectations): the Ball method has higher power than the NN and energy methods.\

Non-normal distributions: t distribution with 1 df (heavy-tailed distribution), bimodal distribution (mixture of two normal distributions)

set.seed(1212)
m <- 1000; ## number of permutation tests
k<-3; ## boot parameter
n1 <- n2 <- 50; ## the sample size of x and y
df=1 ## degrees of freedom of the t distribution (heavy-tailed distribution)
R<-999;## boot parameter
p=0.3 ## mixing probability of the first normal component
mu1 =-1
mu2 = 1
sigma1 =1
sigma2 =2
#x ~ t with 1 df (sample size 50); y ~ 0.3*N(-1,1) + 0.7*N(1,sd=2) mixture (sample size 50)
p.values3 <- matrix(NA,m,3) 
for(i in 1:m){ 
x <- rt(n1,df=df)
ind <- rbinom(n2,1,p) ## mixing indicator
y <- ind*rnorm(n2,mu1,sigma1) + (1-ind)*rnorm(n2,mu2,sigma2) ## bimodal normal mixture
z <- c(x,y) 
p.values3[i,1] <- NN_test(x,y,R,k) 
p.values3[i,2] <- Energy_test(x,y,R)
p.values3[i,3] <- bd.test(x,y,R=R,seed=i*1212)$p.value 
}
pow <- colMeans(p.values3<alpha)
power <- data.frame(methods = c('NN','energy','Ball'),pow)
power
ggplot(power,aes(methods,pow))+geom_col(fill ='lightblue')+coord_flip()

\ Conclusion (non-normal distributions: t distribution with 1 df and a bimodal normal mixture): the energy method has higher power than the NN and Ball methods.\

Unbalanced samples (say, 1 case versus 10 controls)

set.seed(122)
m <- 1000; ## number of permutation tests
k<-3; ## boot parameter
n1 <- 100
n2 <- 10
R<-999; ## boot parameter
mu1 =-1
mu2 = 0
sigma1 =1
sigma2 =2
#x ~ N(-1,1) with n1=100; y ~ N(0,sd=2) with n2=10
p.values4 <- matrix(NA,m,3) 
for(i in 1:m){ 
x <- rnorm(n1,mu1,sigma1)
y <- rnorm(n2,mu2,sigma2)
z <- c(x,y)
p.values4[i,1] <- NN_test(x,y,R,k) 
p.values4[i,2] <- Energy_test(x,y,R)
p.values4[i,3] <- bd.test(x,y,R=R,seed=i*1212)$p.value 
}
pow <- colMeans(p.values4<alpha)
power <- data.frame(methods = c('NN','energy','Ball'),pow)
power
ggplot(power,aes(methods,pow))+geom_col(fill ='lightblue')+coord_flip()

\ Conclusion (unbalanced samples, say 1 case versus 10 controls): the energy method has higher power than the NN and Ball methods.\

title: "A-21010-2021-11-11"

Exercise 9.3 and 9.8 (pages 277-278,Statistical Computing with R).\

For each of the above exercise, use the Gelman-Rubin method to monitor convergence of the chain, and run the chain until it converges approximately to the target distribution according to $\hat{R}<1.2$.\

Question 9.3

Use the Metropolis-Hastings sampler to generate random variables from a standard Cauchy distribution. Discard the first 1000 values of the chain, and compare the deciles of the generated observations with the deciles of the standard Cauchy distribution (see qcauchy or qt with df=1). Recall that a Cauchy$(\theta, \eta)$ distribution has density function\ $$ f(x)=\frac{1}{\theta\pi(1+[(x-\eta)/\theta]^2)},\quad -\infty<x<\infty,\ \theta>0. $$ \ The standard Cauchy has the Cauchy$(\theta = 1, \eta = 0)$ density. (Note that the standard Cauchy density is equal to the Student t density with one degree of freedom.)

Solution:

Because the standard Cauchy density is equal to the Student t density with one degree of freedom, try the Student t distribution with one degree of freedom for the proposal distribution.\

set.seed(2029)
m <- 3000 ## Sample size
x <- numeric(m)
u <- runif(m)
theta=1
eta=0
x[1] <- rnorm(1)
k <- 0

## The following function evaluates the Cauchy(0,1) density.
f <- function(x, theta=1, eta=0){
  out <- theta /(pi * (theta^2+(x-eta)^2))
  return(out)
}
for(i in 2:m){
  xt <- x[i-1]
  y <- rnorm(1,mean=xt) ## At each transition, the candidate point Y is generated from N(mean=xt)
  R <- f(y)*dnorm(xt,mean=y)/(f(xt)*dnorm(y,mean=xt)) ## accept probability
  if(u[i] <= R){
    x[i] <- y
  }else{
    x[i] <- xt
    k <- k+1 ## y is rejected
  }
}
print(k/m)
I <- 1001:m ## discard the first 1000 values of the chain (burn-in)
plot(I,x[I],type="l")
hist(x[I], probability=TRUE,breaks=100)
plot.x <- seq(min(x[I]),max(x[I]),0.01)
lines(plot.x,f(plot.x))
#compare the deciles
observations <- quantile(x[I],seq(0.1,0.9,0.1))
expectations <- qcauchy(seq(0.1,0.9,0.1))
decile <- data.frame(observations,expectations)
decile

\ In this question, approximately 24% of the candidate points are rejected, so the chain is reasonably efficient.\ We can see that the deciles of the generated observations are very close to the deciles of the standard Cauchy distribution. \ The following QQ plot and histogram also support this conclusion.\

qqnorm(x[I])
qqline(x[I])
sequence <- seq(0,1,0.01)
standard.Cauchy <- qcauchy(sequence)
standard.Cauchy <- standard.Cauchy[(standard.Cauchy> -Inf) 
                                 & (standard.Cauchy< Inf)]
hist(x[I], freq = FALSE)
lines(standard.Cauchy, dcauchy(standard.Cauchy), lty = 2)

Question 9.8

This example appears in [40]. Consider the bivariate density\ $$ f(x,y)\propto \binom{n}{x}y^{x+a-1}(1-y)^{n-x+b-1},x=0,1,\cdots,n,0\leq y\leq 1. $$ \ It can be shown (see e.g. [23]) that for fixed a, b, n, the conditional distributions are Binomial(n, y) and Beta(x + a, n − x + b). Use the Gibbs sampler to generate a chain with target joint density f(x, y).\

Solution:

The bivariate density in this question implies (suppressing the overall dependence on n, a, b) that\ $$ X|Y=y \sim \mathrm{Binomial}(n,y),\qquad Y|X=x \sim \mathrm{Beta}(x+a,n-x+b). $$ \ The Gibbs sampler proceeds as follows:\ (1) Start from some initial value $(X^{(0)}, Y^{(0)})$.\ (2) Generate $X^{(i)}$ from Binomial$(n, Y^{(i-1)})$.\ (3) Generate $Y^{(i)}$ from Beta$(X^{(i)} + a, n - X^{(i)} + b)$.\

set.seed(1212)
f = function(x,y){
choose(n,x) * y^(x + a - 1) * (1 - y)^(n - x + b -1)
} ## x = 0,1,2,...,n and 0 < y < 1
N <- 10000   ## sample size
## fixed a,b and n
a <- 2
b <- 4
n <- 16
X=NULL ## Vector of X values
X[1]=5 ## initial value of X given to be 5
Y=NULL ## Vector of Y values
Y[1]=0.33 ## initial value of Y given to be 0.33
Z=matrix(0,N,2) ## the chain, a bivariate sample
Z[1,]=c(X[1],Y[1]) ## store the initial state
## generate the chain
for( i in 2:N){
   X[i]=rbinom(1,n,Y[i-1])
   Y[i]=rbeta(1,(X[i]+a),(n-X[i]+b))
   Z[i,]=c(X[i],Y[i])
}
sample_X <- Z[(1:N),1]## The first column of Z gives X values of the sample
sample_Y <- Z[(1:N),2]## The second column of Z gives Y values of the sample

sample<-round(c(mean(sample_X),mean(sample_Y)),4) ## sample means of X and Y from the chain
true<-round(c(n*Y[1],(X[1]+a)/(X[1]+a+n-X[1]+b)),4) ## conditional means E[X|Y=Y[1]] and E[Y|X=X[1]], used as reference values
compare <- data.frame(sample,true)
compare

\ Conclusion: The sample means are close to the reference values, so the Gibbs sampler generates a chain with the target joint density f(x, y).\

(Continue)Question 9.3

For Question 9.3, use the Gelman-Rubin method to monitor convergence of the chain, and run the chain until it converges approximately to the target distribution according to $\hat{R}<1.2$.\

Solution:

# Gelman-Rubin statistic
Gelman.Rubin <- function(psi) {
  # psi[i,j] is the statistic psi(X[i,1:j])
  # for chain in i-th row of X
  psi <- as.matrix(psi)
  n <- ncol(psi)
  k <- nrow(psi)

  psi.means <- rowMeans(psi)     #row means
  B <- n * var(psi.means)        #between variance est.
  psi.w <- apply(psi, 1, "var")  #within variances
  W <- mean(psi.w)               #within est.
  v.hat <- W*(n-1)/n + (B/n)     #upper variance est.
  r.hat <- v.hat / W             #G-R statistic
  return(r.hat)
}

## generates a Metropolis chain for Cauchy(0,1)
## with Normal(mean=xt) proposal distribution
## and starting value X1
Standard_Cauchy_Chain <- function(N, X1){
  X <- numeric(N)
  X[1] <- X1    # initial value
  for(i in 2:N){
    Xt <- X[i-1]
    Y <- rnorm(1,0,abs(Xt))
    r <- dt(Y,1)*dnorm(Xt,0,abs(Y))/dt(Xt,1)/dnorm(Y,0,abs(Xt))
    U <- runif(1)
    if(r > 1) r <- 1
    if(U <= r) X[i] <- Y
    else X[i] <- Xt
  }
  return(X)
}
k <- 4      
N <- 8000
b <- 1000     #burn-in length
X1 <- c(0.1,0.2,0.1,0.2)    #initial value

set.seed(12345)
X <- matrix(0, nrow = k, ncol = N)
for(i in 1:k){
  X[i,] <- Standard_Cauchy_Chain(N, X1[i])
}

# compute diagnostic statistics
psi <- t(apply(X, 1, cumsum))
for (i in 1:nrow(psi))
  psi[i,] <- psi[i,] / (1:ncol(psi))
print(Gelman.Rubin(psi))
for (i in 1:k)
  if(i==1){
    plot((b+1):N,psi[i, (b+1):N],ylim=c(-1,1), type="l",
         xlab='Index', ylab=bquote(phi))
  }else{
      lines(psi[i, (b+1):N], col=i)
  }
#plot the sequence of R-hat statistics
rhat <- rep(0, N)
for (j in (b+1):N)
  rhat[j] <- Gelman.Rubin(psi[,1:j])
plot(rhat[(b+1):N], type="l", xlab="", ylab="R")
abline(h=1.2, lty=2)

\ From the results, we can see that $\hat{R}=1.127139<1.2$, which indicates that the chains have approximately converged to the target distribution once the chain length reaches roughly 1000.

(Continue)Question 9.8

For Question 9.8, use the Gelman-Rubin method to monitor convergence of the chain, and run the chain until it converges approximately to the target distribution according to $\hat{R}<1.2$.\

Solution:

# Gelman-Rubin statistic
Gelman.Rubin <- function(psi) {
  # psi[i,j] is the statistic psi(X[i,1:j])
  # for chain in i-th row of X
  psi <- as.matrix(psi)
  n <- ncol(psi)
  k <- nrow(psi)

  psi.means <- rowMeans(psi)     #row means
  B <- n * var(psi.means)        #between variance est.
  psi.w <- apply(psi, 1, "var")  #within variances
  W <- mean(psi.w)               #within est.
  v.hat <- W*(n-1)/n + (B/n)     #upper variance est.
  r.hat <- v.hat / W             #G-R statistic
  return(r.hat)
}
Bivariate.Gibbs <- function(N, X1){
  a <- b <- 1
  X <- matrix(0, N, 2)
  X[1,] <- X1    # initial value
  for(i in 2:N){
    X2 <-  X[i-1, 2]
    X[i,1] <- rbinom(1,25,X2)
    X1 <- X[i,1]
    X[i,2] <- rbeta(1,X1+a,25-X1+b)
  }
  return(X)
}
k <- 4          
N <- 8000 
b <- 1000    #burn-in length
X1 <- cbind(c(2,7,10,15),runif(4)) # initial value

set.seed(12345)
X <- matrix(0, nrow=k, ncol=N)
Y <- matrix(0, nrow=k, ncol=N)
for (i in 1:k){
  BG <- Bivariate.Gibbs(N, X1[i,])
  X[i, ] <- BG[,1]
  Y[i, ] <- BG[,2]
}
# Consider X

#compute diagnostic statistics
psi <- t(apply(X, 1, cumsum))
for (i in 1:nrow(psi))
  psi[i,] <- psi[i,] / (1:ncol(psi))

#plot the sequence of R-hat statistics
rhat <- rep(0, N)
for (j in (b+1):N)
  rhat[j] <- Gelman.Rubin(psi[,1:j])
plot(rhat[(b+1):N], type="l", xlab="", ylab="R")
abline(h=1.2, lty=2)

# Consider Y

#compute diagnostic statistics
psi <- t(apply(Y, 1, cumsum))
for (i in 1:nrow(psi))
  psi[i,] <- psi[i,] / (1:ncol(psi))

#plot the sequence of R-hat statistics
rhat <- rep(0, N)
for (j in (b+1):N)
  rhat[j] <- Gelman.Rubin(psi[,1:j])
plot(rhat[(b+1):N], type="l", xlab="", ylab="R")
abline(h=1.2, lty=2)

\ From the results, we can see that the chains have approximately converged to the target joint density f(x, y) once the chain length reaches roughly 4000.\

title: "A-21010-2021-11-18"

Exercise 11.3 and 11.5(pages 353-354,Statistical Computing with R)

E-M algorithm

Suppose $T_1,\cdots,T_n$ are i.i.d samples drawn from the exponential distribution with expectation $\lambda$.Those values greater than $\tau$ are not observed due to the right censorship, so that the observed values are $Y_i=T_iI(T_i\leq\tau)+\tau I(T_i>\tau),i=1,\cdots,n$.Suppose $\tau=1$ and the observed $Y_i$ values are as follows: $$ 0.54,0.48,0.33,0.43,1.00,1.00,0.91,1.00,0.21,0.85 $$ \ Use the E-M algorithm to estimate $\lambda$,compare your result with the observed data MLE(note:Y_i follows a mixture distribution).

Question 11.3

(a) Write a function to compute the $k^{th}$ term in\ $$ \sum_{k=0}^\infty\frac{(-1)^k}{k!2^k}\frac{||a||^{2k+2}}{(2k+1)(2k+2)}\frac{\Gamma(\frac{d+1}{2})\Gamma(k+\frac{3}{2})}{\Gamma(k+\frac{d}{2}+1)} $$ \ where $d ≥ 1$ is an integer, $a$ is a vector in $R^d$, and $||\cdot||$ denotes the Euclidean norm. Perform the arithmetic so that the coefficients can be computed for (almost) arbitrarily large k and d. (This sum converges for all $a\in R^d$).\ (b) Modify the function so that it computes and returns the sum.\ (c) Evaluate the sum when $a = (1, 2)^T$.\

Solution:

## (a)Write a function to compute the k^{th} term
f=function(a,k){
d=length(a)
((-1)^k/(factorial(k)*2^k))*(sqrt(sum(a*a))^(2*k+2)/((2*k+1)*(2*k+2)))*((gamma((d+1)/2)*gamma(k+1.5))/gamma(k+d/2+1))
}
## (b)Modify the function so that it computes and returns the sum
a <- c(1,2) ## value of a
p <- 0
q <- 0
k <- 1
while(is.nan(p)==FALSE){ 
## stop the calculation when f starts returning NaN for the k-th term
q[k]=f(a,k-1) ## the k-th term of the series
p=q[k]
k=k+1 ## increase k by 1
}
## (c)Evaluate the sum when $a = (1, 2)^T$
sum(q[-length(q)])
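The factorial() and gamma() calls in f() overflow for large k or d, which is what eventually produces the NaN that stops the loop above. A numerically safer sketch (my own addition; the helper name f_log is chosen for illustration) computes each term on the log scale with lgamma():

## Compute the k-th term on the log scale to avoid overflow for large k and d.
f_log <- function(a, k){
  d <- length(a)
  log.norm <- log(sum(a*a))                ## log(||a||^2)
  log.term <- (k+1)*log.norm - lgamma(k+1) - k*log(2) - log((2*k+1)*(2*k+2)) +
              lgamma((d+1)/2) + lgamma(k+1.5) - lgamma(k+d/2+1)
  (-1)^k * exp(log.term)                   ## restore the alternating sign
}
## The series sum for a = (1,2)^T, truncated at 200 terms (plenty for convergence).
sum(sapply(0:200, function(k) f_log(c(1,2), k)))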

\

Question 11.5

Write a function to solve the equation\ $$ \begin{gathered} \frac{2 \Gamma\left(\frac{k}{2}\right)}{\sqrt{\pi(k-1)} \Gamma\left(\frac{k-1}{2}\right)} \int_{0}^{c_{k-1}}\left(1+\frac{u^{2}}{k-1}\right)^{-k / 2} d u =\frac{2 \Gamma\left(\frac{k+1}{2}\right)}{\sqrt{\pi k} \Gamma\left(\frac{k}{2}\right)} \int_{0}^{c_{k}}\left(1+\frac{u^{2}}{k}\right)^{-(k+1) / 2} d u \end{gathered} $$ \ for $a$,where\ $$ c_{k}=\sqrt{\frac{a^{2} k}{k+1-a^{2}}} $$ \ Compare the solutions with the points $A(k)$ in the Exercise 11.4.\

Solution:

For Exercise 11.4\

# find the intervals in which the roots fall  
f <- function(a, k) 1-pt(sqrt(a^2*k/(k+1-a^2)), k)
k <- c(4, 25, 100, 1000)
par(mar=c(1,1,1,1))
for (i in k) {
  g <- function(x) f(x, i-1)-f(x, i)
    a <- seq(0, sqrt(i), .1)
  plot(a, g(a), type = 'l', main = paste('k=',i))
  abline(h = 0)
}

\ From the figures plotted when k = 4, 25, 100 and 1000, we can conclude that the roots will fall in the interval (1, 2).\

f <- function(a, k) 1-pt(sqrt(a^2*k/(k+1-a^2)), k)
k <- c(4:25, 100, 500, 1000)
Ak <- numeric(length(k))
i <- 1
for (j in k) {
  g <- function(x) f(x, j-1)-f(x, j)
  Ak[i] <- uniroot(g, lower = 1, upper = 2)$root
  i <- i+1
}
knitr::kable(cbind(k,Ak))

\ For this Question 11.5\

f <- function(k) 2/sqrt(pi*k)*exp(lgamma((k+1)/2)-lgamma(k/2))
ck <- function(a, k) sqrt(a^2*k/(k+1-a^2))
g <- function(u, k) (1+u^2/k)^(-(k+1)/2)
k <- c(4:25, 100, 500, 1000)
root <- numeric(length(k))
i <- 1
for (j in k) {
  ff <- function(a) f(j)*integrate(function(u) {g(u, j)}, 0, ck(a, j))$value-f(j-1)*integrate(function(u) {g(u, j-1)}, 0, ck(a, j-1))$value 
  root[i] <- uniroot(ff, lower = 1, upper = 2)$root
  i <- i+1
}
knitr::kable(cbind(k, Ak, root))

\ Conclusion: In the above table, the values in the 'Ak' column are the results from Exercise 11.4 and those in the 'root' column are the results for Question 11.5. The results agree, which is expected since the two equations are in fact the same.\

E-M algorithm

Suppose $T_1,\cdots,T_n$ are i.i.d samples drawn from the exponential distribution with expectation $\lambda$.Those values greater than $\tau$ are not observed due to the right censorship, so that the observed values are $Y_i=T_iI(T_i\leq\tau)+\tau I(T_i>\tau),i=1,\cdots,n$.Suppose $\tau=1$ and the observed $Y_i$ values are as follows: $$ 0.54,0.48,0.33,0.43,1.00,1.00,0.91,1.00,0.21,0.85 $$ \ Use the E-M algorithm to estimate $\lambda$,compare your result with the observed data MLE(note:Y_i follows a mixture distribution).

Solution:

Observed data: $y_1,\cdots,y_n$; let $n_0$ be the number of uncensored observations, so the right-censored observations satisfy $y_i=\tau$, $i=n_0+1,\cdots,n$.\ Complete data: $t_1,\cdots,t_n$.\ (1) The observed data MLE:\ $$ L_{observed}(\lambda)=P(T>\tau)^{n-n_0}\cdot\lambda^{-n_0}e^{-\frac{1}{\lambda}\sum_{i=1}^{n_0}y_i} =e^{-\frac{\tau(n-n_0)}{\lambda}}\cdot\lambda^{-n_0}e^{-\frac{1}{\lambda}\sum_{i=1}^{n_0}y_i}, $$ $$ \ln L_{observed}(\lambda)=-\frac{\tau(n-n_0)}{\lambda}-n_0\ln\lambda-\frac{1}{\lambda}\sum_{i=1}^{n_0}y_i, $$ $$ \frac{\partial \ln L_{observed}(\lambda)}{\partial \lambda}=\frac{\tau(n-n_0)}{\lambda^2}-\frac{n_0}{\lambda}+\frac{1}{\lambda^2}\sum_{i=1}^{n_0}y_i=0 \quad\Rightarrow\quad \hat\lambda=\frac{\tau(n-n_0)+\sum_{i=1}^{n_0}y_i}{n_0}. $$ \ So with $\tau=1$, $n=10$, $n_0=7$ and uncensored observations $0.54,0.48,0.33,0.43,0.91,0.21,0.85$, the observed data MLE $\hat\lambda$ is:\

tau=1
n=10
n_0=7
y=c(0.54,0.48,0.33,0.43,0.91,0.21,0.85)
lambda.hat <- (tau*(n-n_0)+sum(y))/n_0
lambda.hat

(2) Use the E-M algorithm to estimate $\lambda$.\ $$ L_{complete}(\lambda)=\lambda^{-n}e^{-\frac{1}{\lambda}\sum_{i=1}^{n}t_i},\qquad \ln L_{complete}(\lambda)=-n\ln\lambda-\frac{1}{\lambda}\sum_{i=1}^{n}t_i. $$ \ (i) E step (censored observations have $y_i=\tau$, and by the memoryless property $E[t_i|y_i=\tau,\lambda^{(i)}]=\tau+\lambda^{(i)}$):\ $$ Q(\lambda,\lambda^{(i)})=E[\ln L_{complete}(\lambda)|Y,\lambda^{(i)}] =-n\ln\lambda-\frac{1}{\lambda}\sum_{i=1}^{n}E[t_i|y_i,\lambda^{(i)}] =-n\ln\lambda-\frac{1}{\lambda}\Big[\sum_{i=1}^{n_0}y_i+\sum_{i=n_0+1}^{n}(y_i+\lambda^{(i)})\Big] =-n\ln\lambda-\frac{1}{\lambda}\sum_{i=1}^{n}y_i-\frac{(n-n_0)\lambda^{(i)}}{\lambda}. $$ \ (ii) M step (maximize $Q(\lambda,\lambda^{(i)})$ over $\lambda$):\ $$ \frac{\partial Q(\lambda,\lambda^{(i)})}{\partial \lambda}=-\frac{n}{\lambda}+\frac{\sum_{i=1}^{n}y_i+(n-n_0)\lambda^{(i)}}{\lambda^2}=0 \quad\Rightarrow\quad \lambda^{(i+1)}=\frac{1}{n}\sum_{i=1}^{n}y_i+\frac{n-n_0}{n}\lambda^{(i)}. $$ \ The iteration is a contraction: $$ |\lambda^{(i+1)}-\lambda^{(i)}|=\frac{n-n_0}{n}|\lambda^{(i)}-\lambda^{(i-1)}| \quad\Rightarrow\quad |\lambda^{(i+1)}-\lambda^{(i)}|=\Big(\frac{n-n_0}{n}\Big)^{i}|\lambda^{(1)}-\lambda^{(0)}|\rightarrow 0, $$ so $\lambda^{(i)}$ converges, and the limit $\tilde{\lambda}$ satisfies $$ \tilde{\lambda}=\frac{1}{n}\sum_{i=1}^{n}y_i+\frac{n-n_0}{n}\tilde{\lambda} \quad\Rightarrow\quad \tilde{\lambda}=\frac{1}{n_0}\sum_{i=1}^{n}y_i. $$ \ Comparing the E-M limit with the observed data MLE: since the censored observations have $y_i=\tau$, $$ \tilde{\lambda}=\frac{1}{n_0}\sum_{i=1}^{n}y_i=\frac{\tau(n-n_0)+\sum_{i=1}^{n_0}y_i}{n_0}=\hat\lambda, $$ \ so with $\tau=1$, $n=10$, $n_0=7$, the E-M algorithm converges to the observed data MLE.\

Use the EM algorithm to compute the MLE of $\lambda$ numerically.

y <- c(0.54, 0.48, 0.33, 0.43, 1.00, 1.00, 0.91, 1.00, 0.21, 0.85)
lambda0 <- 0.5

EMlambda <- function(y, lambda){
  N <- 10000
  tol <- .Machine$double.eps^0.5
  n <- length(y)
  nt <- sum(y==1) ## number of right-censored observations (y_i = tau = 1)
  total <- sum(y)

  for(i in 1:N){
    lambda_old <- lambda
    lambda <- (total + nt*lambda_old) / n ## EM update: lambda^(i+1) = (sum(y) + (n-n0)*lambda^(i))/n

    if(abs(lambda - lambda_old)/lambda_old <tol) break
  }

  return(lambda)
}

EMlambda(y, lambda0)

Using the EM algorithm we get the estimate of $\lambda$ equal to 0.9642857, which matches the observed data MLE computed above.\

title: "A-21010-2021-11-25"

Exercise 1 and 5 (page 204, Advanced R)

Exercise 1 and 7 (page 214, Advanced R)

Exercise 1 (page 204, Advanced R)

Why are the following two invocations of lapply() equivalent?\

trims <- c(0, 0.1, 0.2, 0.5)
x <- rcauchy(100)
lapply(trims, function(trim) mean(x, trim = trim))
lapply(trims, mean, x = x)

Solution:

trims <- c(0, 0.1, 0.2, 0.5)
x <- rcauchy(100)
lapply(trims, function(trim) mean(x, trim = trim))
lapply(trims, mean, x = x)

Reason why the two invocations of lapply() are equivalent:\ In the first statement, each element of trims is explicitly supplied to mean()'s trim argument through the anonymous function.\ In the second statement this happens via positional matching: because mean()'s x argument is supplied by name in lapply()'s third argument, each element of trims is matched positionally to the next formal argument, trim.
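A quick check of the positional matching (my own illustration, not part of the exercise): with x supplied by name, a positional 0.2 is matched to mean()'s trim argument.

identical(mean(x, trim = 0.2), mean(0.2, x = x)) ## TRUE: 0.2 is matched to trim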

Exercise 5 (page 204, Advanced R)

For each model in the previous two exercises, extract $R^2$ using the function below.

rsq <- function(mod) summary(mod)$r.squared

Solution:

rsq <- function(mod) summary(mod)$r.squared
# 3
data(mtcars)
formulas <- list(
  mpg ~ disp,
  mpg ~ I(1 / disp),
  mpg ~ disp + wt,
  mpg ~ I(1 / disp) + wt
)
models <- lapply(formulas, function(formula) lm(formula, mtcars))
lapply(models, function(model) rsq(model))
# 4
data("mtcars")
bootstraps <- lapply(1:10, function(i) {
  rows <- sample(1:nrow(mtcars), rep = TRUE)
  mtcars[rows, ]
})

(1)

# for loop
models_loop <- list()
for(i in 1:10){
  model <- lm(mpg ~ disp, data = bootstraps[[i]])
  models_loop <- c(models_loop, list(model))
}

(2)

# R^2
lapply(models_loop, function(model) rsq(model))

(1)

# lapply
models_lapply <- lapply(bootstraps, function(data) lm(mpg ~ disp, data))

(2)

# R^2
lapply(models_lapply, function(model) rsq(model))

\

Exercise 1 (page 214, Advanced R)

Use vapply() to:\ a) Compute the standard deviation of every column in a numeric data frame.\ b) Compute the standard deviation of every numeric column in a mixed data frame. (Hint: you’ll need to use vapply() twice.)\

Solution:

(a) Every column in a numeric data frame.\

cars ##  a numeric data.frame.
vapply(cars, sd, numeric(1))

\ (b) Every numeric column in a mixed data frame.\

## a mixed data.frame"mtcars".
vapply(mtcars[vapply(mtcars, is.numeric, logical(1))],sd, numeric(1)) ## use vapply() twice.

Exercise 7 (page 214, Advanced R)

Implement mcsapply(), a multicore version of sapply(). Can you implement mcvapply(), a parallel version of vapply()?Why or why not?\

Solution:\ (1) Notice that "mc.cores" > 1 is not supported on Windows, so a fork-based mcsapply() only runs in parallel on Unix-alike systems; an alternative for Windows is parallel::parSapply.\ (2) For mcvapply(): unlike sapply(), vapply() is dispatched to .Internal(vapply(X, FUN, FUN.VALUE, USE.NAMES)), i.e. to internal code built into the R interpreter, so it is not easy to implement mcvapply() as a parallel version of vapply().\ (3) mcsapply(), the multicore version of sapply(), can be noticeably faster than sapply() when FUN is expensive. But I do not think mcvapply() (a parallel version of vapply()) can be implemented with the parallel package, because vapply() returns an atomic (non-list) result with its type checking done inside the internal code.\ A sketch of mcsapply() is given below.
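A minimal sketch of mcsapply(), assuming a Unix-alike system where parallel::mclapply() can fork (on Windows, mc.cores must be 1, as noted above); the helper slow_square is only for the usage example:

library(parallel)
## mcsapply: apply FUN over X with mclapply, then simplify the list result the way sapply does.
mcsapply <- function(X, FUN, ..., mc.cores = 2L) {
  res <- mclapply(X, FUN, ..., mc.cores = mc.cores) ## forked workers; needs a Unix-alike OS for mc.cores > 1
  simplify2array(res) ## same simplification step used by sapply
}
## usage example with a deliberately slow function
slow_square <- function(i) { Sys.sleep(0.1); i^2 }
mcsapply(1:8, slow_square, mc.cores = 2L)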

title: "A-21010-2021-12-02"

Question

(1) Write an Rcpp function for Exercise 9.8 (page 278, Statistical Computing with R).\ (2) Compare the corresponding generated random numbers with the pure R implementation using the function "qqplot".\ (3) Compare the computation time of the two functions with the function "microbenchmark".\ (4) Comment on your results.\

Exercise 9.8: This example appears in [40]. Consider the bivariate density\ $$ f(x,y)\propto \binom{n}{x}y^{x+a-1}(1-y)^{n-x+b-1},x=0,1,\cdots,n,0\leq y\leq 1. $$ \ It can be shown (see e.g. [23]) that for fixed a, b, n, the conditional distributions are Binomial(n, y) and Beta(x + a, n − x + b). Use the Gibbs sampler to generate a chain with target joint density f(x, y).\

Solution:\

(1)Write an Rcpp function for Exercise 9.8 (page 278, Statistical Computing with R).\

library(Rcpp)
cppFunction('
NumericMatrix gibbsc(int N,int burn,int a,int b,int n){
  NumericMatrix X(N,2);
  NumericMatrix XX(N-burn,2);
  float x,y;
  X(0,0)=1;
  X(0,1)=0.5;
  for(int i=1;i<N;i++){
    y=X(i-1,1);
    X(i,0)=rbinom(1,n,y)[0];
    x=X(i,0);
    X(i,1)=rbeta(1,x+a,n-x+b)[0];
  }
  for(int k=0;k<N-burn;k++){
    XX(k,0)=X(k+burn,0);
    XX(k,1)=X(k+burn,1);
  }
  return XX;
}
')
res1<-gibbsc(5000,1000,2,2,10)

\ (2)Compare the corresponding generated random numbers with pure R language using the function “qqplot”.\

gibbsr<-function(N,burn,a=2,b=2,n=10){
  X <- matrix(0, N, 2) #the chain, a bivariate sample
  ###### generate the chain #####
  X[1, ] <- c(1, .5) #initialize
  for (i in 2:N) {
    y <- X[i-1, 2]
    X[i, 1] <- rbinom(1, n, y)
    x <- X[i, 1]
    X[i, 2] <- rbeta(1,x+a,n-x+b)
    }
  b <- burn + 1
  return(X[b:N, ])
}
res2<-gibbsr(5000,1000,2,2,10)

\

a <- ppoints(50)
Q1x <- quantile(res1[,1], a)   # quantiles of x from the Rcpp sampler
Q2x <- quantile(res2[,1], a)   # quantiles of x from the R sampler
Q1y <- quantile(res1[,2], a)   # quantiles of y from the Rcpp sampler
Q2y <- quantile(res2[,2], a)   # quantiles of y from the R sampler
qqplot(Q1x, Q2x, main="",
        xlab="Rcpp Quantiles of x", ylab="R Quantiles of x")
    abline(c(0,0),c(1,1),col='blue',lwd=2)
qqplot(Q1y, Q2y, main="",
        xlab="Rcpp Quantiles of y", ylab="R Quantiles of y")
    abline(c(0,0),c(1,1),col='blue',lwd=2)

\ (3) Compare the computation time of the two functions with the function "microbenchmark".\

library(microbenchmark)
ts <- microbenchmark(Rcpp=gibbsc(5000,1000,2,2,10),R=gibbsr(5000,1000,2,2,10))
summary(ts)[,c(1,3,5,6)]

\ (4) Comments on the results.\

Comments:\ (i) The QQ plots are close to the line y = x, which means the samples generated by the two samplers are quite similar.\ (ii) The computation time of the pure R sampler is much larger than that of the Rcpp sampler, so Rcpp gives much higher-performance computation. \

All the answers to my homework for the "Statistical Computing" class are given above.


