Overview

SC19057 is a simple R package that mainly contains two R functions, knnde and indTest. knnde estimates a density by the k-nearest-neighbour method, and indTest performs an independence test on two vectors. The former depends on the R package "FNN" and the latter is based on the R function "kedd". The package also contains rwMC, a random walk Metropolis sampler written with Rcpp for generating the standard Laplace distribution. All of my homework answers are shown below.

K-nearest neighbor density estimation

library(SC19057)
data("law",package = "bootstrap")
x <- as.matrix(law)
xrange <- seq(from = 500, to = 700, by = 10) 
yrange <- seq(from = 1, to = 4, by = 0.05)
k <- 4 
fit <- knnde(x, k, xrange, yrange)
persp(xrange,yrange,fit,col='orange')

The plot shows the joint distribution of the students' LSAT and GPA scores, and the two variables do not appear to be independent.
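
As a quick numerical check of this impression, we can look at the sample correlation of the two columns of the law data loaded above:

cor(law$LSAT, law$GPA)   # sample correlation between LSAT and GPA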

Independence test based on kernel density estimation and bootstrap method

library(SC19057)
data("law")
x <- law$LSAT
y <- law$GPA
B <- 20
indTest(x,y,B,0.05)

The independence test also suggests that the two vectors are not independent.

A random walk Metropolis sampler using Rcpp

This example illustrates the performance of the Rcpp function.

library(SC19057)
library(Rcpp)
out <- rwMC(2.0,20.0,2000)
plot(out[,1], type="l", main='rwMC(sigma=2)', ylab="x",xlab = "")
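
As the Rcpp source later in this document shows, the second column of the matrix returned by rwMC records whether each proposal was accepted (1) or not (0), so the acceptance rate of the chain can be approximated directly:

mean(out[, 2])   # empirical acceptance rate of the chain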

Homework

2019.9.20

An Example of Dateset Women

This is a dataset from base R. The dataset women contains the average heights and weights of American women.

library(knitr)
kable(women)

From the table we can easily see that weight is positively correlated with height. To confirm this impression and quantify the relationship, we fit a linear regression to the data.
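
Before fitting the model, the sample correlation itself gives a quick check of this impression:

cor(women$height, women$weight)   # sample correlation between height and weight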

myfit<-lm(weight~height,women)
summary(myfit)
b<-round(coefficients(myfit),4)

From the summary of the fit we know $R^2= 0.991$, which indicates a good fit. We can also read off the fitted relationship between weight and height. The equation is:

$$ \widehat{weight} = b_1 + b_2 \times height $$
where $b_1$ and $b_2$ are the rounded intercept and slope stored in b above (approximately $-87.52$ and $3.45$).
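
As a quick check, the fitted equation can be evaluated at a hypothetical height of, say, 65 inches and compared with predict():

b[[1]] + b[[2]] * 65                               # plug a height of 65 into the fitted equation
predict(myfit, newdata = data.frame(height = 65))  # same prediction from the fitted model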

Meanwhile, the fitted line is shown in the plot below. From the plot we can easily see that most points lie close to the fitted line.

myfit<-lm(weight~height,women)
plot(women$height,women$weight,xlab = "HEIGHT",ylab = "WEIGHT",main = "Regression of weight on height",col.main="red")
abline(myfit,col=("blue"))

Another Example with Random Values

I generated 150 random values from the standard normal distribution, named x. I also sampled the same number of random values from t distributions with 50, 100 and 150 degrees of freedom, named y1, y2 and y3. The histograms below show the distributions of these values.

x<-rnorm(150)
y1<-rt(150,50)
y2<-rt(150,100)
y3<-rt(150,150)
layout(matrix(c(1,1,1,2,3,4),2,3,byrow = TRUE))
hist(x,main="The Distribution Of X",col.main="red",col="blue",xlim =c(-3,3))
hist(y1,main="The Distribution Of y1",col.main="orange",col="green",xlim =c(-3,3))
hist(y2,main="The Distribution Of y2",col.main="orange",col="green",xlim =c(-3,3))
hist(y3,main="The Distribution Of y3",col.main="orange",col="green",xlim =c(-3,3))

2019.9.29

1.

Suppose $t\sim Gamma(1,\frac{1}{2\sigma^2})$ (shape 1, rate $\frac{1}{2\sigma^2}$), $\sigma>0$.

$\because F(x)=P(X\le x)=P(\sqrt{t}\le x)=P(t\le x^2)$

$\therefore f(x)=F'(x)=f_T (x^2)\,2x=\frac{1}{2\sigma^2}e^{-\frac{x^2}{2\sigma^2}}\,2x = \frac{x}{\sigma^2} e^{-\frac{x^2}{2\sigma^2}}$

Since $x=\sqrt{t}\sim Rayleigh(\sigma)$ when $t\sim Gamma(1,\frac{1}{2\sigma^2})$, we can first generate a random $t$ from $Gamma(1,\frac{1}{2\sigma^2})$ and then deliver $x=\sqrt{t}$. The $x$ obtained in this way is a draw from $Rayleigh(\sigma)$. Looking at the histograms of x, we can see that the mode of the generated samples is close to the theoretical mode $\sigma$.
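
To see why the theoretical mode is $\sigma$, set the derivative of the Rayleigh density to zero:

$$\frac{d}{dx}\left[\frac{x}{\sigma^2}e^{-x^2/(2\sigma^2)}\right]=\frac{1}{\sigma^2}e^{-x^2/(2\sigma^2)}\left(1-\frac{x^2}{\sigma^2}\right)=0 \;\Longrightarrow\; x=\sigma.$$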

set.seed(1234)
x <- function(a){
  rate <- 1/(2*a^2)                        # rate of Gamma(shape = 1, rate = 1/(2a^2))
  t <- rgamma(1e4, shape = 1, rate = rate)
  x <- sqrt(t)                             # x = sqrt(t) ~ Rayleigh(a)
  return(x)
} # generate a random Rayleigh(sigma) sample

#layout(matrix(c(1:4),2,2,byrow = TRUE))
hist(x(0.1),prob=TRUE)
hist(x(0.5),prob=TRUE)
hist(x(2),prob=TRUE)
hist(x(5),prob=TRUE)

2.

The problem is to generate a sample from a mixture distribution, so we can use the transformation method. The algorithm is: a. generate a value $u \sim U(0,1)$; b. if $u\le p_1$, deliver a random $x$ from $N(0,1)$; otherwise deliver $x$ from $N(3,1)$.

From the histograms of the mixture samples under different values of $p_1$, we can see a bimodal shape when $p_1=0.45$ or $0.6$. Thus we can guess that values of $p_1$ in roughly $[0.45,0.6]$ produce a bimodal mixture; a rough numerical check is sketched after the code below.

set.seed(1234)
n <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n,3,1)
x <- function(p1){
  u <- runif(n)
  k <- as.integer(u < p1)        # vector of 0's and 1's
  x <- k * x1 + (1 - k) * x2     # N(0,1) with probability p1, N(3,1) otherwise
  return(x)
} # generate a mixture sample
layout(matrix(c(1:6),2,3,byrow = TRUE))
hist(x(0.75), prob=TRUE);lines(density(x(0.75)),col="red")
hist(x(0.15), prob=TRUE);lines(density(x(0.15)),col="red")
hist(x(0.3), prob=TRUE);lines(density(x(0.3)),col="red")
hist(x(0.45), prob=TRUE);lines(density(x(0.45)),col="red")
hist(x(0.6), prob=TRUE);lines(density(x(0.6)),col="red")
hist(x(0.9), prob=TRUE);lines(density(x(0.9)),col="red")
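
As a rough numerical check of the guess above (reusing the generator x() and the samples x1, x2 defined in this chunk), we can count the local maxima of a kernel density estimate over a grid of p1 values; two interior maxima indicate a bimodal mixture:

p.grid <- seq(0.05, 0.95, by = 0.05)
n.modes <- sapply(p.grid, function(p1) {
  d <- density(x(p1))
  sum(diff(sign(diff(d$y))) == -2)   # number of interior local maxima of the density estimate
})
rbind(p1 = p.grid, modes = n.modes)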

3.

A Wishart sample can be written as $M=X^TX$, where the rows of the $n\times d$ matrix $X$ are i.i.d. $N_d(0,\Sigma)$ vectors. Thus, when $\Sigma$ is given, our algorithm is:

a. Generate an $n \times d$ matrix $Z$ containing $nd$ random $N(0,1)$ variates.
b. Compute a factorization $\Sigma = Q^T Q$.
c. Apply the transformation $X = ZQ$.
d. Deliver the $n\times d$ matrix $X$.
e. Deliver $M=X^TX$.
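
A minimal sketch of steps a–e above (the direct transformation method), using the same 2x2 sigma that appears later in this section; the helper name rwishart_direct is only illustrative, and the chunk that follows uses Bartlett's decomposition instead:

# one Wishart_d(n, Sigma) draw via the direct transformation method (steps a-e)
rwishart_direct <- function(n, sigma) {
  d <- nrow(sigma)
  Z <- matrix(rnorm(n * d), n, d)   # a. n x d matrix of N(0,1) variates
  Q <- chol(sigma)                  # b. Sigma = Q^T Q (chol returns the upper-triangular Q)
  X <- Z %*% Q                      # c./d. rows of X are N_d(0, Sigma)
  t(X) %*% X                        # e. M = X^T X
}
rwishart_direct(5, matrix(c(1, 0.7, 0.7, 1), 2, 2))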

In my implementation I first constructed a positive definite symmetric matrix sigma, and then used it to generate a Wishart sample via Bartlett's decomposition.

set.seed(12345)
wishart <- function(n,sigma){
  d <- nrow(sigma)
  T <- matrix(0, d, d)                                   # lower-triangular Bartlett factor
  for(i in 1:d){
    for(j in 1:i){
        if(i>j)  T[i,j] <- rnorm(1)                      # below-diagonal entries: N(0,1)
        if(i==j) T[i,j] <- sqrt(rchisq(1, df = n-i+1))   # diagonal entries: sqrt of chi-square
    }
  }
  L <- t(chol(sigma))                                    # Sigma = L L^T, L lower triangular
  M <- (L%*%T)%*%t(L%*%T)                                # M = L T T^T L^T ~ Wishart_d(n, Sigma)
  M
}

n <- 5
sigma <- matrix(c(1,0.7,0.7,1), nrow = 2, ncol = 2)
d <- 2
M <- wishart(n,sigma)
M

2019.10.11

5.1

We can transform the integral as follows: $$\int_0^{\pi/3} \sin t\ dt=\frac\pi 3 \int_0^{\pi/3} \frac{\sin t}{\pi /3}dt=\frac\pi 3E(\sin T), \quad T\sim U(0,\tfrac\pi 3).$$ Therefore, we can draw m samples from this uniform distribution, compute the mean of $f(t)=\sin t$ over the samples, and multiply by $\pi/3$; this is a Monte Carlo estimate of the integral. We can also compute the exact value of the integral: $$\int_0^{\pi/3} \sin t\ dt=\big[-\cos t\big]_0^{\pi/3} =\cos(0)-\cos(\pi/3)=0.5$$

set.seed(123)
m<-1e4
t<-runif(m,0,pi/3)
int.m<-mean(sin(t))*pi/3
int.e<-cos(0)-cos(pi/3)
cat("Monte Carlo estimate=",int.m)
cat("\nexact value=",int.e)

5.10

We can transform the integral as follows: $$\int_0^1{\frac {e^{-x}}{1+x^2}}\ dx=E\left(\frac {e^{-X}}{1+X^2}\right)=E(g(X)),\quad g(x)=\frac {e^{-x}}{1+x^2},\ X\sim U(0,1).$$ Then we can define a pair of antithetic variables $Y_i$ and $Y_i^-$: $$Y_i=g(x_i),\qquad Y_i^-=g(1-x_i),\qquad x_i\sim U(0,1).$$
Since $g(x)$ is monotone, $Y_i$ and $Y_i^-$ are negatively correlated. Therefore, we can estimate the integral as $$\int_0^1{\frac {e^{-x}}{1+x^2}}\ dx=E(g(X)) \approx \frac1 m \sum_{i=1}^{m}{\frac{Y_i+Y_i^-}2},$$ where m is the number of uniform variates $x_i$ we generate.

From the code below, the Monte Carlo estimate of the integral is 0.524, and the approximate reduction in variance, as a percentage of the variance without variance reduction, is 98.2%.

set.seed(123)
m<-1000
x<-runif(m)
y1<-exp(-x)/(1+x^2)
y2<-exp(x-1)/(1+(1-x)^2)
est<-mean((y1+y2)/2)
cat("estimation=",est)
cat("\nvar=",var((y1+y2)/2))
cat("\napproximate reduction in variance=",(var(y1)-var((y1+y2)/2))/var(y1))

5.15

The task is to compare estimates of the integral $\int_0^1{\frac {e^{-x}}{1+x^2}}\ dx$ obtained with different methods. In my answer to 5.10 I used antithetic variables to estimate the integral, obtaining an estimate of $0.5244$ with variance $0.0011$.

I then use stratified importance sampling to estimate the integral (as the code below shows). From the result, the estimate is $0.5249$ with variance $5.3739\times10^{-8}$.

Comparing the estimates and their variances, both estimates are close to the exact value, but the second one (stratified importance sampling) has a much smaller variance. Therefore, the second method performs better in this case.

More Discussion:

From the code below we can also see that the estimate with importance sampling alone is $0.5249$ with variance $1.008\times10^{-6}$. Although this variance is larger than that of the stratified importance sampling estimate, it is still much smaller than the antithetic-variable one. Thus we can conclude that, in this case, stratified importance sampling performs best, importance sampling alone is second best, and antithetic variables performs worst.

#to calculate the new estimator with stratified importance sampling
M <- 10000 #number of replicates
k <- 5 #number of strata
r <- M / k #replicates per stratum
N <- 50 #number of times to repeat the estimation
T1 <- numeric(k)
estimates <- matrix(0, N, 2)
fg<-function(u){
  x <- - log(1 - u * (1 - exp(-1)))
  g<-exp(-x - log(1+x^2)) * (x > 0) * (x < 1)
  fg <- g/ (exp(-x) / (1 - exp(-1)))
  return(fg)
}
for (i in 1:N) {
  u <- runif(M)
  estimates[i, 1] <- mean(fg(u)) #the estimate with importance sampling (f3) only
  for (j in 1:k){
    y <- runif(M/k, (j-1)/k, j/k) #stratum j for stratified importance sampling (f3)
    T1[j] <- mean(fg(y))
  }
  estimates[i, 2] <- mean(T1)    #combine the k stratum estimates
}
apply(estimates, 2, mean)
apply(estimates, 2, var)

2019.10.18

6.5

Since $\frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t(n-1)$ for normal samples, we can construct a t-interval for the mean. The 95% t-interval is $\bar{X}\pm t_{1-\alpha/2}(n-1)\,S/\sqrt{n}$.
Because this confidence interval (c.i.) relies on the assumption that the sample comes from a normal distribution, its coverage probability need not equal $(1-\alpha)100\%$ for non-normal data; the simulation below illustrates this.
In this question the sample x is drawn from $\chi^2(2)$. The estimated coverage probability of the t-interval is 0.908, which is lower than 0.95. Compared with Example 6.4, the t-interval is more robust: the coverage probability of the chi-square interval for the variance is 0.812, much smaller than 0.908.

set.seed(123)
m<-1000
c<-numeric(m)
for (i in 1:m) {
  n<-20
  x<-rchisq(n,2)
  alpha<-0.05
  u<-mean(x)+sqrt(var(x)/n)*qt(1-alpha/2,n-1)
  l<-mean(x)+sqrt(var(x)/n)*qt(alpha/2,n-1)
  if(2<=u&&2>=l) c[i]=1
}
cat("coverage probability(t-interval)=",sum(c)/m)

#The example using Chi-square interval
n <- 20
alpha <- .05
UCL <- replicate(1000, expr = {
x <- rchisq(n, df = 2)
(n-1) * var(x) / qchisq(alpha, df = n-1)
} )
cat("\ncoverage probability(Chi-square interval)=",mean(UCL > 4))

6.6

To estimate the 0.025, 0.05, 0.95 and 0.975 quantiles of the sample skewness under normality by a Monte Carlo experiment, I first drew n observations from the standard normal distribution, giving one estimate of the skewness. Repeating this procedure m times gives m skewness estimates, from which the required quantiles can be estimated.

To get the standard errors of these quantile estimates, I first evaluated the density of the skewness at each quantile using the normal approximation. I then computed the variance of each quantile with formula (2.14), $Var(\hat x_q)=\frac{q(1-q)}{n f(x_q)^2}$, and finally took square roots to obtain the standard errors.

To compare the estimated quantiles with the quantiles of the large-sample approximation, I drew m values from $N(0,6/n)$ and used their sample quantiles as the large-sample approximation.

The standard errors of the estimated quantiles, the estimated quantiles themselves, and the quantiles of the large-sample approximation are printed at the end of the R code below. The estimated quantiles differ somewhat from the normal ones. With a few more trials I found that the difference decreases as n increases, so I suspect it is mainly caused by the finite sample size n; a quick check of this is sketched after the code.

m<-1000
n<-1000
p<-c(0.025, 0.05, 0.95, 0.975)
sk0 <- function(x) {
  #computes the sample skewness coeff.
  xbar <- mean(x)
  m3 <- mean((x - xbar)^3)
  m2 <- mean((x - xbar)^2)
  return( m3 / m2^1.5 )
}
sk<-numeric(m)
for (i in 1:m) {
  x<-rnorm(n)
  sk[i]<-sk0(x)
}
q<-quantile(sk,probs =p)#quantiles of the estimator

#to compute the s.e of the estimator's quantiles
var.q<-p*(1-p)/(n*dnorm(q,0,sqrt(6/n))^2)
se.q<-sqrt(var.q/m)

sk.n<-rnorm(m,0,sqrt(6/n))
q.n<-quantile(sk.n,probs =p)#the quantiles of the large sample approximation 

print('s.e. of estimators:');se.q
print('estimated quantiles:');q
print('quantiles of the large sample approximation:');q.n
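
To check the effect of the sample size mentioned above, the same comparison can be wrapped in a function of n (a quick sketch reusing sk0 and p from the chunk above, but comparing against the exact normal quantiles rather than simulated ones):

# quick check: the gap between estimated and large-sample quantiles shrinks as n grows
quant.diff <- function(n, m = 1000) {
  sk <- replicate(m, sk0(rnorm(n)))
  quantile(sk, probs = p) - qnorm(p, 0, sqrt(6 / n))  # estimated minus approximate quantiles
}
sapply(c(100, 1000, 5000), quant.diff)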

2019.11.1

Question

6.7

Estimate the power of the skewness test of normality against symmetric $Beta(α, α)$ distributions and comment on the results. Are the results different for heavy-tailed symmetric alternatives such as t(ν)?

6A

Use Monte Carlo simulation to investigate whether the empirical Type I error rate of the t-test is approximately equal to the nominal significance level $\alpha$ when the sampled population is non-normal. The t-test is robust to mild departures from normality. Discuss the simulation results for the cases where the sampled population is (i) $\chi^2(1)$, (ii) Uniform(0,2), and (iii) Exponential(rate=1). In each case, test $H_0 : \mu = \mu_0 \ vs \ H_1 : \mu \ne \mu_0$, where $\mu_0$ is the mean of $\chi^2(1)$, Uniform(0,2), and Exponential(1), respectively.

Slide Discussion

Suppose we obtain the powers for two methods under a particular simulation setting with 10000 experiments: say, 0.651 for one method and 0.676 for the other. Can we say the powers are different at the 0.05 level?
(1) What is the corresponding hypothesis test problem?
(2) What test should we use? Z-test, two-sample t-test, paired-t test or McNemar test?
(3) What information is needed to test your hypothesis?

Answer

6.7

This question asks for the power of the skewness test of normality against beta alternatives. There are two ways to construct the alternatives: one is to use samples from skewed beta distributions (by varying a shape parameter), and the other is to use samples from a contaminated beta distribution (which is also non-symmetric). I computed the power with both kinds of samples below.

From plot 1 we can see that when $a$ is far from 5, so that the beta distribution is not symmetric and its skewness is away from 0, the power of the test is larger.
From plot 2 we can see that when $\epsilon$ is 0 or 1, which corresponds to an uncontaminated beta distribution, the power is close to 0; the power reaches its peak when $\epsilon$ is about 0.9.
Comparing plots 2 and 3, which both use samples from contaminated distributions, the power changes with the contamination model. In plot 3 the power is still large when $\epsilon$ equals 0, and it gradually decreases as $\epsilon$ becomes larger.

set.seed(123)
sk <- function(x) {
  #computes the sample skewness coeff.
  xbar <- mean(x)
  m3 <- mean((x - xbar)^3)
  m2 <- mean((x - xbar)^2)
  return( m3 / m2^1.5 )
}
alpha <- .1
n <- 30
m <- 2500
epsilon <- seq(0,1, .01)
N <- length(epsilon)
par(cex=0.7,cex.lab=1.5,cex.axis=1,cex.main=2)
cv <- qnorm(1-alpha/2, 0, sqrt(6*(n-2) / ((n+1)*(n+3))))#critical value for the skewness test

# against beta distribution
a<-seq(0,5,0.05)
pw<-numeric(N)
for (j in 1:N) { #for each a
  sktests <- numeric(m)
  for (i in 1:m) { #for each replicate
    x <- rbeta(n,a[j],5)
    sktests[i] <- as.integer(abs(sk(x)) >= cv)
  }
  pw[j] <- mean(sktests)
}
plot(a, pw, type = "b",xlab = bquote(a), ylim = c(0,1),main='against beta distribution')

#contaminated beta distribution
pw.b<-numeric(N)
for (j in 1:N) { #for each epsilon
  e <- epsilon[j]
  sktests <- numeric(m)
  for (i in 1:m) { #for each replicate
    s <- sample(c(4, 1000), replace = TRUE,
                  size = n, prob = c(1-e, e))
    x <- rbeta(n,s,s)
    sktests[i] <- as.integer(abs(sk(x)) >= cv)
  }
  pw.b[j] <- mean(sktests)
}
plot(epsilon, pw.b, type = "b",
     xlab = bquote(epsilon), ylim = c(0,1),main = 'against contaminated beta distribution')

#against t(v) 
pw.t <- numeric(N)
for (j in 1:N) { #for each epsilon
  e <- epsilon[j]
  sktests <- numeric(m)
  for (i in 1:m) { #for each replicate
    d.f <- sample(c(1, 5), replace = TRUE,
                  size = n, prob = c(1-e, e))
    x <- rt(n,d.f)
    sktests[i] <- as.integer(abs(sk(x)) >= cv)
  }
  pw.t[j] <- mean(sktests)
}
plot(epsilon, pw.t, type = 'p',pch='$',col='goldenrod',
     xlab = bquote(epsilon), ylim = c(0,1),main = 'against t-distribution')

6A

Using the Monte Carlo method, we find that when the sample is drawn from the uniform distribution the empirical Type I error rate is close to the nominal 0.05, while for samples from $\chi^2(1)$ and Exponential(1) it is lower than 0.05.

set.seed(123)
n <- 20
alpha <- .05
m <- 10000 #number of replicates
p <- numeric(m) #storage for p-values

#Chisquare
for (j in 1:m) {
  x <- rchisq(n, 1)
  ttest <- t.test(x, alternative = "greater", mu = 1)
  p[j] <- ttest$p.value
}
p.chi <- mean(p < alpha)
se.chi <- sqrt(p.chi * (1 - p.chi) / m)
print(c(p.chi, se.chi))


#Uniform
for (j in 1:m) {
  x <- runif(n, 0,2)
  ttest <- t.test(x, alternative = "greater", mu = 1)
  p[j] <- ttest$p.value
}
p.unif <- mean(p < alpha)
se.unif <- sqrt(p.unif * (1 - p.unif) / m)
print(c(p.unif, se.unif))

#exp
for (j in 1:m) {
  x <- rexp(n, 1)
  ttest <- t.test(x, alternative = "greater", mu = 1)
  p[j] <- ttest$p.value
}
p.exp <- mean(p < alpha)
se.exp <- sqrt(p.exp * (1 - p.exp) / m)
print(c(p.exp, se.exp))

Discussion

#an example
n<-10000
x<-rbinom(n,n,p=0.651)
y<-rbinom(n,n,p=0.676)
t.test(x,y,'two.sided',mu=0)

(1) The hypothesis test asks whether the powers of the two methods are equal, that is, $H_0:p_1=p_2$ vs $H_1:p_1\ne p_2$, where $p_i$ is the power of the $i$-th method.
(2) The McNemar test (see the sketch below).
(3) The powers of the two methods and the number of experiments are needed.
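
If the per-experiment rejection indicators of the two methods had been recorded, the McNemar test chosen in (2) could be carried out as in the following sketch (the 0/1 vectors r1 and r2 are purely hypothetical, simulated here only to illustrate the call):

# hypothetical illustration: r1, r2 are 0/1 rejection indicators of the two methods
# on the same 10000 experiments (simulated, not real paired results)
set.seed(1)
r1 <- rbinom(10000, 1, 0.651)
r2 <- rbinom(10000, 1, 0.676)
mcnemar.test(table(r1, r2))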

2019.11.8

Question

7.6

Efron and Tibshirani discuss the scor (bootstrap) test score data on 88 students who took examinations in five subjects. The first two tests (mechanics, vectors) were closed book and the last three tests (algebra, analysis, statistics) were open book. Each row of the data frame is a set of scores $ (x_{i1},...x_{i5})$ for the $i^{th}$ student. Use a panel display to display the scatter plots for each pair of test scores. Compare the plot with the sample correlation matrix. Obtain bootstrap estimates of the standard errors for each of the following estimates: $\hat\rho_{12}=\hat\rho(mec,vec),\hat \rho_{34}=\hat\rho(alg,ana), \hat\rho_{35}=\hat\rho(alg,sta), \hat\rho_{45}=\hat\rho(ana,sta)$

7.B

Repeat Project 7.A for the sample skewness statistic. Compare the coverage rates for normal populations (skewness 0) and $χ^2(5)$ distributions (positive skewness)

Answer

7.6

Comparing the scatter plots with the sample correlation matrix, we find that the two are consistent. Both indicate that ana and alg, and alg and vec, are relatively strongly correlated, while there is little correlation between sta and mec.
Using the boot function, we obtain the required bootstrap estimates of the standard errors:
$se(\hat\rho_{12}) = 0.07675172,\ se(\hat\rho_{34}) = 0.04844542,\ se(\hat\rho_{35}) = 0.06027028,\ se(\hat\rho_{45}) = 0.06668117$.

set.seed(123)
library(bootstrap)
library(lattice)
splom(scor)
cor(scor)
x<-as.matrix(scor)

library(boot) #for boot function

rho<-function(k,l){
r <- function(x, i) {
  #correlation of columns k and l
  cor(x[i,k], x[i,l])
}
theta <- boot(data = scor, statistic = r, R = 2000)
return(theta)
}
rho(1,2)
rho(3,4)
rho(3,5)
rho(4,5)

7.B

From the results below, when x is drawn from N(0,1) the coverage probabilities of the standard normal, basic and percentile bootstrap confidence intervals are 0.894, 0.889 and 0.994 respectively.
The corresponding proportions of times the intervals miss on the left are 0.057, 0.058 and 0.004.
The corresponding proportions of times the intervals miss on the right are 0.049, 0.053 and 0.002.

When x is drawn from $\chi^2(5)$, the coverage probabilities of the standard normal, basic and percentile bootstrap confidence intervals are 0.75, 0.711 and 0.73 respectively.
The corresponding proportions of misses on the left are 0.245, 0.246 and 0.27.
The corresponding proportions of misses on the right are 0.005, 0.043 and 0.

library(boot)
library(moments)
set.seed(1)
mu1<-0
mu2<-(8*gamma(11/2)/gamma(5/2)-3*5*2*5-5^3 )/(2*5)^(3/2)# compute the skewness of chisquare with freedom 5
n<-1e1
m<-1e3
boot.sk <- function(x,i) skewness(x[i])
ci.n<-ci.b<-ci.p<-matrix(NA,m,2)

#sample from N(0,1)
for(i in 1:m){
  x<-rnorm(n)
  de <- boot(data=x,statistic=boot.sk, R = 1000)
  ci <- boot.ci(de,type=c("norm","basic","perc"))
  ci.n[i,]<-ci$norm[2:3]
  ci.b[i,]<-ci$basic[4:5]
  ci.p[i,]<-ci$percent[4:5]
}
cat('norm =',mean(ci.n[,1]<=mu1 & ci.n[,2]>=mu1),
    '\nbasic =',mean(ci.b[,1]<=mu1 & ci.b[,2]>=mu1),
    '\nperc =',mean(ci.p[,1]<=mu1 & ci.p[,2]>=mu1))
cat('\nnorm(right) =',mean(ci.n[,2]<=mu1),
     '\nbasic(right) =',mean(ci.b[,2]<=mu1 ),
     '\nperc(right) =',mean(ci.p[,2]<=mu1))
cat('\nnorm(left) =',mean(ci.n[,1]>=mu1),
     '\nbasic(left) =',mean(ci.b[,1]>=mu1),
     '\nperc(left)=',mean( ci.p[,1]>=mu1))

#sample from chi-square(5)
for(i in 1:m){
  x<-rchisq(n,5)
  de <- boot(data=x,statistic=boot.sk, R = 1000)
  ci <- boot.ci(de,type=c("norm","basic","perc"))
  ci.n[i,]<-ci$norm[2:3]
  ci.b[i,]<-ci$basic[4:5]
  ci.p[i,]<-ci$percent[4:5]
}
cat('\nnorm =',mean(ci.n[,1]<=mu2 & ci.n[,2]>=mu2),
    '\nbasic =',mean(ci.b[,1]<=mu2 & ci.b[,2]>=mu2),
    '\nperc =',mean(ci.p[,1]<=mu2 & ci.p[,2]>=mu2))
cat('\nnorm(right) =',mean(ci.n[,2]<=mu2),
     '\nbasic(right) =',mean(ci.b[,2]<=mu2 ),
     '\nperc(right) =',mean(ci.p[,2]<=mu2 ))
cat('\nnorm(left) =',mean(ci.n[,1]>=mu2),
     '\nbasic(left) =',mean(ci.b[,1]>=mu2),
     '\nperc(left)=',mean( ci.p[,1]>=mu2))

2019.11.15

Question

7.8

Refer to Exercise 7.7. Obtain the jackknife estimates of bias and standard error of $\hat{θ}$.

7.10

In Example 7.18, leave-one-out (n-fold) cross validation was used to select the best fitting model. Repeat the analysis replacing the Log-Log model with a cubic polynomial model. Which of the four models is selected by the cross validation procedure? Which model is selected according to maximum adjusted $R^2$?

Answer

7.8

To compute the jackknife estimates of the bias and standard error of $\hat{\theta}$, I first computed the leave-one-out estimates. Then I used the jackknife formulas $$\widehat{bias}_{jack} = (n-1)\left( \bar{\hat{\theta}}_{(\cdot)} - \hat{\theta} \right),$$ $$\widehat{se}_{jack} =\sqrt{\frac{n-1}{n}\sum_{i=1}^{n}\left( \hat{\theta}_{(-i)}-\bar{\hat{\theta}}_{(\cdot)} \right)^2},$$ where $\hat{\theta}_{(-i)}$ is the estimate with the $i$-th observation left out and $\bar{\hat{\theta}}_{(\cdot)}$ is the mean of the leave-one-out estimates.
This gives the jackknife bias and standard error; the code and results are shown below.

library(bootstrap)
data(scor)
n<-nrow(scor)
c<-cov(scor)
lambda<-eigen(c)$values
theta.hat<-lambda[1]/sum(lambda)

# to compute jackknife estimates
theta.j<-numeric(n)
for (i in 1:n) {
  c.j<-cov(scor[-i,])
  l.j<-eigen(c.j)$values
  theta.j[i]<-l.j[1]/sum(l.j)
}
bias.j <- (n-1)*(mean(theta.j)-theta.hat)
se.j <- sqrt((n-1)*mean((theta.j-mean(theta.j))^2))
round(c(theta.hat=theta.hat,bias.jack=bias.j, se.jack=se.j),3)

7.10

In this question, the four models are:
1. Linear: $Y = \beta_0 + \beta_1X + \varepsilon$
2. Quadratic: $Y = \beta_0 + \beta_1X + \beta_2X^2 + \varepsilon$
3. Exponential: $\log(Y) = \log(\beta_0) + \beta_1X + \varepsilon$
4. Cubic: $Y = \beta_0 + \beta_1X + \beta_2X^2 + \beta_3X^3 + \varepsilon$

Comparing the prediction-error estimates obtained from n-fold cross validation, the quadratic model, which has the smallest prediction error, is the best fit for the data.
According to the maximum adjusted $R^2$ criterion, the quadratic model, whose adjusted $R^2$ is the largest, is also selected.

library(knitr)
library(DAAG)
attach(ironslag)
n <- length(magnetic) #in DAAG ironslag

# cross validation
e1 <- e2 <- e3 <- e4 <- numeric(n)
for (k in 1:n) {
  y <- magnetic[-k]
  x <- chemical[-k]

  J1 <- lm(y ~ x)
  yhat1 <- J1$coef[1] + J1$coef[2]*chemical[k]
  e1[k] <- magnetic[k] - yhat1

  J2 <- lm(y ~ x + I(x^2))
  yhat2 <- J2$coef[1] + J2$coef[2]*chemical[k] +J2$coef[3]*chemical[k]^2
  e2[k] <- magnetic[k] - yhat2

  J3 <- lm(log(y) ~ x)
  logyhat3 <- J3$coef[1] + J3$coef[2]*chemical[k]
  yhat3 <- exp(logyhat3)
  e3[k] <- magnetic[k] - yhat3

  J4 <- lm(y ~ x + I(x^2)+I(x^3))
  yhat4 <- J4$coef[1] + J4$coef[2]*chemical[k] + J4$coef[3]*chemical[k]^2 + J4$coef[4]*chemical[k]^3
  e4[k] <- magnetic[k] - yhat4
}


L1 <- lm(magnetic~chemical)
L2 <- lm(magnetic~chemical+I(chemical^2))
L3 <- lm(log(magnetic)~chemical)
L4 <- lm(magnetic~chemical+I(chemical^2)+I(chemical^3))
# to get adjusted R^2
r1 <- summary(L1)$adj.r.squared
r2 <- summary(L2)$adj.r.squared
r3 <- summary(L3)$adj.r.squared
r4 <- summary(L4)$adj.r.squared

results <- data.frame(Linear=c(mean(e1^2),r1),Quadratic=c(mean(e2^2),r2),
           Exponential=c(mean(e3^2),r3),Cubic=c(mean(e4^2),r4),
           row.names=c("prediction error","adjusted R-square"))
kable(results)

2019.11.22

Question

8.3

The Count 5 test for equal variances in Section 6.4 is based on the maximum number of extreme points. Example 6.15 shows that the Count 5 criterion is not applicable for unequal sample sizes. Implement a permutation test for equal variance based on the maximum number of extreme points that applies when sample sizes are not necessarily equal.

slides 31

Power comparison (distance correlation test versus ball covariance test)
Model 1: Y = X/4 + e
Model 2: Y = X/4 × e
$X ∼ N(0_2, I_2), e ∼ N(0_2, I_2)$, X and e are independent.

Answer

8.3

With the permutation test the empirical Type I error rate is 0.039, while without permutation it is 0.318. Therefore the permutation version of the Count Five test for equal variance, based on the maximum number of extreme points, performs better when the sample sizes are not necessarily equal.

count5test <- function(x, y) {
  X <- x - mean(x)
  Y <- y - mean(y)
  outx <- sum(X > max(Y)) + sum(X < min(Y))
  outy <- sum(Y > max(X)) + sum(Y < min(X))
  # return 1 (reject) or 0 (do not reject H0)
  return(as.integer(max(c(outx, outy)) > 5))
}

set.seed(123)
# Count Five test permutation
count5test_permutation <- function(z) {
n <- length(z)
x <- z[1:(n/2)]
y <- z[-(1:(n/2))]
X <- x - mean(x)
Y <- y - mean(y)
outx <- sum(X > max(Y)) + sum(X < min(Y)) 
outy <- sum(Y > max(X)) + sum(Y < min(X))
# return 1 (reject) or 0 (do not reject H0) 
return(as.integer(max(c(outx, outy)) > 5))
}

permutation <- function(z,R) {
  n <- length(z)
  out <- numeric(R)
  for (r in 1: R){
      p <- sample(1:n ,n ,replace = FALSE)
      out[r] <- count5test_permutation(z[p])
  }
  sum(out)/R
}              

n1 <- 20
n2 <- 50
mu1 <- mu2 <- 0
sigma1 <- sigma2 <- 1
m <- 1e3

alphahat1 <- mean(replicate(m, expr={
x <- rnorm(n1, mu1, sigma1)
y <- rnorm(n2, mu2, sigma2)
x <- x - mean(x) #centered by sample mean
y <- y - mean(y)
count5test(x, y)
}))
alphahat2 <- mean(replicate(m, expr={
x <- rnorm(n1, mu1, sigma1)
y <- rnorm(n2, mu2, sigma2)
x <- x - mean(x) #centered by sample mean 
y <- y - mean(y)
z <- c(x,y)
permutation(z,1000) 
})<0.05)

round(c(count5test=alphahat1,count5test_permutation=alphahat2),4)

slides 31

From the code below we can see that in model 1 the powers of the two methods are similar, while in model 2 the ball covariance test seems better.

#some function used in distance correlation
dCov <- function(x, y) {
  x <- as.matrix(x); y <- as.matrix(y)
  n <- nrow(x); m <- nrow(y)
  if (n != m || n < 2) stop("Sample sizes must agree")
  if (! (all(is.finite(c(x, y)))))
    stop("Data contains missing or infinite values")
  Akl <- function(x) {
    d <- as.matrix(dist(x))
    m <- rowMeans(d); M <- mean(d)
    a <- sweep(d, 1, m); b <- sweep(a, 2, m)
    b + M
  }
  A <- Akl(x); B <- Akl(y)
  sqrt(mean(A * B))
}
ndCov2 <- function(z, ix, dims) {
  #dims contains dimensions of x and y
  p <- dims[1]
  q <- dims[2]
  d <- p + q
  x <- z[ , 1:p] #leave x as is
  y <- z[ix, -(1:p)] #permute rows of y
  return(nrow(z) * dCov(x, y)^2)
}

library(boot)
library(Ball)
set.seed(123)
n<-20
p.cor1<-p.cor2<-p.ball1<-p.ball2<-numeric(n)

  for(i in 1:n){
    x <- e <- matrix(rnorm(2*i), , 2)
    # for model 1
    y1<-x/4+e
    z1 <- cbind(x,y1)
    boot.obj1 <- boot(data = z1, statistic = ndCov2, R = 999,
                      sim = "permutation", dims = c(2, 2))
    # permutation: resampling without replacement
    tb1 <- c(boot.obj1$t0, boot.obj1$t)
    p.cor1[i]<- 1-mean(tb1>=tb1[1])
    #ball
    p.ball1 [i]<- 1-bcov.test(z1[,1:2],z1[,3:4],R=999)$p.value

    # for model 2
    y2<-x/4*e
    z2 <- cbind(x,y2)
    boot.obj2 <- boot(data = z2, statistic = ndCov2, R = 999,
                      sim = "permutation", dims = c(2, 2))
    # permutation: resampling without replacement
    tb2 <- c(boot.obj2$t0, boot.obj2$t)
    p.cor2[i]<- 1-mean(tb2>=tb2[1])
    #ball
    p.ball2[i] <- 1-bcov.test(z2[,1:2],z2[,3:4],R=999)$p.value
  }

plot(1:20,p.cor1,'l',xlab = 'n',ylab = 'power',main = 'model 1')
lines(1:20,p.ball1,'l',col='red')
legend(10,0.4,c('distance correlation test','ball covariance test'),col=c('black','red'),
       lty = c(1,1),cex = 0.7 )

plot(1:20,p.cor2,'l',xlab = 'n',ylab = 'power',main = 'model 2')
lines(1:20,p.ball2,'l',col='red')
legend(10,0.4,c('distance correlation test','ball covariance test'),col=c('black','red'),
       lty = c(1,1),cex = 0.7 )

2019.11.29

Question

9.4

Implement a random walk Metropolis sampler for generating the standard Laplace distribution (see Exercise 3.2). For the increment, simulate from a normal distribution. Compare the chains generated when different variances are used for the proposal distribution. Also, compute the acceptance rates of each chain.

Answer

To implement a random walk Metropolis sampler for generating the standard Laplace distribution, we first need its density function (defined as "dl" in the R code). We then choose a starting value x0 and run the sampler to obtain the Markov chain and its acceptance rate (this is what the function "rw.Metropolis" does). From the R code below we obtain the random walk Metropolis chains generated by proposal distributions with different variances, together with the corresponding acceptance rates. From the results, the chain with $\sigma=1$ looks best.

#define the density function
dl<-function(x) 1/2*exp(-abs(x))
rw.Metropolis <- function(n, sigma, x0, N) {
  x <- numeric(N)
  x[1] <- x0
  u <- runif(N)
  k <- 0
  for (i in 2:N) {
    y <- rnorm(1, x[i-1], sigma)
    if (u[i] <= (dl(y) / dl(x[i-1])))
      x[i] <- y else {
        x[i] <- x[i-1]
        k <- k + 1
      }
  }
  return(list(x=x, k=k))
}

n <- 4 
N <- 3000
sigma <- c(.01, .1, 1, 10)
x0 <- 5
index <- 0:N


#sigma=0.01
rw1 <- rw.Metropolis(n, sigma[1], x0, N)
y1 <- rw1$x[index]
plot(y1, type="l", main=expression(sigma==0.01), ylab="x",xlab = "")
p1<-1-rw1$k/N

#sigma=0.1
rw2 <- rw.Metropolis(n, sigma[2], x0, N)
y2 <- rw2$x[index]
plot(y2, type="l", main=expression(sigma==0.1), ylab="x",xlab = "")
p2<-1-rw2$k/N

#sigma=1
rw3 <- rw.Metropolis(n, sigma[3], x0, N)
y3 <- rw3$x[index]
plot(y3, type="l", main=expression(sigma==1), ylab="x",xlab = "")
p3<-1-rw3$k/N

#sigma=10
rw4 <- rw.Metropolis(n, sigma[4], x0, N)
y4 <- rw4$x[index]
plot(y4, type="l", main=expression(sigma==10), ylab="x",xlab = "")
p4<-1-rw4$k/N

rate<-data.frame(p1,p2,p3,p4,row.names ='acceptance rate')
names(rate)<-c(0.01,0.1,1,10)
knitr::kable(rate)

2019.12.6

Question

11.1

The natural logarithm and exponential functions are inverses of each other,so that mathematically $\log(\exp x) = \exp(\log x) = x$. Show by example that this property does not hold exactly in computer arithmetic. Does the identity hold with near equality? (See all.equal.)

11.5

Write a function to solve the equation: $$ \frac{2 \Gamma\left(\frac{k}{2}\right)}{\sqrt{\pi(k-1)} \Gamma\left(\frac{k-1}{2}\right)} \int_{0}^{c_{k-1}}\left(1+\frac{u^{2}}{k-1}\right)^{-k / 2} d u =\frac{2 \Gamma\left(\frac{k+1}{2}\right)}{\sqrt{\pi k} \Gamma\left(\frac{k}{2}\right)} \int_{0}^{c_{k}}\left(1+\frac{u^{2}}{k}\right)^{-(k+1) / 2} d u $$ for $a$, where $c_{k}=\sqrt{\frac{a^{2} k}{k+1-a^{2}}}$
Compare the solutions with the points $A(k)$ in Exercise 11.4

Slides

Let the three alleles be A, B, and O with allele frequencies p, q, and r. The 6 genotype frequencies under HWE and complete counts are as follows.

$$\begin{array}{|c|c|c|c|c|c|c|c|}\hline \text{Genotype} & AA & BB & OO & AO & BO & AB & \text{Sum} \\ \hline \text{Frequency} & p^{2} & q^{2} & r^{2} & 2pr & 2qr & 2pq & 1 \\ \hline \text{Count} & n_{AA} & n_{BB} & n_{OO} & n_{AO} & n_{BO} & n_{AB} & n \\ \hline \end{array}$$

Observed data: $n_{A·} = n_{AA} + n_{AO} = 28$ (A-type), $n_{B·} = n_{BB} + n_{BO} = 24$(B-type), $n_{OO} = 41$ (O-type), $n_{AB} = 70$ (AB-type).
(1) Use the EM algorithm to solve for the MLE of $p$ and $q$ (consider the missing data $n_{AA}$ and $n_{BB}$).
(2) Show that the log-maximum likelihood values in the M-steps are increasing via a line plot.

Answer

11.1

Taking the value x = 3 as an example, we can see that $\log(\exp(x))$, $\exp(\log(x))$ and $x$ are not all exactly equal in computer arithmetic, but the R function "all.equal" shows that the identity holds with near equality.

x<-3 #given a random value
x1<-log(exp(x))
x2<-exp(log(x))
cat("x-log(exp(x))=",x-x1)
cat("\nx-exp(log(x))=",x-x2)
cat("\nlog(exp(x))-exp(log(x))=",x1-x2)
cat("\n")
all.equal(x,x1)
all.equal(x,x2)
all.equal(x1,x2)

11.5

To solve the equation for $a$ is to find the $a$ that makes $$ \frac{2 \Gamma\left(\frac{k}{2}\right)}{\sqrt{\pi(k-1)} \Gamma\left(\frac{k-1}{2}\right)} \int_{0}^{c_{k-1}}\left(1+\frac{u^{2}}{k-1}\right)^{-k / 2} d u -\frac{2 \Gamma\left(\frac{k+1}{2}\right)}{\sqrt{\pi k} \Gamma\left(\frac{k}{2}\right)} \int_{0}^{c_{k}}\left(1+\frac{u^{2}}{k}\right)^{-(k+1) / 2} d u=0. $$ Thus we can use the R function "uniroot" to find such an $a$ for each $k$. The code is as follows.
Compared with the results of Exercise 11.4, the root for each value of $k$ is the same as the corresponding intersection point $A(k)$.

#in the 11.5 case
fc<-function(a,k) sqrt(a^2*k/(1+k-a^2))
f1<-function(u) (1+u^2/(k-1))^(-k/2)
f2<-function(u) (1+u^2/k)^(-(k+1)/2)
f <- function(a){
  2*gamma(k/2)/(sqrt(pi*(k-1))*gamma((k-1)/2))*integrate(f1,0,fc(a,k-1))$value-2*gamma((k+1)/2)/(sqrt(pi*k)*gamma(k/2))*integrate(f2,0,fc(a,k))$value
}   

K <- c(4:25,100)
n <- length(K)
R <- numeric(n)
for (i in 1:n) {
  k <- K[i]
  R[i] <- uniroot(f,c(0.5,sqrt(k)/2+1))$root
}
print(R)

# the 11.4 case
S<-function(a){
  pt(sqrt((k-1)*a^2/(k-a^2)),k-1)-pt(sqrt((k*a^2)/(k+1-a^2)),k)
}
K1 <- c(4:25,100,500,1000)
n <- length(K1)
inter <- numeric(n)
for (i in 1:n) {
  k <- K1[i]
  inter[i] <- uniroot(S,c(0.5,sqrt(k)/2+1))$root
}
print(inter)

Slides

(1) In this question we first write down the log-likelihood of the complete data evaluated at the current estimates (denoted "E" in my code); then, by iterating and updating $p$ and $q$, we obtain estimates of $p$ and $q$ that converge. With my code, the estimates are $p = 0.3273442$ and $q = 0.3104267$.
(2) The line plot shows that the log-maximum likelihood values in the M-steps are increasing.

library(rootSolve)
N <- 500
na.<-28
nb.<-24
noo<-41
nab<-70
n<-sum(na.,nb.,noo,nab)
#initial estimations for p and q
p.hat<-uniroot(function(p) 2*p-p^2-(na.+nab)/n,c(0,1))$root
q.hat<-uniroot(function(q) 2*q-q^2-(nb.+nab)/n,c(0,1))$root
pa <- c(p.hat,q.hat)

# initial estimates 
tol <- .Machine$double.eps^0.5
pa.old <- pa+1
E <- numeric(N)
for(j in 1:N){
  E[j]<-2*pa[1]*na.*log(pa[1])/(2-pa[1]-2*pa[2])+2*pa[2]*nb.*log(pa[2])/(2-pa[2]-2*pa[1])+2*noo*log(1-pa[1]-pa[2])+na.*(2-2*pa[1]-2*pa[2])*log(2*pa[1]*(1-pa[1]-pa[2]))/(2-pa[1]-2*pa[2])+nb.*(2-2*pa[1]-2*pa[2])*log(2*pa[2]*(1-pa[1]-pa[2]))/(2-pa[2]-2*pa[1])+nab*log(2*pa[1]*pa[2])
  M<-function(x){
    P1<-2*pa[1]*na./((2-pa[1]-2*pa[2])*x[1])-2*noo/(1-x[1]-x[2])+na.*(2-2*pa[1]-2*pa[2])*(1-2*x[1]-x[2])/((2-pa[1]-2*pa[2])*x[1]*(1-x[1]-x[2]))-nb.*(2-2*pa[1]-2*pa[2])/((2-pa[2]-2*pa[1])*(1-x[1]-x[2]))+nab/x[1]
    P2<-2*pa[2]*nb./((2-pa[2]-2*pa[1])*x[2])-2*noo/(1-x[1]-x[2])-na.*(2-2*pa[1]-2*pa[2])/((2-pa[1]-2*pa[2])*(1-x[1]-x[2]))+nb.*(2-2*pa[1]-2*pa[2])*(1-2*x[2]-x[1])/((2-pa[2]-2*pa[1])*x[2]*(1-x[1]-x[2]))+nab/x[2]
    c(P1=P1,P2=P2)
  }
  S<-multiroot(f=M,start=c(.1,.1))
  pa<-S$root
  # update p and q
  if (sum(abs(pa-pa.old)/pa.old)<tol) break
  pa.old<-pa
}
plot(1:j,E[1:j],xlab=' ',ylab=' ','l',main='log-maximum likelihood') # plot only the iterations actually run
cat("p=",pa[1])
cat("\nq=",pa[2])

2019.12.13

Question

1.

Use both for loops and lapply() to fit linear models to the mtcars using the formulas stored in this list:
formulas <- list(
mpg ~ disp,
mpg ~ I(1 / disp),
mpg ~ disp + wt,
mpg ~ I(1 / disp) + wt
)

2.

Fit the model mpg ~ disp to each of the bootstrap replicates of mtcars in the list below by using a for loop and lapply(). Can you do it without an anonymous function?
bootstraps <- lapply(1:10, function(i) {
rows <- sample(1:nrow(mtcars), rep = TRUE)
mtcars[rows, ]
})

3.

For each model in the previous two exercises, extract R2 using the function below.
rsq <- function(mod) summary(mod)$r.squared

4.

The following code simulates the performance of a t-test for non-normal data. Use sapply() and an anonymous function to extract the p-value from every trial.
trials <- replicate(
100,
t.test(rpois(10, 10), rpois(7, 10)),
simplify = FALSE
)

5.

Implement mcsapply(), a multicore version of sapply(). Can you implement mcvapply(), a parallel version of vapply()? Why or why not?

Answer

1.

data("mtcars")
attach(mtcars)
formulas <- list(
  mpg ~ disp,
  mpg ~ I(1 / disp),
  mpg ~ disp + wt,
  mpg ~ I(1 / disp) + wt
)

#using function "lapply"
m1<-lapply(formulas, lm)
print(m1)

#using loop
for (x in formulas) {
  print(lm(x))
}

2.

If we do not want to use an anonymous function, we can call "lapply" with extra arguments (passing the formula directly), or we can bootstrap only the "mpg" column and its corresponding "disp" column, as shown below. Both approaches avoid an anonymous function.

set.seed(123)
bootstraps <- lapply(1:10, function(i) {
  rows <- sample(1:nrow(mtcars), rep = TRUE)
  mtcars[rows,]
})
#using function "lapply"
m2<-lapply(seq_along(bootstraps),function(i) {lm(bootstraps[[i]]$mpg~bootstraps[[i]]$disp)})
print(m2)
#using loop
for (x in bootstraps) {
  print(lm(mpg ~ disp, data = x))
}



#without an anonymous function 
m2<-lapply(bootstraps, lm, formula=mpg~disp)
print(m2)
# another way that also avoids an anonymous function
bootstraps1 <- lapply(1:10, function(i) {
  rows <- sample(1:nrow(mtcars), rep = TRUE)
  mtcars[rows,c(1,3)]
})
#using function "lapply" (without an anonymous function)
lapply(bootstraps1, lm)
#using loop(without an anonymous function)
for (x in bootstraps1) {
  print(lm(x))
}

3.

rsq <- function(mod) summary(mod)$r.squared
#R^2 in question 1
lapply(m1, rsq)
#R^2 in question 2
lapply(m2, rsq)

detach(mtcars)

4.

set.seed(123)
trials <- replicate(
  100,
  t.test(rpois(10, 10), rpois(7, 10)),
  simplify = FALSE
)
sapply(seq_along(trials),function(i) {round(trials[[i]]$p.value,4)})

5.

Just as "sapply" is built on "lapply", we can build "mcsapply" on a parallel lapply ("mclapply"/"parLapply") and use it in the same way.
However, I do not think "mcvapply()" is as practical, because "vapply" must check every result against its FUN.VALUE template, which is not easy to enforce when the work is split across several cores at once.

library(parallel)

num_cores = detectCores()
cluster = makePSOCKcluster(num_cores)

mcsapply = function(cluster,X,FUN,...){
 res = parLapply(cluster,X,FUN,...) 
 simplify2array(res)
}

system.time(mcsapply(cluster, 1:10, function(i) Sys.sleep(1)))
system.time(sapply(1:10, function(i) Sys.sleep(1)))
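# sanity check (illustrative): mcsapply should give the same simplified result as sapply
identical(mcsapply(cluster, 1:10, sqrt), sapply(1:10, sqrt))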

stopCluster(cluster)

2019.12.20

Question

You have already written an R function for Exercise 9.4 (page 277, Statistical Computing with R). Rewrite an Rcpp function for the same task.
(1) Compare the random numbers generated by the two functions using qqplot.
(2) Compare the computation time of the two functions with microbenchmark.
(3) Comment on your results.

Answer

Both functions are shown in the code below. Comparing the results of the two functions and their qqplots, the chains they generate are similar (when sigma is small the chain mixes poorly, and the two chains may then differ). Comparing the computation times, the Rcpp function is clearly more efficient than the R function for every sigma. Thus the Rcpp function performs better than the R function in this case.

library(Rcpp)
library(microbenchmark)
#define cpp function
cppFunction('NumericMatrix rwMC(double sigma, double x0, int N){
  NumericMatrix x(N,2);
  x(0,0) = x0;
  x(0,1) = 1;
  NumericVector u(N);
  u=runif(N);
  for (int i=1;i<N; i++) {
    double y = rnorm(1, x(i-1,0), sigma)[0];
    double t = exp(-std::fabs(y)) / exp(-std::fabs(x(i-1,0)));
      if (u[i] <= t) {
        x(i,0) = y; 
        x(i,1) = 1;
      }else {
        x(i,0) = x(i-1,0);
        x(i,1) = 0;
      }
  };
  return x;
}')
#define R function
rwMR <- function(sigma, x0, N){
  x <- numeric(N)
  x[1] <- x0
  u <- runif(N)
  k <- 0
  for (i in 2:N) {
    y <- rnorm(1, x[i-1], sigma)
    if (u[i] <= (exp(-abs(y)) / exp(-abs(x[i-1])))) x[i] = y 
    else {
      x[i] = x[i-1]
      k = k+1
    }
  }
  return(list(x = x, k = k))
}
# the plot with sigma=2
out1<-rwMC(2,20,2000)
out2<-rwMR(2,20,2000)
plot(out1[,1], type="l", main='rwMC(sigma=2)', ylab="x",xlab = "")
plot(out2$x,type="l", main='rwMR(sigma=2)', ylab="x",xlab = "")
#compare the generated random numbers
qqplot(out1[,1],out2$x,xlab = 'rwMR',ylab = 'rwMC',main='qqplot of two functions(sigma=2)')
abline(a=0,b=1)
#compare the computation time
ts <- microbenchmark(r=rwMR(2,20,2000),cpp=rwMC(2,20,2000))
summary(ts)

# the plot with sigma=10
out1<-rwMC(10,20,2000)
out2<-rwMR(10,20,2000)
plot(out1[,1], type="l", main='rwMC(sigma=10)', ylab="x",xlab = "")
plot(out2$x,type="l", main='rwMR(sigma=10)', ylab="x",xlab = "")
#compare the generated random numbers
qqplot(out1[,1],out2$x,xlab = 'rwMR',ylab = 'rwMC',main='qqplot of two functions(sigma=10)')
abline(a=0,b=1)
#compare the computation time
ts <- microbenchmark(r=rwMR(10,20,2000),cpp=rwMC(10,20,2000))
summary(ts)

# the plot with sigma=0.5
out1<-rwMC(0.5,20,2000)
out2<-rwMR(0.5,20,2000)
plot(out1[,1], type="l", main='rwMC(sigma=0.5)', ylab="x",xlab = "")
plot(out2$x,type="l", main='rwMR(sigma=0.5)', ylab="x",xlab = "")
#compare the generated random numbers
qqplot(out1[,1],out2$x,xlab = 'rwMR',ylab = 'rwMC',main='qqplot of two functions(sigma=0.5)')
abline(a=0,b=1)
#compare the computation time
ts <- microbenchmark(r=rwMR(0.5,20,2000),cpp=rwMC(0.5,20,2000))
summary(ts)

