Question

  • Exercises 3.5

  • A discrete random variable X has probability mass function

| x    |   0 |   1 |   2 |   3 |   4 |
|-----:|----:|----:|----:|----:|----:|
| p(x) | 0.1 | 0.2 | 0.2 | 0.2 | 0.3 |

  • Use the inverse transform method to generate a random sample of size 1000 from the distribution of X. Construct a relative frequency table and compare the empirical with the theoretical probabilities. Repeat using the R sample function.

Answer

set.seed(1)
x<-c(0:4) ; p<-c(0.1,0.2,0.2,0.2,0.3)
cp<-cumsum(p); n<-1000
r<-x[findInterval(runif(n),cp)+1] # the sample of size 1000 generated by the inverse transform method
set.seed(1)
y<-sample(x,size=n,replace = TRUE,prob=p)
counts <- table(c(rep(0,n),rep(1,n)),c(r, y))
barplot(counts, main="The distribution of X",
        xlab="X",ylab='Count', col=c("darkblue","red"),
        legend = c('Inverse','sample'), beside=TRUE) #compare result
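
The exercise also asks for a relative frequency table; a minimal sketch using the objects r, y, x, p and n defined above:

freq <- rbind(theoretical = p,
              inverse     = as.numeric(table(factor(r, levels = x)))/n,
              sample      = as.numeric(table(factor(y, levels = x)))/n)
colnames(freq) <- x
round(freq, 3) # empirical relative frequencies vs theoretical probabilities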

Question

  • Exercises 3.7

  • Write a function to generate a random sample of size n from the Beta(a, b) distribution by the acceptance-rejection method. Generate a random sample of size 1000 from the Beta(3,2) distribution. Graph the histogram of the sample with the theoretical Beta(3,2) density superimposed.

Answer

  • Beta(a,b) has pdf $f(x)=\frac{1}{B(a,b)}x^{a-1}(1-x)^{b-1},\ 0<x<1$
  • The proposal density is $g(x)\sim U(0,1),\ 0<x<1$, and taking $c$ as the maximum value of the Beta(a,b) pdf makes the acceptance rate ($1/c$) as large as possible
r.beta<-function(a,b,size){
  k<-0; j<-0; y<-numeric(size)
  c<-dbeta((a-1)/(a+b-2),a,b) # the maximum of the Beta(a,b) pdf, attained at the mode (a-1)/(a+b-2)
  while (k < size) {
    u <- runif(1)
    j <- j + 1
    x <- runif(1) # random variate from g
    rho.x <- dbeta(x,a,b)/c
    if (rho.x > u) {
      # accept x
      k <- k + 1
      y[k] <- x
    }
  }
  acp<-k/j # the empirical acceptance rate
  return(list(acp,y))
}

x<-r.beta(3,2,1000)
x[[1]] #The accept rate
hist(x[[2]],breaks=seq(0,1,0.05),freq = FALSE,main = "Histogram of beta(3,2)",xlab = "beta(3,2)")
y<-seq(0,1,0.01)
lines(y,dbeta(y,3,2),col="red")
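
As a quick check (an addition, not part of the original answer), each candidate is accepted with probability $1/c$, so the theoretical acceptance rate can be compared with the empirical rate returned above:

c.max <- dbeta((3-1)/(3+2-2), 3, 2) # the constant c for Beta(3,2)
c(theoretical = 1/c.max, empirical = x[[1]])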

Question

  • Exercises 3.12

  • Simulate a continuous Exponential-Gamma mixture. Suppose that the rate parameter $\Lambda$ has $Gamma(\gamma, \beta)$ distribution and Y has $\Lambda$ distribution. That is, $(Y |\Lambda = \lambda) \sim f_{Y}(y|\lambda) = \lambda e^{-\lambda y}$. Generate 1000 random observations from this mixture with $\gamma = 4$ and $\beta = 2$.

Answer

  • It can be shown, by integrating out $\lambda$ ($f_Y(y)=\int_0^\infty \lambda e^{-\lambda y}\,\frac{\beta^r}{\Gamma(r)}\lambda^{r-1}e^{-\beta\lambda}\,d\lambda$), that the Exponential-Gamma mixture has a Pareto distribution with pdf $f(y)=\frac{r\beta^{r}}{(\beta+y)^{r+1}},\ y>0$
n<-1000; r<-4; beta<-2
lambda<-rgamma(n,r,beta)
y<-rexp(n,lambda)
hist(y,breaks = seq(0,max(y)+1,0.1),freq = F)
f1<-function(r,beta,x){
  r*(beta^r)/((beta+x)^(r+1))
} #The pdf of Pareto distribution 
x<-seq(0,max(y)+1,0.01)
lines(x,f1(r,beta,x),col="red")
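
An additional check (not in the original answer): the cdf implied by the Pareto pdf above is $F(y)=1-\left(\frac{\beta}{\beta+y}\right)^{r},\ y>0$, so the empirical cdf of the simulated values can be compared with it at a few points:

F.pareto<-function(x,r,beta) 1-(beta/(beta+x))^r # theoretical Pareto cdf
q.pts<-c(0.5,1,2,5)
round(rbind(empirical=ecdf(y)(q.pts),theoretical=F.pareto(q.pts,r,beta)),3)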

Question

  • Exercises 5.4

    Write a function to compute a Monte Carlo estimate of the Beta(3, 3) cdf, and use the function to estimate F(x) for x=0 .1,0.2,...,0.9. Compare the estimates with the values returned by the pbeta function in R.

Answer

  • Because $\theta=\int_{0}^{x}\frac{1}{B(3,3)}t^{2}(1-t)^{2}dt=E_{Y}\left[\frac{x}{B(3,3)}Y^{2}(1-Y)^{2}\right]$ with $Y\sim U(0,x)$, in R:
set.seed(12)
beta_33<-function(x,m){
  n<-length(x)
  theta<-numeric(n)
  for(i in 1:n){
  y<-runif(m,0,x[i]) #sample Y
  theta[i]<-mean(x[i]*(y^2)*((1-y)^2)/beta(3,3))
}
  return(theta)
}

x<-seq(0.1,0.9,0.1)
m<-100000

x1<-beta_33(x,m)
x1<-round(x1,5) # MC estimates
x2<-pbeta(x,3,3)
a<-rbind(x,x1,x2)
rownames(a)<-c("X","MC","pbeta")
colnames(a)<-c("x1","x2","x3","x4","x5","x6","x7","x8","x9")
knitr::kable(a,format="markdown",caption = "Comparison of MC estimates with pbeta",align = "c")

Question

  • Exercises 5.9

    The Rayleigh density [156, (18.76)] is $$f(x)=\frac{x}{\sigma^2}e^{-x^2/(2\sigma^2)},\quad x\ge0,\ \sigma>0$$ Implement a function to generate samples from a Rayleigh($\sigma$) distribution, using antithetic variables. What is the percent reduction in variance of $\frac{X+X^\prime}{2}$ compared with $\frac{X_{1}+X_{2}}{2}$ for independent $X_{1},X_{2}$?

Answer

Because the cdf is $F(x;\sigma)=1-e^{-x^2/(2\sigma^2)},\ x\ge0$, given a random variate $U$ drawn from the uniform distribution on $(0,1)$, the variate $X=\sigma\sqrt{-2\ln U}$ has a Rayleigh distribution with parameter $\sigma$; this is the inverse transform sampling method. The antithetic variable is $X^{\prime}=\sigma\sqrt{-2\ln(1-U)}$.

 RL<-function(sigma,m,antithetic = TRUE){
  y<-runif(m)
  if(antithetic) v<-1-y else v<-runif(m)
  x1<-sigma*sqrt(-2*log(y))
  x2<-sigma*sqrt(-2*log(v))
  return(cbind(x1,x2))
 }
 sigma<-2 #the value of sigma
 m<-1000
 mc1<-RL(sigma,m) # antithetic samples
 mc2<-RL(sigma,m,antithetic = FALSE) #independent samples
 sd1<-sd(rowMeans(mc1))  
 sd2<-sd(rowMeans(mc2))  
 print(c(sd1,sd2,sd1/sd2))
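
The exercise asks for the percent reduction in variance; using sd1 and sd2 computed above, it can be reported as:

 100*(1-(sd1/sd2)^2) # percent reduction in variance of (X+X')/2 relative to (X1+X2)/2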

Question

  • Exercise 5.13

    Find two importance functions $f_{1}$ and $f_{2}$ that are supported on $(1,\infty)$ and are 'close' to $$g(x)=\frac{x^2}{\sqrt{2\pi}}e^{-x^2/2},x>1$$ Which of your two importance functions should produce the smaller variance in estimating $$\int_{1}^{\infty} \frac{x^2}{\sqrt{2\pi}}e^{-x^2/2}dx$$ by importance sampling? Explain.

Answer

We can choose $f_{1}(x)=e^{-(x-1)},\ x\ge1$ and $f_{2}(x)=\frac{1}{x^2},\ x\ge1$ as the importance functions for $g(x)$.

 g<-function(x) (x^2)*exp(-x^2/2)/sqrt(2*pi)
 f1<-function(x) exp(-(x-1))
 f2<-function(x) 1/x^2
 x<-seq(1,10,.01)
 gs<-c(expression(g(x)==x^2*e^{-x^2/2}/sqrt(2*pi)),expression(f1(x)==e^-(x-1)),expression(f2(x)==1/x^2))

 plot(x,g(x), type = "l", ylab = "",ylim = c(0,1), lwd = 0.25,col=1,main='(A)')
 lines(x, f1(x), lty = 2, lwd = 0.25,col=2)
 lines(x, f2(x), lty = 3, lwd = 0.25,col=3)
 legend("topright", legend = gs, lty = 1:3, lwd = 0.25, inset = 0.01,col=1:3)

Because $var\left(\frac{g(X)}{f(X)}\right)=E\left[\left(\frac{g(X)}{f(X)}\right)^2\right]-\left[E\frac{g(X)}{f(X)}\right]^2=\int\left(\frac{g(x)}{f(x)}\right)^2f(x)dx-\left(\int\frac{g(x)}{f(x)}f(x)dx\right)^2=\int\frac{g(x)^2}{f(x)}dx-\left(\int g(x)dx\right)^2$, and the last term does not depend on $f$, we only need to compare $\frac{g(x)^2}{f(x)}$.

 gs1<-c(expression(g^2/f1),expression(g^2/f2))
 plot(x,g(x)^2/f1(x),type="l", ylab = "",ylim = c(0,0.3),lwd = 0.25,col=1,main='(B)')
 lines(x,g(x)^2/f2(x), lty = 2, lwd = 0.25,col=2)
 legend("topright", legend = gs1,lty = 1:2, lwd = 0.25, inset=0.1,col=1:2) 

Figure (B) shows that $\frac{g^2}{f_{1}}$ is smaller and flatter than $\frac{g^2}{f_{2}}$, so $f_{1}$ should give the smaller variance. This is confirmed in Exercise 5.14.

Question

  • Exercise 5.14

    Obtain a Monte Carlo estimate of $$\int_{1}^{\infty}\frac{x^2}{\sqrt{2\pi}}e^{-x^2/2}dx$$ by importance sampling

Answer

 m<-1000
 n<-10000
 result1<-numeric(m)
 for(i in 1:m){
 y1<-rexp(n,1)+1
 result1[i]<-mean(g(y1)/f1(y1))
 }
 m1<-mean(result1)
 s1<-sd(result1)

 result2<-numeric(m)
 for(i in 1:m){
   u<-runif(n)
   y2<-1/(1-u)
   result2[i]<-mean(g(y2)/f2(y2))
 } 
 m2<-mean(result2)
 s2<-sd(result2)
 b<-matrix(c(m1,s1,m2,s2),2,2)
 colnames(b)<-c("f1","f2")
 rownames(b)<-c("theta","se")
 knitr::kable(b,format="markdown",caption = "Comparison of the two importance functions",align = "c")

From the table we can see that $f_{1}$ gives the smaller standard error, as expected.
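
As an independent check (an addition, not from the original answer), integration by parts gives the exact value $\int_{1}^{\infty}x^{2}\phi(x)dx=\phi(1)+1-\Phi(1)$, where $\phi$ and $\Phi$ are the standard normal pdf and cdf:

 dnorm(1)+pnorm(1,lower.tail=FALSE) # exact value, approximately 0.4006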

Question

  • Exercises 6.9

    Let $X$ be a non-negative random variable with $\mu=E[X]<\infty$. For a random sample $x_1,\cdots,x_n$ from the distribution of $X$, the Gini ratio is defined by $$G=\frac{1}{2n^2\mu}\sum_{j=1}^{n}\sum_{i=1}^{n}|x_i-x_j|$$ The Gini ratio is applied in economics to measure inequality in income distribution (see e.g. [163]). Note that $G$ can be written in terms of the order statistics $x_{(i)}$ as $$G=\frac{1}{n^2\mu}\sum_{i=1}^{n}(2i-n-1)x_{(i)}$$ If the mean is unknown, let $\hat{G}$ be the statistic $G$ with $\mu$ replaced by $\bar{x}$. Estimate by simulation the mean, median and deciles of $\hat{G}$ if $X$ is standard lognormal. Repeat the procedure for the uniform distribution and Bernoulli(0.1). Also construct density histograms of the replicates in each case.

Answer

G<-function(x){
  n<-length(x)
  a<-seq(1-n,n-1,2)
  x.i<-sort(x)
  G.hat<-sum(a*x.i)/(n*n*mean(x))
  return(G.hat)
} # computes the Gini ratio estimate G.hat for a given sample x

set.seed(1012)
# if X is standard lognormal
n=500
m<-1000
G.hat1<-numeric(m)
for(i in 1:m){
  y<-rnorm(n)
  x<-exp(y) # then x is standard lognormal
  G.hat1[i]<-G(x)
}
result1<-c(mean(G.hat1),median(G.hat1),quantile(G.hat1,probs=seq(0.1,0.9,0.1)))
names(result1)<-c("mean","median",paste0("decile.",1:9))
print(result1)
hist(G.hat1,breaks=seq(min(G.hat1)-0.01,max(G.hat1)+0.01,0.01),freq =F,main = "Histogram of G",xlab = "standard lognormal")

# if X is uniform
G.hat2<-numeric(m)
for(i in 1:m){
  x<-runif(n) # then x is uniform
  G.hat2[i]<-G(x)
}
result2<-c(mean(G.hat2),median(G.hat2),quantile(G.hat2,probs=seq(0.1,0.9,0.1)))
names(result2)<-c("mean","median",paste0("decile.",1:9))
print(result2)
hist(G.hat2,breaks =seq(min(G.hat2)-0.01,max(G.hat2)+0.01,0.01) ,freq =F,main = "Histogram of G",xlab = "uniform")

#if x is  Bernoulli(0.1)
G.hat3<-numeric(m)
for(i in 1:m){
  x<-rbinom(n,1,0.1) # then x is Bernoulli(0.1)
  G.hat3[i]<-G(x)
}
result3<-c(mean(G.hat3),median(G.hat3),quantile(G.hat3,probs=seq(0.1,0.9,0.1)))
names(result3)<-c("mean","median",paste0("decile.",1:9))
print(result3)
hist(G.hat3,breaks=seq(min(G.hat3)-0.01,max(G.hat3)+0.01,0.01),freq =F,main = "Histogram of G",xlab = "Bernoulli(0.1)")
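
For reference (added here; these closed-form values are not part of the original answer), the population Gini ratios are $2\Phi(\sigma/\sqrt{2})-1\approx0.52$ for the standard lognormal, $1/3$ for Uniform(0,1) and $1-p=0.9$ for Bernoulli(0.1), so the simulated means can be checked against them:

theoretical<-c(lognormal=2*pnorm(1/sqrt(2))-1,uniform=1/3,bernoulli=0.9)
simulated<-c(lognormal=mean(G.hat1),uniform=mean(G.hat2),bernoulli=mean(G.hat3))
round(rbind(theoretical,simulated),3)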

Question

  • Exercises 6.10

    Construct an approximate 95% confidence intervalfor the Gini ratio $\gamma=E[G]$ if $X$ is lognormal with unknown parameters. Assess the coverage rate of the estimation procedure with a Monte Carlo experiment.

Answer

When we have the simulated values $G_1,\dots,G_m$, we use $\hat\gamma=\bar{G}$ and $\hat\sigma_G^2=\frac{1}{m-1}\sum_{i=1}^{m}(G_i-\bar{G})^2$ for estimation, and the approximate 95% confidence interval used below is $(\bar{G}-\hat{\sigma}_G\,t_{m-1}(\alpha/2),\ \bar{G}+\hat{\sigma}_G\,t_{m-1}(\alpha/2))$.

n<-500
m<-1000
mu<-1
sigma<-1
G1<-function(n,mu,sigma){
  y<-rnorm(n,mu,sigma)
  x<-exp(y)
  G.sample<-G(x)
  return(G.sample)
}
G.sp<-numeric(m)
for(i in 1:m){
G.sp[i]<-G1(n,mu,sigma)
}
#the approximate 95% confidence interval is
CI<-c(mean(G.sp)-sd(G.sp)*qt(0.975,m-1),mean(G.sp)+sd(G.sp)*qt(0.975,m-1))
print(CI)
cover.rate<-sum(I(G.sp>CI[1]&G.sp<CI[2]))/m
print(cover.rate)

Question

  • Exercises 6.B

    Tests for association based on Pearson product moment correlation $\rho$, Spearman's rank correlation coefficient $\rho_{s}$ or Kendall's coefficient $\tau$, are implemented in cor.test. Show (empirically) that the nonparametric tests based on $\rho_s$ or$\tau$ are less powerful than the correlation test when the sampled distribution is bivariate normal. Find an example of an alternative (a bivariate distribution (X,Y) such that X and Y are dependent) such that at least one of the nonparametric tests have better empirical power than the correlation test against this alternative.

Answer

We choose $$(X,Y)^{\prime}\sim N(\mathbf{0},\Sigma),\quad \Sigma= \left[ \begin{matrix} 1 & 0.2 \\ 0.2 & 1 \end{matrix}\right]$$ so the correlation of X and Y is 0.2.

library(MASS)
m<-1000
sigma1<-matrix(c(1,0.2,0.2,1),2,2)
p.value1<-p.value2<-p.value3<-numeric(m)
for(i in 1:1000){
x1<-mvrnorm(20,mu=c(0,0),Sigma = sigma1)
p.value1[i]<-cor.test(x1[,1],x1[,2],method = "pearson")$p.value
p.value2[i]<-cor.test(x1[,1],x1[,2],method = "spearman")$p.value
p.value3[i]<-cor.test(x1[,1],x1[,2],method = "kendall")$p.value
}
#the power
power<-c(mean(p.value1<=0.05),mean(p.value2<=0.05),mean(p.value3<=0.05))
names(power)<-c("pearson","spearman","kendall")
print(power)

We can see that the Pearson correlation test is the most powerful of the three under the bivariate normal alternative.
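
For the second part of the exercise, one possible alternative (a sketch added here, not taken from the original answer) is a monotone dependence contaminated by heavy-tailed noise, $Y=X+\varepsilon$ with $\varepsilon\sim t(1)$; the rank-based tests are robust to the heavy tails while the Pearson test loses power:

set.seed(2)
p.alt<-matrix(0,m,3)
for(i in 1:m){
x2<-rnorm(20)
y2<-x2+rt(20,df=1) # dependent, but with heavy-tailed noise
p.alt[i,1]<-cor.test(x2,y2,method = "pearson")$p.value
p.alt[i,2]<-cor.test(x2,y2,method = "spearman")$p.value
p.alt[i,3]<-cor.test(x2,y2,method = "kendall")$p.value
}
power.alt<-colMeans(p.alt<=0.05)
names(power.alt)<-c("pearson","spearman","kendall")
print(power.alt)

Under this alternative the Spearman and Kendall tests typically attain higher empirical power than the Pearson test.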

Question

  • Exercises 7.1

Compute a jackknife estimate of the bias and the standard error of the correlation statistic in Example 7.2.

Answer

library(boot)
library(bootstrap)
library(DAAG)
set.seed(1)
LSAT<-c(576, 635, 558, 578,666, 580, 555, 661, 651, 605, 653, 575, 545, 572, 594)
GPA<-c(339, 330, 281, 303, 344, 307, 300, 343, 336, 313, 312, 274, 276, 288, 296)
x<-cbind(LSAT,GPA)
n <-length(GPA)
R.hat <- cor(x[,1],x[,2]) 
R.jack <- numeric(n) 
for(i in 1:n)
  { R.jack[i] <-cor(x[-i,1],x[-i,2]) } 
bias.jack <- (n-1)*(mean(R.jack)-R.hat) 
se.jack <- sqrt((n-1)*mean((R.jack-R.hat)^2)) 
round(c(original=R.hat,bias=bias.jack, se=se.jack),3)

Question

  • Exercises 7.5

Refer to Exercise 7.4. Compute 95% bootstrap confidence intervals for the mean time between failures $1/\lambda$ by the standard normal, basic, percentile, and BCa methods. Compare the intervals and explain why they may differ.

Answer

We can use the sample mean to estimate $1/\lambda$.

y<-c(3,5,7,18,43,85,91,98,100,130,230,487)
boot.mean <- function(y,i) mean(y[i]) 
de <- boot(data=y,statistic=boot.mean, R =1000) 
ci <- boot.ci(de,type=c("norm","basic","perc","bca")) 
ci$norm[2:3]
ci$basic[4:5]
ci$percent[4:5]
ci$bca[4:5]

The intervals differ because each method approximates the sampling distribution of the statistic differently: the normal interval relies on a normal approximation, the basic and percentile intervals use quantiles of the bootstrap replicates in different ways, and the BCa interval additionally adjusts for bias and skewness.

Question

  • Exercises 7.8

Refer to Exercise 7.7. Obtain the jackknife estimates of bias and standard error of $\hat{\theta}$.

Answer

attach(scor)
sigma.hat<-cov(scor)
n<-nrow(scor)
value<-eigen(sigma.hat)$values # all eigenvalues of the sample covariance matrix
theta.hat<-value[1]/sum(value)
theta.jack<- numeric(n) 
for(i in 1:n){ 
  sigma.1<-cov(scor[-i,])
  value.1<-eigen(sigma.1)$values
  theta.jack[i] <-value.1[1]/sum(value.1) 
  } 
bias.jack <- (n-1)*(mean(theta.jack)-theta.hat) 
se.jack <- sqrt((n-1)*mean((theta.jack-theta.hat)^2)) 
round(c(original=theta.hat,bias=bias.jack, se=se.jack),3)

Question

  • Exercises 7.11

In Example 7.18, leave-one-out (n-fold) cross validation was used to select the best fitting model. Use leave-two-out cross validation to compare the models.

Answer

attach(ironslag)
n <- length(magnetic) #in DAAG ironslag

e1 <- e2 <- e3 <- e4<- matrix(0,n*(n-1)/2,2)

# for n*(n-1)/2-fold cross validation
# fit models on leave-two-out samples 
i=0 
for (k in 1:(n-1)) { 
  for (j in (k+1):n) {
  i=i+1  

  y <- magnetic[-c(k,j)] 
  x <- chemical[-c(k,j)]

 J1 <- lm(y ~ x) 
 yhat1 <- J1$coef[1] + J1$coef[2] * chemical[c(k,j)] 
 e1[i,] <- magnetic[c(k,j)] - yhat1

 J2 <- lm(y ~ x + I(x^2)) 
 yhat2 <- J2$coef[1] + J2$coef[2] * chemical[c(k,j)] + J2$coef[3] * chemical[c(k,j)]^2 
 e2[i,] <- magnetic[c(k,j)] - yhat2

 J3 <- lm(log(y) ~ x) 
 logyhat3 <- J3$coef[1] + J3$coef[2] * chemical[c(k,j)] 
 yhat3 <- exp(logyhat3) 
 e3[i,] <- magnetic[c(k,j)] - yhat3

 J4 <- lm(log(y) ~ log(x)) 
 logyhat4 <- J4$coef[1] + J4$coef[2] * log(chemical[c(k,j)]) 
 yhat4 <- exp(logyhat4) 
 e4[i,] <- magnetic[c(k,j)] - yhat4
 }
}

c(mean(e1^2),mean(e2^2),mean(e3^2),mean(e4^2))

According to the prediction error criterion, Model 2, the quadratic model, would be the best fit for the data.

Question

  • Exercises 8.1

Implement the two-sample Cramer-von Mises test for equal distributions as a permutation test. Apply the test to the data in Examples 8.1 and 8.2.

  • Design experiments
    • Design experiments for evaluating the performance of the NN, energy, and ball methods in various situations.

      • Unequal variances and equal expectations
      • Unequal variances and unequal expectations
      • Non-normal distributions: t distribution with 1 df (heavy-tailed distribution), bimodal distribution (mixture of two normal distributions)
      • Unbalanced samples (say, 1 case versus 10 controls)

      • Note: The parameters should be chosen such that the powers are distinguishable (say, range from 0.3 to 0.9).

Answer

library(survival)
library(mvtnorm)
library(gam)
library(splines)
library(foreach)
library(RANN)
library(energy)
library(boot)
library(Ball)
set.seed(1)
attach(chickwts)
x <- sort(as.vector(weight[feed == "soybean"])) 
y <- sort(as.vector(weight[feed == "linseed"])) 
detach(chickwts)

CramerVonMisesTwoSamples <- function(x1,x2){
  Fx1<-ecdf(x1)
  Fx2<-ecdf(x2)
  n<-length(x1)
  m<-length(x2)
  w1<-sum((Fx1(x1)-Fx2(x1))^2)+sum((Fx1(x2)-Fx2(x2))^2) 
  w2<-w1*m*n/((m+n)^2)
  return(w2)
}  #get the Cramer-von Mises statistic

R <- 999
z <- c(x, y)
K<- 1:length(z) # indices of the pooled sample
n<-length(x)
reps <- numeric(R)
t0 <-  CramerVonMisesTwoSamples (x,y)
for (i in 1:R) {
  k<- sample(K, size = n, replace = FALSE) 
  x1 <- z[k]
  y1 <- z[-k] #complement of x1 
  reps[i] <- CramerVonMisesTwoSamples (x1,y1)
  } 
  p <- mean(abs(c(t0, reps)) >= abs(t0)) 
  round(p,3)

  hist(reps, main = "", freq = FALSE, xlab = "Cramer-von Mises statistic", breaks = "scott")

  points(t0, 0, cex = 1, pch = 16) #observed T

Design experiments

m<-1e2;k<-3;p<-2;mu<-0.5;R<-999;
p.values <- matrix(0,m,3)

Tn<-function(z,ix,sizes,k){
n1<-sizes[1];n2<-sizes[2];n<-n1+n2
if(is.vector(z)) z<-data.frame(z,0)
z<-z[ix, ]
NN<-nn2(data=z,k=k+1)
block1<-NN$nn.idx[1:n1,-1] 
block2<-NN$nn.idx[(n1+1):n,-1] 
i1<-sum(block1 <= n1)
i2<-sum(block2 >n1) 
(i1+i2)/(k*n)
}

eqdist.nn<-function(z,sizes,k){
boot.obj<-boot(data=z,statistic=Tn,R=R,sim="permutation",sizes=sizes,k=k)
ts<-c(boot.obj$t0,boot.obj$t)
p.value<-mean(ts>=ts[1])
list(statistic=ts[1],p.value=p.value)
}

Unequal variances and equal expectations

n1<-n2<-50;n<-n1+n2;N=c(n1,n2)  
for(i in 1:m){
x<-rnorm(n1,0,1)
y<-rnorm(n2,0,2)
z<-c(x,y)
p.values[i,1]<-eqdist.nn(z,N,k)$p.value
p.values[i,2]<-eqdist.etest(z,sizes=N,R=R)$p.value
p.values[i,3]<-bd.test(x=x,y=y,R=999,seed=i*12345)$p.value
}
alpha<-0.1
pow<-colMeans(p.values<alpha)
pow

Unequal variances and unequal expectations

n1<-n2<-50;n<-n1+n2;N=c(n1,n2)  
for(i in 1:m){
x<-rnorm(n1,0,1)
y<-rnorm(n2,1,2)
z<-c(x,y)
p.values[i,1]<-eqdist.nn(z,N,k)$p.value
p.values[i,2]<-eqdist.etest(z,sizes=N,R=R)$p.value
p.values[i,3]<-bd.test(x=x,y=y,R=999,seed=i*12345)$p.value
}
alpha<-0.1
pow<-colMeans(p.values<alpha)
pow

t distribution with 1 df, bimodal distribution

n1<-n2<-50;n<-n1+n2;N=c(n1,n2)  
for(i in 1:m){
x<-rt(n1,1)
y<-c(rnorm(n2/2,5,1),rnorm(n2/2))
z<-c(x,y)
p.values[i,1]<-eqdist.nn(z,N,k)$p.value
p.values[i,2]<-eqdist.etest(z,sizes=N,R=R)$p.value
p.values[i,3]<-bd.test(x=x,y=y,R=999,seed=i*12345)$p.value
}
alpha<-0.1
pow<-colMeans(p.values<alpha)
pow

Unbalanced samples

n1<-10;n2<-100;n<-n1+n2;N=c(n1,n2) 
for(i in 1:m){
x<-rnorm(10)
y<-rnorm(100)
z<-c(x,y)
p.values[i,1]<-eqdist.nn(z,N,k)$p.value
p.values[i,2]<-eqdist.etest(z,sizes=N,R=R)$p.value
p.values[i,3]<-bd.test(x=x,y=y,R=999,seed=i*12345)$p.value
}
alpha<-0.1
pow<-colMeans(p.values<alpha) 
pow

Question

  • Exercises 9.3

Use the Metropolis-Hastings sampler to generate random variables from a standard Cauchy distribution. Discard the first 1000 of the chain, and compare the deciles of the generated observations with the deciles of the standard Cauchy distribution (see qcauchy or qt with df=1). Recall that a $Cauchy(\theta,\eta)$ distribution has density function $$f(x)=\frac{1}{\theta\pi(1+[(x-\eta)/\theta]^2)},\quad -\infty<x<\infty,\ \theta>0$$ The standard Cauchy has the $Cauchy(\theta=1,\eta=0)$ density. (Note that the standard Cauchy density is equal to the Student t density with one degree of freedom.)

Answer

The Cauchy distribution can be rewritten as $Cauchy(\lambda,\mu)$: $$f(x)=\frac{\lambda}{\pi(\lambda^2+(x-\mu)^2)},\quad -\infty<x<\infty,\ \lambda>0,\ -\infty<\mu<\infty$$ The standard Cauchy is $Cauchy(\lambda=1,\mu=0)$.

 f1<-function(x,u,lamada){
    pi<-3.14159265
    return(lamada/(pi*(lamada^2+(x-u)^2)))
} 

cauchy.chain<-function(n,sigma,x0,u,lamada){
    x<-NULL
    x[1]<-x0
    e=runif(n)
    k<-0
    for(i in 2:n){
      y<-rnorm(1,x[i-1],sigma)
      if(e[i]<=(f1(y,u,lamada)/f1(x[i-1],u,lamada)))
      {x[i]<-y}
      else
      {x[i]<-x[i-1]
      k<-k+1}
    }
    return(list(x=x,k=k))
  } 
  n<-10000
  cauchy<-cauchy.chain(n,sigma=0.5,x0=0,u=0,lamada=1)
  refuse.pr<-cauchy$k/n
  refuse.pr
  #qq plot
  par(mfrow=c(1,1))
  qqplot(rcauchy(n-1000,0,1),cauchy$x[1001:n])
  qqline(cauchy$x[1001:n])
  #hist
  hist(cauchy$x[1001:n],freq=F,main="cauchy",breaks=60)
  curve(f1(x,0,1),add=TRUE)
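
The decile comparison asked for in the exercise, using the chain after discarding the first 1000 values:

  deciles<-seq(0.1,0.9,0.1)
  round(rbind(chain=quantile(cauchy$x[1001:n],deciles),qcauchy=qcauchy(deciles)),3)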

Question

  • Exercises 9.6

Rao [220, Sec. 5g] presented an example on genetic linkage of 197 animals in four categories (also discussed in [67, 106, 171, 266]). The group sizes are (125,18,20,34). Assume that the probabilities of the corresponding multinomial distribution are $$(\frac{1}{2}+\frac{\theta}{4}, \frac{1-\theta}{4}, \frac{1-\theta}{4},\frac{\theta}{4})$$. Estimate the posterior distribution of $\theta$ given the observed sample, using one of the methods in this chapter.

Answer

Under a uniform prior on $\theta$, the posterior distribution of $\theta$ given $(x_1,x_2,x_3,x_4)=(125,18,20,34)$ is proportional to the multinomial likelihood $$Pr(\theta|x_1,x_2,x_3,x_4)\propto\frac{197!}{x_1!x_2!x_3!x_4!}p_1^{x_1}p_2^{x_2}p_3^{x_3}p_4^{x_4}$$ where $(p_1,p_2,p_3,p_4)=(\frac{1}{2}+\frac{\theta}{4}, \frac{1-\theta}{4}, \frac{1-\theta}{4},\frac{\theta}{4})$.

  w <-0.25 #width of the uniform support set 
  m <- 5000 #length of the chain 
  burn <- 1000 #burn-in time 
  y<-c(125,18,20,34)
  x <- numeric(m) #the chain

  prob <- function(b, y) { 
    # computes (without the constant) the target density 
    if (b < 0 || b >= 1) return (0)
    return((1/2+b/4)^y[1] * ((1-b)/4)^y[2] * ((1-b)/4)^y[3] * (b/4)^y[4])
  }

  u <- runif(m) #for accept/reject step
  v <- runif(m, -w, w) #proposal distribution 
  x[1] <-0.25 
  for (i in 2:m) { 
    z <- x[i-1] + v[i] 
    if (u[i] <= prob(z,y) / prob(x[i-1],y)) 
      x[i] <-z 
    else 
      x[i] <- x[i-1] 
  }

   xb <- x[(burn+1):m]
   xc<-mean(xb)
   print(xc)  #estimation value of theta
   print(y/sum(y))
   print(c(1/2+xc/4,(1-xc)/4,(1-xc)/4,xc/4))
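
To display the estimated posterior distribution itself (a small addition using the retained part of the chain):

   hist(xb, freq = FALSE, breaks = 30, main = "Posterior sample of theta", xlab = expression(theta))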

Question

  • Exercises 11.4

    Find the intersection points $A(k)$ in $(0,\sqrt{k})$ of the curves $$S_{k-1}(a)=P(t(k-1)>\sqrt{\frac{a^2(k-1)}{k-a^2}})$$ and $$S_{k}(a)=P(t(k)>\sqrt{\frac{a^2k}{k+1-a^2}})$$, for $k = 4:25 ,100,500,1000$, where $t(k)$ is a Student t random variable with k degrees of freedom. (These intersection points determine the critical values for a t-test for scale-mixture errors proposed by Szekely [260].)

Answer

For each $k$, the intersection point $A(k)$ is the root of $S(a)=S_{k-1}(a)-S_k(a)=0$ in $(0,\sqrt{k})$.

set.seed(1)
S.k<-function(a,k)  pt(sqrt(a^2*(k-1)/(k-a^2)),df=k-1,lower.tail=FALSE)-pt(sqrt(a^2*k/(k+1-a^2)),df=k,lower.tail=FALSE)
k<-c(4:25,100,500,1000)
root<-numeric(length(k))
j<-0
for(i in k){
j<-j+1  
solution<-uniroot(S.k,c(1e-5,sqrt(i)-(1e-5)),k=i)
root[j]<-solution$root
}
s<-cbind(k,root)
knitr::kable(s,format="markdown",caption = "Intersection points A(k)",align = "c")

Question

  • Exercises 9.6

For exercise 9.6, use the Gelman-Rubin method to monitor convergence of the chain, and run the chain until the chain has converged approximately to the target distribution according to $\hat{R}<1.2$.

Answer

set.seed(1)
Gelman.Rubin <- function(psi) { 
  # psi[i,j] is the statistic psi(X[i,1:j]) 
  # for chain in i-th row of X 
  psi <- as.matrix(psi) 
  n <- ncol(psi) 
  k <- nrow(psi)
  psi.means <- rowMeans(psi) #row means
  B <- n * var(psi.means) #between variance est. 
  psi.w <- apply(psi, 1, "var") #within variances 
  W <- mean(psi.w) #within est. 
  v.hat <- W*(n-1)/n + (B/n) #upper variance est. 
  r.hat <- v.hat / W #G-R statistic 
  return(r.hat) 
  }


prob <- function(b, y) { 
  # computes (without the constant) the target density 
  if (b < 0 || b >= 1) return (0)
  return((1/2+b/4)^y[1] * ((1-b)/4)^y[2] * ((1-b)/4)^y[3] * (b/4)^y[4])
}

theta.chain<-function(m,y,x1){
w <-0.25 #width of the uniform support set 
u <- runif(m) #for accept/reject step
v <- runif(m, -w, w) #proposal distribution 

x <- numeric(m) #the chain
x[1] <-x1 
for (i in 2:m) { 
  z <- x[i-1] + v[i] 
  if (u[i] <= prob(z,y) / prob(x[i-1],y)) 
    x[i] <-z 
  else 
    x[i] <- x[i-1] 
}
return(x)
}


b<- 1000 #burn-in time 
k<-4     #number of chains to generate
m <- 15000 #length of the chain 
y<-c(125,18,20,34)
x1 <- c(0.1,0.35,0.6,0.85)

#generate the chains 
x<- matrix(0, nrow=k, ncol=m) 
for (i in 1:k) 
  x[i, ] <- theta.chain(m,y,x1[i])

#compute diagnostic statistics 
psi <- t(apply(x, 1, cumsum)) 
for (i in 1:nrow(psi)) 
  psi[i,] <- psi[i,] / (1:ncol(psi)) 
print(Gelman.Rubin(psi))

#plot psi for the four chains 
par(mfrow=c(2,2),mar=c(2,2,2,2)) 
for (i in 1:k) 
  plot(psi[i, (b+1):m], type="l", xlab=i, ylab=bquote(psi)) 
par(mfrow=c(1,1)) #restore default

#plot the sequence of R-hat statistics 
rhat <- rep(0, m) 
for (j in (b+1):m) 
  rhat[j] <- Gelman.Rubin(psi[,1:j]) 
plot(rhat[(b+1):m], type="l", xlab="", ylab="R") 
abline(h=1.2, lty=2) # convergence threshold R.hat < 1.2 from the exercise

Question

  • Exercises 11.6

Write a function to compute the cdf of the Cauchy distribution, which has density $$f(x)=\frac{1}{\theta\pi(1+[(x-\eta)/\theta]^2)},\quad -\infty<x<\infty,\ \theta>0$$ Compare your results to the results from the R function pcauchy. (Also see the source code in pcauchy.c.)

Answer

library(nloptr)
f<-function(y,theta,eta){
  # theta is the scale parameter, eta is the location parameter
  1/(theta*pi*(1+((y-eta)/theta)^2))
}
# the Cauchy pdf

cauchy.cdf<-function(x,theta,eta,lower.tail=TRUE){
 if(lower.tail) res<-integrate(f,lower = -Inf,upper = x,rel.tol=.Machine$double.eps^0.25,theta=theta,eta=eta)
 else res<-integrate(f,lower = x,upper = Inf,rel.tol=.Machine$double.eps^0.25,theta=theta,eta=eta)
  return(res$value)
}
cauchy.cdf(x=0,theta = 1,eta = 0)
pcauchy(0,location = 0,scale = 1)

cauchy.cdf(x=2,theta = 2,eta = 1,lower.tail = FALSE)
pcauchy(2,location = 1,scale = 2,lower.tail = FALSE)

We get the same results as pcauchy in both cases.

Question

  • Exercises 3

  • A-B-O blood type problem

  • Let the three alleles be A, B, and O.
        dat <- rbind(Genotype=c('AA','BB','OO','AO','BO','AB','Sum'),
                     Frequency=c('p^2','q^2','r^2','2pr','2qr','2pq','1'),
                     Count=c('nAA','nBB','nOO','nAO','nBO','nAB','n'))
    knitr::kable(dat,format='markdown',caption = "Genotype frequencies and counts",align = "c")

Answer

We can see that the complete-data log-likelihood is $$l(p,q|n_{AA},n_{BB},n_{OO},n_{A\cdot},n_{B\cdot},n_{AB})=2n_{AA}\log(p)+2n_{BB}\log(q)+2n_{OO}\log(r)+(n_{A\cdot}-n_{AA})\log(2pr)+(n_{B\cdot}-n_{BB})\log(2qr)+n_{AB}\log(2pq)$$ where $r=1-p-q$. In the E-step the unobserved counts are replaced by their conditional expectations given the current estimates $(p_0,q_0,r_0)$, e.g. $E[n_{AA}|n_{A\cdot}]=n_{A\cdot}\frac{p_0^2}{p_0^2+2p_0r_0}$, and in the M-step we minimize $-E[l(p,q|n_{AA},n_{BB},n_{OO},n_{A\cdot},n_{B\cdot},n_{AB})]$.

# Mle function
eval_f0 <- function(x,x1,n.A=28,n.B=24,nOO=41,nAB=70) {
  #x[1] mean p , x1[1] mean p0
  #x[2] mean q , x1[2] mean q0
  r1<-1-sum(x1)
  nAA<-n.A*x1[1]^2/(x1[1]^2+2*x1[1]*r1)
  nBB<-n.B*x1[2]^2/(x1[2]^2+2*x1[2]*r1)
  r<-1-sum(x)
  return(-2*nAA*log(x[1])-2*nBB*log(x[2])-2*nOO*log(r)-
           (n.A-nAA)*log(2*x[1]*r)-(n.B-nBB)*log(2*x[2]*r)-nAB*log(2*x[1]*x[2]))
}


# constraint function 
eval_g0 <- function(x,x1,n.A=28,n.B=24,nOO=41,nAB=70) {
  return(sum(x)-0.999999)
}

opts <- list("algorithm"="NLOPT_LN_COBYLA",
             "xtol_rel"=1.0e-8)
mle<-NULL
r<-matrix(0,1,2)
r<-rbind(r,c(0.2,0.35))# the beginning value of p0 and q0
j<-2
while (sum(abs(r[j,]-r[j-1,]))>1e-8) {
res <- nloptr( x0=c(0.3,0.25),
               eval_f=eval_f0,
               lb = c(0,0), ub = c(1,1), 
               eval_g_ineq = eval_g0, 
               opts = opts, x1=r[j,],n.A=28,n.B=24,nOO=41,nAB=70 )
j<-j+1
r<-rbind(r,res$solution)
mle<-c(mle,eval_f0(x=r[j,],x1=r[j-1,]))
}
r  # the EM iterates of (p, q)
mle # the minimized negative expected complete-data log-likelihood at each iteration

Question

  • Exercises 3

    Use both for loops and lapply() to fit linear models to the mtcars using the formulas stored in this list: formulas <- list( mpg ~ disp, mpg ~ I(1 / disp), mpg ~ disp + wt, mpg ~ I(1 / disp) + wt )

Answer

set.seed(1)
attach(mtcars)
formulas <- list( mpg ~ disp, mpg ~ I(1 / disp), mpg ~ disp + wt, mpg ~ I(1 / disp) + wt )
#for loops
out <- vector("list", length(formulas))
for (i in seq_along(formulas)) { out[[i]] <-lm(formulas[[i]]) }
out

#lapply
lapply(formulas,lm)

Question

  • Exercises 4

Fit the model mpg ~ disp to each of the bootstrap replicates of mtcars in the list below by using a for loop and lapply(). Can you do it without an anonymous function? bootstraps <- lapply(1:10, function(i) { rows <- sample(1:nrow(mtcars), rep = TRUE) mtcars[rows, ] })

Answer

bootstraps <- lapply(1:10, function(i) {
  rows <- sample(1:nrow(mtcars), rep = TRUE)
  mtcars[rows, ] })

#for loops
for(i in seq_along(bootstraps)){
  print(lm(mpg~disp,data =bootstraps[[i]]))
}

#lapply
lapply(bootstraps,lm,formula=mpg~disp)

Question

  • Exercises 5

For each model in the previous two exercises,extract $R^2$ using the function below. rsq <- function(mod) summary(mod)$r.squared

Answer

#in exercise 3
rsq <- function(mod) summary.lm(mod)$r.squared
#for loops
for (i in seq_along(formulas)) {
 print( rsq(lm(formulas[[i]])))
  }
#lapply
lapply(lapply(formulas,lm),rsq)
#in exercise 4
#for loops
for(i in seq_along(bootstraps)){
  print(rsq(lm(mpg~disp,data =bootstraps[[i]])))
}

#lapply
lapply(lapply(bootstraps,lm,formula=mpg~disp),rsq)

Question

  • Exercises 3 on page 213

The following code simulates the performance of a t-test for non-normal data. Use sapply() and an anonymous function to extract the p-value from every trial. trials <- replicate( 100, t.test(rpois(10, 10), rpois(7, 10)), simplify = FALSE ) Extra challenge: get rid of the anonymous function by using [[ directly.

Answer

#using an anonymous function
trials <- replicate( 100, t.test(rpois(10, 10), rpois(7, 10)), simplify = FALSE )
sapply(trials, function(mod) mod$p.value)
#using [[ directly
sapply(trials, "[[", "p.value")

Question

  • Exercises 6 on page 214

    Implement a combination of Map() and vapply() to create an lapply() variant that iterates in parallel over all of its inputs and stores its outputs in a vector (or a matrix). What arguments should the function take?

Answer

x <- list(a = c(1:3), b = c(4:8))

lapply.f<-function(x,f,...){
   r<-Map(f,x,...)
   n<-length(r[[1]])
   return(vapply(r,as.vector,numeric(n)))
}
lapply.f(x,mean)
lapply.f(x,quantile)

Question

  • Exercises 4

Make a faster version of chisq.test() that only computes the chi-square test statistic when the input is two numeric vectors with no missing values. You can try simplifying chisq.test() or by coding from the mathematical definition (http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test).

Answer

library(microbenchmark)
set.seed(1)
my.chisq.statistic<-function(x,y){
  if(!is.vector(x) && !is.vector(y))
    stop("at least one of 'x' and 'y' is not a vector")
  if(typeof(x)=="character" || typeof(y)=="character")
    stop("at least one of 'x' and 'y' is not a numeric vector")
  if(any(x<0) || anyNA(x)) 
    stop("all entries of 'x' must be nonnegative and finite")
  if(any(y<0) || anyNA(y)) 
    stop("all entries of 'y' must be nonnegative and finite")
  if((n<-sum(x))==0) 
    stop("at least one entry of 'x' must be positive")
  if(sum(y)==0) 
    stop("at least one entry of 'y' must be positive")
  if(length(x)!=length(y)) 
    stop("'x' and 'y' must have the same length")
  DNAME<-paste(deparse(substitute(x)),"and",deparse(substitute(y)))
  METHOD<-"Pearson's Chi-squared test"
  x<-rbind(x,y)
  nr<-as.integer(nrow(x));nc<-as.integer(ncol(x))
  sr<-rowSums(x);sc<-colSums(x);n<-sum(x)
  E<-outer(sr,sc,"*")/n
  STATISTIC<-sum((x - E)^2/E)
  names(STATISTIC)<-"X-squared"
  structure(list(statistic=STATISTIC,method=METHOD,data.name=DNAME),class="htest")
}

x<-sample(100:200,1e4,replace = TRUE)
y<-sample(100:200,1e4,replace = TRUE)
m<-rbind(x,y)
my.chisq.statistic(x,y)$statistic
chisq.test(m)$statistic
ts<-microbenchmark(my.chisq=my.chisq.statistic(x,y)$statistic,chisq.test=chisq.test(m)$statistic)
summary(ts)[,c(1,3,5,6)]

Question

  • Exercises 5

Can you make a faster version of table() for the case of an input of two integer vectors with no missing values? Can you use it to speed up your chi-square test?

Answer

my.table<-function(...,dnn = list.names(...),deparse.level = 1){
    list.names <- function(...) {
        l <- as.list(substitute(list(...)))[-1L]
        nm <- names(l)
        fixup <- if (is.null(nm)) 
            seq_along(l)
        else nm == ""
        dep <- vapply(l[fixup], function(x) switch(deparse.level + 
            1, "", if (is.symbol(x)) as.character(x) else "", 
            deparse(x, nlines = 1)[1L]), "")
        if (is.null(nm)) 
            dep
        else {
            nm[fixup] <- dep
            nm
        }
    }
    args <- list(...)
    if (!length(args)) 
        stop("nothing to tabulate")
    if (length(args) == 1L && is.list(args[[1L]])) {
        args <- args[[1L]]
        if (length(dnn) != length(args)) 
            dnn <- if (!is.null(argn <- names(args))) 
                argn
            else paste(dnn[1L], seq_along(args), sep = ".")
    }
    bin <- 0L
    lens <- NULL
    dims <- integer()
    pd <- 1L
    dn <- NULL
    for (a in args) {
        if (is.null(lens)) 
            lens <- length(a)
        else if (length(a) != lens) 
            stop("all arguments must have the same length")
        fact.a <- is.factor(a)
        if (!fact.a) {
            a0 <- a
            a <- factor(a)
        }
        ll <- levels(a)
        a <- as.integer(a)
        nl <- length(ll)
        dims <- c(dims, nl)
        dn <- c(dn, list(ll))
        bin <- bin + pd * (a - 1L)
        pd <- pd * nl
    }
    names(dn) <- dnn
    bin <- bin[!is.na(bin)]
    if (length(bin)) 
        bin <- bin + 1L
    y <- array(tabulate(bin, pd), dims, dimnames = dn)
    class(y) <- "table"
    y
}

mya<-myb<-c(1,seq(1,4))            #example
my.table(mya,myb)
table(mya,myb)
microbenchmark(t1=my.table(mya,myb),t2=table(mya,myb)) 

