MCMCCV: Inference and model selection for analysis of geographical...

Description Usage Arguments Value Author(s) Examples

Description

MCMC inference and model selection by cross-validation for the analysis of geographical genetic variation

Usage

1
2
3
MCMCCV(gen, D_G, D_E,
                nit, thinning, theta.max, theta.init,
                run, ud, n.validation.set,print.pct)

Arguments

gen

An array with dimensions (n,l,a) n: number of geographical locations, l: number of loci, a: max number of alleles

D_G

A matrix of geograpical distances

D_E

A matrix of environmental distances

nit

Number of iterations

thinning

Thinning of MCMC iterations

theta.max

Upper bounds for the vector of parameters in theta. Note that in theta, the parameters are assumed to be in this order: (alpha,beta_G, beta_E, gamma, delta)

theta.init

Initial state of theta

run

A vector of booleans of length 3 stating which sub-model is investigated among G+E,G,E (in this order). For example, the default value run=c(TRUE,FALSE,FALSE) means that only one MCMC run will be performed to estimate paremeters in the G+E model while e.g. run=c(TRUE,FALSE,TRUE) means that MCMC runs will be performed for models G+E and E. If n.validation.set >0 likelihoods under these two models will be returned for model selection.

ud

A vector of booleans of length 5 stating which entries in theta=(alpha,beta_E,beta_G,gamma,delta) will be updated in the MCMC iterations. By default all parameters in theta will be updated. If one entry is not updated, the value used along the MCMC simulation for this parameter is the initial value.

n.validation.set

The number of pairs (sites x locus) used as validation set

print.pct

A boolean stating whether Fortran prints percentage of computation achieved along each MCMC run.

Value

A list with a component named mod.lik containing likelihoods on the validation set for the various models compared.

Author(s)

Filippo Botta, Gilles Guillot

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
## Not run: 
data(toydata,package='Sunder')
     
     #### Computing options
     nit <- 10^2
     run  <- c(1,1,1)
     thinning  <- 1 # max(nit/10^3,1);
     ud   <- c(0,1,1,0,0) 
     theta.init <- c(1,2,1,1,0.01)
     n.validation.set  <- dim(gen)[1]*dim(gen)[2]/10 
     theta.max  <- c(10,10*max(D_G),10*max(D_E),1,0.01)
     
     plot  <- TRUE
     trace <- FALSE
     
     #### Call Sunder ####
     output <- MCMCCV(gen,D_G,D_E,
                                nit,thinning,
                                theta.max,
                                theta.init,
                                run,ud,n.validation.set)
     
     mod.lik <- output$mod.lik
     tvt <- output$theta
     
     
     ## plotting outputs
     upd=matrix(nrow=sum(run), ncol=length(theta.init), data=1)
     upd[2,3]=upd[3,2]=0
     
     plot(as.vector(D_G),as.vector(cor(t(gen[,,1]))),
          bg=colorRampPalette(c("blue", "red"))(dim(D_E)[1]^2)[order(order(as.vector(D_E)))],
          pch=21,
          xlab='Geographic distance',
          ylab='Empirical covariance genotypes')
     
     
     kol=c(4,2,3) 
     xseq=seq(thinning,nit,thinning)
     ylab=c(expression(paste(alpha)),
            expression(paste(beta[D])),
            expression(paste(beta[E])),
            expression(paste(gamma)),
            expression(paste(delta)))
     
     par(mfrow=c(sum(run),length(theta.init)))
     for (j in 1:sum(run))
     {
       for(k in 1:length(theta.init))
       {
         if (sum(upd[,k]==1)>0)
         {
           if(upd[j, k]==1)
           {
             if(exists("theta"))
               ylim=c(min(tvt[k,,j],theta[k]),max(tvt[k,,j],theta[k])) else
                 ylim=c(min(tvt[k,,j]),max(tvt[k,,j]))
             plot(0, type="n",xlab="",ylab="", xlim=c(0,nit), ylim=ylim)
             lines(xseq,tvt[k,,j],col=kol[j],xlab="",ylab="")
             if(exists("theta")) abline(h=theta[k],lty=2)
             title(xlab="iterations")        
             mtext(ylab[k], side=2, line=2.3,las=1)} else plot.new()
         }
       }
     }

print(mod.lik)
print(paste('The model achieving the highest likelihood on the validation set is:',
            names(mod.lik)[order(mod.lik,decreasing=TRUE)[1]]))
theta.GE <- apply(output$theta[,,1],1,mean)
print('Posterior mean theta under model G+E:')
print(theta.GE)

theta.G <- apply(output$theta[,,2],1,mean)
theta.G[3] <- NA
print('Posterior mean theta under model G:')
print(theta.G)

theta.E <- apply(output$theta[,,3],1,mean)
theta.E[2] <- NA
print('Posterior mean theta under model E:')
print(theta.E)


## End(Not run)

Example output

Loading required package: mnormt
Computations for model 'geog+envi'
% of computations 
[1] 1
% of computations 
[1] 2
% of computations 
[1] 3
% of computations 
[1] 4
% of computations 
[1] 5
% of computations 
[1] 6
% of computations 
[1] 7
% of computations 
[1] 8
% of computations 
[1] 9
% of computations 
[1] 10
% of computations 
[1] 11
% of computations 
[1] 12
% of computations 
[1] 13
% of computations 
[1] 14
% of computations 
[1] 15
% of computations 
[1] 16
% of computations 
[1] 17
% of computations 
[1] 18
% of computations 
[1] 19
% of computations 
[1] 20
% of computations 
[1] 21
% of computations 
[1] 22
% of computations 
[1] 23
% of computations 
[1] 24
% of computations 
[1] 25
% of computations 
[1] 26
% of computations 
[1] 27
% of computations 
[1] 28
% of computations 
[1] 29
% of computations 
[1] 30
% of computations 
[1] 31
% of computations 
[1] 32
% of computations 
[1] 33
% of computations 
[1] 34
% of computations 
[1] 35
% of computations 
[1] 36
% of computations 
[1] 37
% of computations 
[1] 38
% of computations 
[1] 39
% of computations 
[1] 40
% of computations 
[1] 41
% of computations 
[1] 42
% of computations 
[1] 43
% of computations 
[1] 44
% of computations 
[1] 45
% of computations 
[1] 46
% of computations 
[1] 47
% of computations 
[1] 48
% of computations 
[1] 49
% of computations 
[1] 50
% of computations 
[1] 51
% of computations 
[1] 52
% of computations 
[1] 53
% of computations 
[1] 54
% of computations 
[1] 55
% of computations 
[1] 56
% of computations 
[1] 57
% of computations 
[1] 58
% of computations 
[1] 59
% of computations 
[1] 60
% of computations 
[1] 61
% of computations 
[1] 62
% of computations 
[1] 63
% of computations 
[1] 64
% of computations 
[1] 65
% of computations 
[1] 66
% of computations 
[1] 67
% of computations 
[1] 68
% of computations 
[1] 69
% of computations 
[1] 70
% of computations 
[1] 71
% of computations 
[1] 72
% of computations 
[1] 73
% of computations 
[1] 74
% of computations 
[1] 75
% of computations 
[1] 76
% of computations 
[1] 77
% of computations 
[1] 78
% of computations 
[1] 79
% of computations 
[1] 80
% of computations 
[1] 81
% of computations 
[1] 82
% of computations 
[1] 83
% of computations 
[1] 84
% of computations 
[1] 85
% of computations 
[1] 86
% of computations 
[1] 87
% of computations 
[1] 88
% of computations 
[1] 89
% of computations 
[1] 90
% of computations 
[1] 91
% of computations 
[1] 92
% of computations 
[1] 93
% of computations 
[1] 94
% of computations 
[1] 95
% of computations 
[1] 96
% of computations 
[1] 97
% of computations 
[1] 98
% of computations 
[1] 99
% of computations 
[1] 100
Computations for model 'geog'
% of computations 
[1] 1
% of computations 
[1] 2
% of computations 
[1] 3
% of computations 
[1] 4
% of computations 
[1] 5
% of computations 
[1] 6
% of computations 
[1] 7
% of computations 
[1] 8
% of computations 
[1] 9
% of computations 
[1] 10
% of computations 
[1] 11
% of computations 
[1] 12
% of computations 
[1] 13
% of computations 
[1] 14
% of computations 
[1] 15
% of computations 
[1] 16
% of computations 
[1] 17
% of computations 
[1] 18
% of computations 
[1] 19
% of computations 
[1] 20
% of computations 
[1] 21
% of computations 
[1] 22
% of computations 
[1] 23
% of computations 
[1] 24
% of computations 
[1] 25
% of computations 
[1] 26
% of computations 
[1] 27
% of computations 
[1] 28
% of computations 
[1] 29
% of computations 
[1] 30
% of computations 
[1] 31
% of computations 
[1] 32
% of computations 
[1] 33
% of computations 
[1] 34
% of computations 
[1] 35
% of computations 
[1] 36
% of computations 
[1] 37
% of computations 
[1] 38
% of computations 
[1] 39
% of computations 
[1] 40
% of computations 
[1] 41
% of computations 
[1] 42
% of computations 
[1] 43
% of computations 
[1] 44
% of computations 
[1] 45
% of computations 
[1] 46
% of computations 
[1] 47
% of computations 
[1] 48
% of computations 
[1] 49
% of computations 
[1] 50
% of computations 
[1] 51
% of computations 
[1] 52
% of computations 
[1] 53
% of computations 
[1] 54
% of computations 
[1] 55
% of computations 
[1] 56
% of computations 
[1] 57
% of computations 
[1] 58
% of computations 
[1] 59
% of computations 
[1] 60
% of computations 
[1] 61
% of computations 
[1] 62
% of computations 
[1] 63
% of computations 
[1] 64
% of computations 
[1] 65
% of computations 
[1] 66
% of computations 
[1] 67
% of computations 
[1] 68
% of computations 
[1] 69
% of computations 
[1] 70
% of computations 
[1] 71
% of computations 
[1] 72
% of computations 
[1] 73
% of computations 
[1] 74
% of computations 
[1] 75
% of computations 
[1] 76
% of computations 
[1] 77
% of computations 
[1] 78
% of computations 
[1] 79
% of computations 
[1] 80
% of computations 
[1] 81
% of computations 
[1] 82
% of computations 
[1] 83
% of computations 
[1] 84
% of computations 
[1] 85
% of computations 
[1] 86
% of computations 
[1] 87
% of computations 
[1] 88
% of computations 
[1] 89
% of computations 
[1] 90
% of computations 
[1] 91
% of computations 
[1] 92
% of computations 
[1] 93
% of computations 
[1] 94
% of computations 
[1] 95
% of computations 
[1] 96
% of computations 
[1] 97
% of computations 
[1] 98
% of computations 
[1] 99
% of computations 
[1] 100
Computations for model 'envi'

% of computations 
[1] 1
% of computations 
[1] 2
% of computations 
[1] 3
% of computations 
[1] 4
% of computations 
[1] 5
% of computations 
[1] 6
% of computations 
[1] 7
% of computations 
[1] 8
% of computations 
[1] 9
% of computations 
[1] 10
% of computations 
[1] 11
% of computations 
[1] 12
% of computations 
[1] 13
% of computations 
[1] 14
% of computations 
[1] 15
% of computations 
[1] 16
% of computations 
[1] 17
% of computations 
[1] 18
% of computations 
[1] 19
% of computations 
[1] 20
% of computations 
[1] 21
% of computations 
[1] 22
% of computations 
[1] 23
% of computations 
[1] 24
% of computations 
[1] 25
% of computations 
[1] 26
% of computations 
[1] 27
% of computations 
[1] 28
% of computations 
[1] 29
% of computations 
[1] 30
% of computations 
[1] 31
% of computations 
[1] 32
% of computations 
[1] 33
% of computations 
[1] 34
% of computations 
[1] 35
% of computations 
[1] 36
% of computations 
[1] 37
% of computations 
[1] 38
% of computations 
[1] 39
% of computations 
[1] 40
% of computations 
[1] 41
% of computations 
[1] 42
% of computations 
[1] 43
% of computations 
[1] 44
% of computations 
[1] 45
% of computations 
[1] 46
% of computations 
[1] 47
% of computations 
[1] 48
% of computations 
[1] 49
% of computations 
[1] 50
% of computations 
[1] 51
% of computations 
[1] 52
% of computations 
[1] 53
% of computations 
[1] 54
% of computations 
[1] 55
% of computations 
[1] 56
% of computations 
[1] 57
% of computations 
[1] 58
% of computations 
[1] 59
% of computations 
[1] 60
% of computations 
[1] 61
% of computations 
[1] 62
% of computations 
[1] 63
% of computations 
[1] 64
% of computations 
[1] 65
% of computations 
[1] 66
% of computations 
[1] 67
% of computations 
[1] 68
% of computations 
[1] 69
% of computations 
[1] 70
% of computations 
[1] 71
% of computations 
[1] 72
% of computations 
[1] 73
% of computations 
[1] 74
% of computations 
[1] 75
% of computations 
[1] 76
% of computations 
[1] 77
% of computations 
[1] 78
% of computations 
[1] 79
% of computations 
[1] 80
% of computations 
[1] 81
% of computations 
[1] 82
% of computations 
[1] 83
% of computations 
[1] 84
% of computations 
[1] 85
% of computations 
[1] 86
% of computations 
[1] 87
% of computations 
[1] 88
% of computations 
[1] 89
% of computations 
[1] 90
% of computations 
[1] 91
% of computations 
[1] 92
% of computations 
[1] 93
% of computations 
[1] 94
% of computations 
[1] 95
% of computations 
[1] 96
% of computations 
[1] 97
% of computations 
[1] 98
% of computations 
[1] 99
% of computations 
[1] 100
Warning message:
In cor(t(gen[, , 1])) : the standard deviation is zero
      G+E         G         E 
-4865.776 -5043.479 -4592.177 
[1] "The model achieving the highest likelihood on the validation set is: E"
[1] "Posterior mean theta under model G+E:"
   alpha   beta_G   beta_E    gamma    delta 
1.000000 2.044502 1.069109 1.000000 0.010000 
[1] "Posterior mean theta under model G:"
  alpha  beta_G  beta_E   gamma   delta 
1.00000 2.08815      NA 1.00000 0.01000 
[1] "Posterior mean theta under model E:"
   alpha   beta_G   beta_E    gamma    delta 
1.000000       NA 1.148272 1.000000 0.010000 

Sunder documentation built on May 2, 2019, 2:17 p.m.