library(knitr) knitr::opts_chunk$set(fig.align="center")
Load the rgs3
package (available here):
suppressPackageStartupMessages(library(rgs3)) packageVersion("rgs3")
Choose the number of cores to run cross validation in parallel (if available):
suppressPackageStartupMessages(library(parallel)) nb.cores <- max(detectCores() - 1, 1)
Then, read first the vignette intro-rgs3
before this one.
This R chunk is used to assess how much time it takes to execute the R code in this document until the end:
t0 <- proc.time()
This vignette uses the same data as the intro-rgs3
vignette.
Load phenotypes:
phenos.file <- system.file("extdata", "phenos_df.txt.gz", package="rgs3") tools::md5sum(path.expand(phenos.file)) phenos <- read.table(phenos.file, header=TRUE) phenos$year <- as.factor(phenos$year) str(phenos)
Load genotypes:
genos.file <- system.file("extdata", "genos_mat.txt.gz", package="rgs3") tools::md5sum(path.expand(genos.file)) genos <- as.matrix(read.table(genos.file)) dim(genos)
Let us choose an identifier for our analysis, which will be used for all files, so that it will be easy to remove them at the end:
task.id <- "task-execGS3_crossval-rgs3"
Use the utility function:
ptl <- data.frame(position = c(which(colnames(phenos) == "year"), ncol(phenos) + 1), type = c("cross", "add_SNP"), nlevels = c(length(levels(phenos$year)), 0), stringsAsFactors = FALSE) config <- getDefaultConfig( nb.snps = ncol(genos), rec.id = which(colnames(phenos) == "geno"), twc = c(which(colnames(phenos) == "response1"), 0), method = "VCE", ptl = ptl, task.id = task.id)
Customize the rest:
vc <- data.frame(var = c("vara","vard","varg","varp", "vare"), exp = c("0.025", "0", "0", "0", "1"), df = rep("2", 5), stringsAsFactors = FALSE) config$vc <- vc config$niter <- 2000 config$burnin <- 200 config$thin <- 2
Note that the number of iterations here is voluntarily low, so that the vignette doesn't take too long to run.
Let's use 5 folds for this vignette:
results.cv <- crossValWithGs3( genos = genos, dat = phenos, config = config, task.id = task.id, nb.folds = 5, seed = 123, remove.files = "all", nb.cores = nb.cores, verbose = 2)
results.cv apply(results.cv[,c("rmspe","cor.p","cor.p.div.h","reg.intercept","reg.slope")], 2, function(x){ c(mean=mean(x), sd=sd(x)) })
It is also possible to have multiple replicates, using the function crossValRepWithGs3
.
t1 <- proc.time(); t1 - t0 print(sessionInfo(), locale=FALSE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.