set_cv: Gather settings for the cross-validation procedure used in...

View source: R/cross_validate.R

Description

The cross-validation procedure uses the variational lower bound as its objective function and selects the prior average number of predictors, p0_av, expected to be included in the model. p0_av is then used to set the model hyperparameters and to ensure sparse predictor selection.
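
The selection step can be pictured as an ordinary grid search over candidate p0_av values. The sketch below is only an illustration in base R, not the package's implementation: the grid values and the `score_fold()` function are hypothetical stand-ins, and the variational lower bound that the package evaluates is not reproduced here.

```r
# Illustrative grid search by K-fold cross-validation (base R only).
# `score_fold()` is a hypothetical placeholder for the objective evaluated on
# a held-out fold; it is NOT the variational lower bound used by the package.
set.seed(1)
n <- 150; n_folds <- 3
p0_av_grid <- c(1, 5, 20)   # assumed candidate values, for illustration only

# Randomly assign each observation to one of the folds (near-equal sizes)
fold_id <- sample(rep(seq_len(n_folds), length.out = n))

score_fold <- function(p0_av, test_ind) {
  # Toy objective that peaks at p0_av = 5; a real objective would be
  # computed from the observations indexed by `test_ind`.
  -(log(p0_av) - log(5))^2
}

# Average the objective over the folds for each candidate value
mean_score <- sapply(p0_av_grid, function(p0_av) {
  mean(sapply(seq_len(n_folds), function(k) {
    score_fold(p0_av, test_ind = which(fold_id == k))
  }))
})

# Keep the candidate with the highest average objective
p0_av_selected <- p0_av_grid[which.max(mean_score)]
```

In this toy setting the cost grows with the product of the number of folds and the grid size, which is why both n_folds and size_p0_av_grid are best kept small for large datasets.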

Usage

set_cv(n, p, n_folds, size_p0_av_grid, n_cpus, tol_cv = 0.001,
  maxit_cv = 1000, verbose = TRUE)

Arguments

n

Number of observations.

p

Number of candidate predictors.

n_folds

Number of folds. A large number of folds is not recommended for large datasets, as the procedure may become computationally expensive. Must be greater than 2 and smaller than the number of observations.

size_p0_av_grid

Number of possible values of p0_av to be compared. Large numbers are not recommended for large datasets as the procedure may become computationally expensive.

n_cpus

Number of CPUs to be used for the cross-validation procedure. If large, one should ensure that enough RAM will be available for parallel execution. Set to 1 for serial execution.

tol_cv

Tolerance for the variational algorithm stopping criterion used within the cross-validation procedure.

maxit_cv

Maximum number of iterations allowed for the variational algorithm used within the cross-validation procedure.

verbose

If TRUE, messages are displayed when calling set_cv.
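
As a rough illustration of how n_folds and n_cpus might be chosen before calling set_cv, the sketch below checks the documented constraint on n_folds and caps the number of CPUs by what the machine reports. The helper `choose_cv_args()`, its `max_cpus` argument and the "leave one core free" rule are assumptions made for this example; they are not part of the package.

```r
# Sketch: pick values for n_folds and n_cpus before calling set_cv().
# `choose_cv_args()` is a hypothetical helper, not a package function.
library(parallel)

choose_cv_args <- function(n, n_folds = 3, max_cpus = 2) {
  # Constraint as documented above: a whole number, greater than 2 and
  # smaller than the number of observations
  stopifnot(n_folds == round(n_folds), n_folds > 2, n_folds < n)

  n_cores <- detectCores()            # may return NA on some platforms
  if (is.na(n_cores)) n_cores <- 1L

  # Assumed rule of thumb: leave one core free, never go below one worker
  n_cpus <- as.integer(max(1, min(n_cores - 1, max_cpus)))

  list(n_folds = n_folds, n_cpus = n_cpus)
}

args_cv <- choose_cv_args(n = 150, n_folds = 3)
```

The resulting values could then be supplied as `set_cv(n, p, n_folds = args_cv$n_folds, size_p0_av_grid = 3, n_cpus = args_cv$n_cpus)`. Memory use is not checked here, so the RAM caveat for n_cpus still applies.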

Details

This cross-validation procedure is available only for link = "identity".

Value

An object of class "cv" containing the cross-validation settings in a form that can be passed to the locus function.

See Also

locus

Examples

seed <- 123; set.seed(seed)

###################
## Simulate data ##
###################

## Example using small problem sizes:
##
n <- 150; p <- 200; p0 <- 50; d <- 25; d0 <- 20

## Candidate predictors (subject to selection)
##
# Here we simulate common genetic variants (but any type of candidate
# predictors can be supplied).
# 0 = homozygous, major allele, 1 = heterozygous, 2 = homozygous, minor allele
#
X_act <- matrix(rbinom(n * p0, size = 2, prob = 0.25), nrow = n)
X_inact <- matrix(rbinom(n * (p - p0), size = 2, prob = 0.25), nrow = n)

shuff_x_ind <- sample(p)
X <- cbind(X_act, X_inact)[, shuff_x_ind]

bool_x_act <- shuff_x_ind <= p0

pat_act <- beta <- matrix(0, nrow = p0, ncol = d0)
pat_act[sample(p0*d0, floor(p0*d0/5))] <- 1
beta[as.logical(pat_act)] <-  rnorm(sum(pat_act))

## Gaussian responses
##
Y_act <- matrix(rnorm(n * d0, mean = X_act %*% beta, sd = 0.5), nrow = n)
Y_inact <- matrix(rnorm(n * (d - d0), sd = 0.5), nrow = n)
shuff_y_ind <- sample(d)
Y <- cbind(Y_act, Y_inact)[, shuff_y_ind]

########################
## Infer associations ##
########################

list_cv <- set_cv(n, p, n_folds = 3, size_p0_av_grid = 3, n_cpus = 2)

vb <- locus(Y = Y, X = X, p0_av = NULL, link = "identity", list_cv = list_cv,
            user_seed = seed)

hruffieux/locus documentation built on Oct. 22, 2018, 6:54 a.m.