subsampling: Subsampling

View source: R/subsampling.R

Description

Linear multiple output subsampling using multiple processors

Usage

subsampling(x, y, intercept = TRUE, weights = NULL, grouping = NULL,
  groupWeights = NULL, parameterWeights = NULL, alpha = 1, lambda,
  d = 100, train, test, collapse = FALSE, max.threads = NULL,
  use_parallel = FALSE, algorithm.config = lsgl.standard.config)

Arguments

x

design matrix, matrix of size N × p.

y

response matrix, matrix of size N × K.

intercept

whether the model should include intercept parameters.

weights

sample weights, vector of size N × K.

grouping

grouping of the features, a factor or vector of length p. Each element specifies the group of the corresponding feature.

groupWeights

the group weights, a vector of length m (the number of groups).

parameterWeights

a matrix of size K × p.

alpha

the α value: 0 for group lasso, 1 for lasso; a value between 0 and 1 gives a sparse group lasso penalty.

lambda

lambda.min relative to lambda.max, or the lambda sequence for the regularization path (that is, a vector, or a list of vectors with one lambda sequence per subsample).

d

length of the lambda sequence (ignored if length(lambda) > 1).

train

a list of training samples, each item of the list corresponding to a subsample. Each item in the list must be a vector with the indices of the training samples for the corresponding subsample. The length of the list must equal the length of the test list.

test

a list of test samples, each item of the list corresponding to a subsample. Each item in the list must be a vector with the indices of the test samples for the corresponding subsample. The length of the list must equal the length of the train list.

collapse

if TRUE the results for the subsamples will be collapsed into one result (this is useful if the subsamples are non-overlapping; a construction sketch follows the argument list).

max.threads

Deprecated (will be removed in 2018); use use_parallel = TRUE instead and register a parallel backend (see the 'doParallel' package). The maximal number of threads to be used.

use_parallel

If TRUE the foreach loop will use %dopar%. The user must register the parallel backend.

algorithm.config

the algorithm configuration to be used.
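
As an illustration (not taken from the package itself), the train and test arguments can be built as non-overlapping partitions of the samples, which is the situation where collapse = TRUE is useful. The sketch below assumes N samples, four disjoint folds, and the 'doParallel' package as the parallel backend:

N <- 100                                                 # number of samples (assumed)
folds <- split(sample(1:N), rep(1:4, length.out = N))    # 4 disjoint folds
test  <- folds                                           # each fold is a test set once
train <- lapply(folds, function(idx) setdiff(1:N, idx))  # remaining samples train

library(doParallel)                                      # backend for use_parallel = TRUE
cl <- makeCluster(2)
registerDoParallel(cl)
# fit <- lsgl::subsampling(x, y, lambda = 0.1, train = train, test = test,
#                          collapse = TRUE, use_parallel = TRUE)
stopCluster(cl)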

Value

Yhat

if collapse = FALSE, a list of length length(test) containing the predicted responses for each of the test sets; if collapse = TRUE, a list of length length(lambda) (a comparison sketch follows the value list).

Y.true

a list of length length(test) containing the true responses of the test samples.

features

number of features used in the models.

parameters

number of parameters used in the models.
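
As a reference for the structure of the returned object, the following is a minimal sketch (not part of the package) of how Yhat and Y.true could be compared directly. It assumes collapse = FALSE and that, for each test set i, Yhat[[i]] is a list of prediction matrices, one per lambda value; check str(fit$Yhat) to confirm the exact layout. The helper name subsample_mse is hypothetical:

# Mean squared test error for each lambda index (rows) and subsample (columns)
subsample_mse <- function(fit) {
  sapply(seq_along(fit$Yhat), function(i) {
    sapply(seq_along(fit$Yhat[[i]]), function(j) {
      mean((fit$Yhat[[i]][[j]] - fit$Y.true[[i]])^2)
    })
  })
}
# err <- subsample_mse(fit.sub)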

Author(s)

Martin Vincent

Examples

set.seed(100) # This may be removed, it ensures consistency of the daily tests

## Simulate from Y=XB+E, the dimension of Y is N x K, X is N x p, B is p x K

N <- 100 #number of samples
p <- 50 #number of features
K <- 25  #number of groups

B <- matrix(sample(c(rep(1,p*K*0.1),rep(0, p*K-as.integer(p*K*0.1)))),nrow=p,ncol=K)
X1 <- matrix(rnorm(N*p,1,1),nrow=N,ncol=p)
Y1 <- X1%*%B+matrix(rnorm(N*K,0,1),N,K)

## Do cross subsampling

train <- replicate(2, sample(1:N, 50), simplify = FALSE)
test <- lapply(train, function(idx) (1:N)[-idx])

lambda <- lapply(train, function(idx)
  lsgl::lambda(
    x = X1[idx, ],
    y = Y1[idx, ],
    alpha = 1,
    d = 15L,
    lambda.min = 5,
    intercept = FALSE
  )
)

fit.sub <- lsgl::subsampling(
 x = X1,
 y = Y1,
 alpha = 1,
 lambda = lambda,
 train = train,
 test = test,
 intercept = FALSE
)

Err(fit.sub)

## Do the same cross subsampling using 2 parallel units
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)

# Run subsampling
# Using a lambda sequence ranging from the maximal lambda to 0.1 * maximal lambda
fit.sub <- lsgl::subsampling(
 x = X1,
 y = Y1,
 alpha = 1,
 lambda = 0.1,
 train = train,
 test = test,
 intercept = FALSE,
 use_parallel = TRUE
)

stopCluster(cl)

Err(fit.sub)
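
## A possible follow-up, not part of the original example: after inspecting
## the subsampling error, one would typically refit on the full data set at a
## suitable regularization level. This sketch assumes lsgl::fit accepts the
## same x, y, alpha, lambda and intercept arguments as subsampling.
fit.final <- lsgl::fit(
 x = X1,
 y = Y1,
 alpha = 1,
 lambda = 0.1, # relative lambda.min, as above
 intercept = FALSE
)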
