cv.pengls | R Documentation |
Perform cross-validation for pengls
cv.pengls(
data,
glsSt,
xNames,
outVar,
corMat,
nfolds,
foldid,
scale = FALSE,
center = FALSE,
cvType = "blocked",
lambdas,
transFun = "identity",
exclude = NULL,
transFunArgs = list(),
loss = c("R2", "MSE"),
verbose = FALSE,
...
)
data |
A data matrix or data frame |
glsSt |
a covariance structure, as supplied to nlme::gls as "correlation" |
xNames |
names of the regressors in data |
outVar |
name of the outcome variable in data |
corMat |
a starting value for the correlation matrix. Taken to be a diagonal matrix if missing |
nfolds |
an integer, the number of folds used in cv.glmnet to find lambda |
foldid |
An optional vector assigning each observation to a fold
scale, center |
booleans, should regressors be centered to zero mean and scaled to variance 1? Both default to FALSE
cvType |
A character vector defining the type of cross-validation. Either "random" or "blocked", ignored if foldid is provided |
lambdas |
an optional lambda sequence |
transFun |
a transformation function to apply to predictions and outcome in the cross-validation |
exclude |
indices of predictors to be excluded from intercept + xNames |
transFunArgs |
Additional arguments passed onto transFun |
loss |
a character vector, currently either 'R2' or 'MSE' indicating the loss function (although R2 is not a proper loss...) |
verbose |
a boolean, should output be printed? |
... |
passed onto glmnet::glmnet |
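The difference between the two cvType options, and what a foldid vector looks like, can be illustrated in base R. This is only a sketch of the idea and not the package's internal fold construction: the coordinates are made up, and reading "blocked" as "neighbouring observations share a fold" is an assumption.

```r
# Illustrative fold construction in base R; not the package's internal code.
set.seed(1)
n <- 20       # number of observations
nfolds <- 5   # number of folds
# Hypothetical spatial coordinates, one row per observation
coords <- data.frame(x = sample(10, n, replace = TRUE),
                     y = sample(10, n, replace = TRUE))

# "random"-style folds: shuffle balanced fold labels over the observations
foldRandom <- sample(rep(seq_len(nfolds), length.out = n))

# "blocked"-style folds (assumed reading): neighbouring observations share a fold.
# Observations are sorted by x (ties broken by y) and cut into contiguous blocks.
ord <- order(coords$x, coords$y)
foldBlocked <- integer(n)
foldBlocked[ord] <- rep(seq_len(nfolds), each = ceiling(n / nfolds), length.out = n)

table(foldRandom)   # fold sizes
table(foldBlocked)
```

Either vector could be supplied through foldid, in which case cvType is ignored.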
A list with components
lambda |
The series of lambdas |
cvm |
The vector of mean R2's |
cvsd |
The standard error of R2 at the maximum |
cvOpt |
The R2 according to the 1 standard error rule |
coefs |
The matrix of coefficients for every lambda value |
bestFit |
The best fitting pengls model according to the 1 standard error rule |
lambda.min |
Lambda value with maximal R2 |
lambda.1se |
Smallest lambda value within 1 standard error from the maximum |
foldid |
The folds |
glsSt |
The nlme correlation object |
loss |
The loss function used |
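How lambda, cvm and cvsd relate to lambda.min and the 1 standard error rule can be sketched with toy numbers. All values below are made up for illustration, and the code is a base R sketch rather than the package's implementation.

```r
# Toy cross-validation summary; all numbers are made up for illustration.
lambda <- c(1, 0.5, 0.25, 0.125, 0.0625)   # penalty sequence
cvm    <- c(0.10, 0.32, 0.40, 0.38, 0.30)  # mean R2 for every lambda
cvsd   <- 0.05                             # standard error of R2 at the maximum

idMax <- which.max(cvm)           # position of the maximal R2
lambdaMin <- lambda[idMax]        # analogue of lambda.min
threshold <- cvm[idMax] - cvsd    # one standard error below the maximum
candidates <- lambda[cvm >= threshold]  # lambdas within 1 standard error

lambdaMin
candidates
```

The 1 standard error rule picks its lambda from this candidate set, as described under lambda.1se.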
library(nlme)
library(BiocParallel)
n <- 20 #Sample size
p <- 50 #Number of features
g <- 10 #Size of the grid
#Generate grid
Grid <- expand.grid("x" = seq_len(g), "y" = seq_len(g))
# Sample points from grid without replacement
GridSample <- Grid[sample(nrow(Grid), n, replace = FALSE),]
#Generate outcome and regressors
b <- matrix(rnorm(p*n), n , p)
a <- rnorm(n, mean = b %*% rbinom(p, size = 1, prob = 0.2)) #20% signal
#Compile to a data frame
df <- data.frame("a" = a, "b" = b, GridSample)
# Define the correlation structure (see ?nlme::gls), with initial nugget 0.5 and range 5
corStruct = corGaus(form = ~ x + y, nugget = TRUE,
value = c("range" = 5, "nugget" = 0.5))
#Cross-validate the pengls model over a sequence of lambdas
register(MulticoreParam(3)) #Prepare multithreading
penglsFitCV = cv.pengls(data = df, outVar = "a", xNames = grep(names(df),
pattern = "b", value = TRUE),
glsSt = corStruct, nfolds = 5)
penglsFitCV$lambda.1se #Lambda for 1 standard error rule
penglsFitCV$cvOpt #Corresponding R2
coef(penglsFitCV)
penglsFitCV$foldid #The folds used
#With MSE as loss function
penglsFitCVmse = cv.pengls(data = df, outVar = "a",
xNames = grep(names(df), pattern = "b", value =TRUE),
glsSt = corStruct, nfolds = 5, loss = "MSE")
penglsFitCVmse$lambda.1se #Lambda for 1 standard error rule
penglsFitCVmse$cvOpt #Corresponding MSE
coef(penglsFitCVmse)
predict(penglsFitCVmse)
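The two loss options used above can be written out in base R. These are the usual textbook definitions of R2 and MSE, assumed here for illustration; the exact computation inside cv.pengls may differ, and the toy vectors are made up. The transformation step mimics the role of transFun and transFunArgs, applied to both outcome and predictions before scoring.

```r
# Usual definitions of the two loss options (assumed, not package code).
r2  <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
mse <- function(obs, pred) mean((obs - pred)^2)

obs  <- c(1, 2, 3, 4, 5)           # toy held-out outcomes
pred <- c(1.5, 1.5, 3, 4.5, 4.5)   # toy predictions

# Transformation in the spirit of transFun, applied to both before scoring
transFun <- "identity"
transFunArgs <- list()
obsT  <- do.call(transFun, c(list(obs), transFunArgs))
predT <- do.call(transFun, c(list(pred), transFunArgs))

r2(obsT, predT)    # higher is better
mse(obsT, predT)   # lower is better
```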