cv.pengls: Perform cross-validation for pengls

View source: R/cv.pengls.R

cv.pengls R Documentation

Perform cross-validation for pengls

Description

Perform cross-validation for pengls models to choose the penalty parameter lambda

Usage

cv.pengls(
  data,
  glsSt,
  xNames,
  outVar,
  corMat,
  nfolds,
  foldid,
  scale = FALSE,
  center = FALSE,
  cvType = "blocked",
  lambdas,
  transFun = "identity",
  exclude = NULL,
  transFunArgs = list(),
  loss = c("R2", "MSE"),
  verbose = FALSE,
  ...
)

Arguments

data

A data matrix or data frame

glsSt

a covariance structure, as supplied to nlme::gls as "correlation"

xNames

names of the regressors in data

outVar

name of the outcome variable in data

corMat

a starting value for the correlation matrix. Taken to be a diagonal matrix if missing

nfolds

an integer, the number of folds used in cv.glmnet to find lambda

foldid

An optional vector defining the fold to which each observation belongs

scale, center

booleans, should regressors be centered to mean zero and scaled to variance 1? Both default to FALSE

cvType

A character vector defining the type of cross-validation. Either "random" or "blocked", ignored if foldid is provided

lambdas

an optional lambda sequence

transFun

a transformation function to apply to predictions and outcome in the cross-validation

exclude

indices of predictors to be excluded from intercept + xNames

transFunArgs

Additional arguments passed on to transFun

loss

a character vector, currently either 'R2' or 'MSE', indicating the criterion used to evaluate predictions (strictly speaking R2 is not a loss, since it is maximised rather than minimised)

verbose

a boolean, should output be printed?

...

passed on to glmnet::glmnet
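As a rough illustration of the cvType and foldid arguments, the base R sketch below builds one "random" and one "blocked" fold vector. The blocked construction (contiguous runs of rows sharing a fold) is an assumption about what blocking means here, not the package's internal code, and makeRandomFolds and makeBlockedFolds are hypothetical helper names.

```r
# Hypothetical helpers, not part of pengls
makeRandomFolds <- function(n, nfolds) {
    # Balanced fold labels, assigned to observations in random order
    sample(rep(seq_len(nfolds), length.out = n))
}
makeBlockedFolds <- function(n, nfolds) {
    # Contiguous blocks of (nearly) equal size; assumes the rows are
    # ordered meaningfully, e.g. in time or along a spatial axis
    sort(rep(seq_len(nfolds), length.out = n))
}

set.seed(1)
randomFolds <- makeRandomFolds(n = 20, nfolds = 5)
blockedFolds <- makeBlockedFolds(n = 20, nfolds = 5)
table(randomFolds) # Four observations in each of the five folds
blockedFolds       # 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5
```

Either vector could then be supplied as foldid, in which case cvType is ignored.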

Value

A list with components

lambda

The series of lambdas

cvm

The vector of mean R2 values, one per lambda

cvsd

The standard error of R2 at the maximum

cvOpt

The R2 according to the 1 standard error rule

coefs

The matrix of coefficients for every lambda value

bestFit

The best fitting pengls model according to the 1 standard error rule

lambda.min

Lambda value with maximal R2

lambda.1se

Smallest lambda value within 1 standard error from the maximum

foldid

The folds

glsSt

The nlme correlation object

loss

The loss function used
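The relation between cvm, cvsd, lambda.min and lambda.1se can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: R2 is taken to be 1 - RSS/TSS, the fold-wise R2 values are made up, the standard error is computed across folds, and lambda.1se follows the wording above (smallest lambda within one standard error of the maximum). All object names are hypothetical.

```r
# Two candidate criteria; these definitions are assumptions
mse <- function(obs, pred) mean((obs - pred)^2)
r2 <- function(obs, pred) {
    1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}
obs <- c(1, 2, 3, 4)
r2(obs, rep(mean(obs), 4)) # 0: predicting the mean explains nothing
mse(obs, obs + 1)          # 1: every prediction is off by one

# Made-up fold-wise R2 values: rows are lambdas, columns are folds
lambdas <- c(0.01, 0.1, 0.5, 1)
foldR2 <- rbind(c(0.575, 0.615, 0.595, 0.575, 0.615),
                c(0.58, 0.62, 0.60, 0.58, 0.62),
                c(0.50, 0.54, 0.52, 0.50, 0.54),
                c(0.30, 0.34, 0.32, 0.30, 0.34))
cvm <- rowMeans(foldR2)                            # Mean R2 per lambda
cvsd <- apply(foldR2, 1, sd) / sqrt(ncol(foldR2))  # Standard error per lambda
maxId <- which.max(cvm)
lambdaMin <- lambdas[maxId]                        # Lambda with maximal mean R2
# Lambdas whose mean R2 lies within one standard error of the maximum
within1se <- cvm >= cvm[maxId] - cvsd[maxId]
lambda1se <- min(lambdas[within1se])               # "Smallest lambda", as worded above
```

With loss = "MSE" the same logic would presumably apply with the minimum in place of the maximum.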

Examples

library(nlme)
library(BiocParallel)
n <- 20 #Sample size
p <- 50 #Number of features
g <- 10 #Size of the grid
#Generate grid
Grid <- expand.grid("x" = seq_len(g), "y" = seq_len(g))
# Sample points from grid without replacement
GridSample <- Grid[sample(nrow(Grid), n, replace = FALSE),]
#Generate outcome and regressors
b <- matrix(rnorm(p*n), n , p)
a <- rnorm(n, mean = b %*% rbinom(p, size = 1, prob = 0.2)) #20% signal
#Compile to a data frame
df <- data.frame("a" = a, "b" = b, GridSample)
# Define the correlation structure (see ?nlme::gls), with initial nugget 0.5 and range 5
corStruct <- corGaus(form = ~ x + y, nugget = TRUE,
                     value = c("range" = 5, "nugget" = 0.5))
#Fit the pengls model with cross-validation
register(MulticoreParam(3)) #Prepare multithreading
penglsFitCV <- cv.pengls(data = df, outVar = "a",
    xNames = grep(names(df), pattern = "b", value = TRUE),
    glsSt = corStruct, nfolds = 5)
penglsFitCV$lambda.1se #Lambda for 1 standard error rule
penglsFitCV$cvOpt #Corresponding R2
coef(penglsFitCV)
penglsFitCV$foldid #The folds used
#With MSE as loss function
penglsFitCVmse <- cv.pengls(data = df, outVar = "a",
    xNames = grep(names(df), pattern = "b", value = TRUE),
    glsSt = corStruct, nfolds = 5, loss = "MSE")
penglsFitCVmse$lambda.1se #Lambda for 1 standard error rule
penglsFitCVmse$cvOpt #Corresponding MSE
coef(penglsFitCVmse)
predict(penglsFitCVmse)

sthawinke/pengls documentation built on July 2, 2023, 7:27 a.m.