trainLars | R Documentation |
This function calculates the least angle regression (LARS) using possibly overlapping grouped covariates. The model is fit using cross-validation (the cv.grpregOverlap function). Cross-validation is performed across values of alpha, which controls the degree of ridge penalty (alpha ~0, but not equal to 0, imposes the full ridge penalty, and alpha = 1 imposes no ridge penalty). Higher-order terms (e.g., quadratic, 2-way interaction) are constructed and fitted in a manner that respects marginality (i.e., all lower-order terms will have non-zero coefficients if a higher-order term is used).
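To make the marginality idea concrete, here is a minimal base-R sketch of one way higher-order terms could be built and placed in overlapping groups, so that each higher-order term shares a group with the lower-order terms it depends on. The helper makeTermsAndGroups and the column names (the _pow2 and _by_ suffixes) are hypothetical and chosen for illustration only; trainLars may construct and name its terms differently. The list-of-character-vectors shape follows the group argument that grpregOverlap accepts.

```r
# Hypothetical helper (not part of the package): build linear, quadratic, and
# 2-way interaction columns, plus overlapping groups in which each
# higher-order term is grouped with its lower-order terms.
makeTermsAndGroups <- function(x) {
  x <- as.data.frame(x)
  vars <- names(x)
  terms <- x

  # one group per linear term
  groups <- as.list(vars)
  names(groups) <- vars

  # quadratic terms: group = {x, x^2}
  for (v in vars) {
    quadName <- paste0(v, "_pow2")   # illustrative column name
    terms[[quadName]] <- x[[v]]^2
    groups[[quadName]] <- c(v, quadName)
  }

  # 2-way interactions: group = {x1, x2, x1 * x2}
  if (length(vars) > 1) {
    pairs <- utils::combn(vars, 2, simplify = FALSE)
    for (p in pairs) {
      intName <- paste(p, collapse = "_by_")   # illustrative column name
      terms[[intName]] <- x[[p[1]]] * x[[p[2]]]
      groups[[intName]] <- c(p, intName)
    }
  }

  list(terms = as.matrix(terms), groups = groups)
}

set.seed(1)
x <- data.frame(bio1 = rnorm(10), bio12 = rnorm(10))
out <- makeTermsAndGroups(x)
colnames(out$terms)
out$groups$bio1_by_bio12
```

Because the interaction column appears only in a group that also contains both of its main effects, selecting that group brings the main effects in with it, which is the behaviour the description refers to as respecting marginality.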
trainLars(
  data,
  resp = 1,
  preds = 2:ncol(data),
  alphas = c(0.01, seq(0.1, 1, by = 0.1)),
  scale = TRUE,
  quadratic = TRUE,
  cubic = TRUE,
  interaction = TRUE,
  interQuad = TRUE,
  na.rm = FALSE,
  verbose = FALSE,
  ...
)
data |
Data frame. |
resp |
Character or integer. Name or column index of response variable. Default is to use the first column in data. |
preds |
Character list or integer list. Names of columns or column indices of predictors. Default is to use the second and subsequent columns in data. |
alphas |
Numeric or numeric vector in the range |
scale |
Logical. If |
quadratic |
Logical. If |
cubic |
Logical. If TRUE then include cubic terms in model construction stage for non-factor predictors. Cubic columns will be named |
interaction |
Logical. If |
interQuad |
Logical. If TRUE then include all possible interactions of the form |
na.rm |
Logical. If |
verbose |
Logical. If |
... |
Arguments to pass to |
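As a rough illustration of what the values in alphas mean, the grpreg documentation describes the regularization weight as being split between the group penalty (lambda * alpha) and the ridge penalty (lambda * (1 - alpha)). The sketch below tabulates that split for the default alphas grid. The value lambda = 1 and the names penaltySplit, groupWeight, and ridgeWeight are arbitrary choices for illustration and are not part of trainLars.

```r
# Default grid of alpha values from the usage above
alphas <- c(0.01, seq(0.1, 1, by = 0.1))

# Arbitrary overall penalty strength, for illustration only
lambda <- 1

# Share of the penalty going to the group penalty versus the ridge penalty
penaltySplit <- data.frame(
  alpha = alphas,
  groupWeight = lambda * alphas,
  ridgeWeight = lambda * (1 - alphas)
)

print(penaltySplit)
```

The first row (alpha = 0.01) is almost entirely ridge, and the last row (alpha = 1) has no ridge component.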
If scale is TRUE then predictors with zero variance will be removed from the data before the model is trained.
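The reason for dropping such predictors is that scaling divides each column by its standard deviation, which is undefined for a constant column. A minimal base-R sketch of this kind of filtering follows. The helper dropZeroVariance is hypothetical and is not the package's internal code; it only shows the idea.

```r
# Hypothetical helper: keep only columns whose standard deviation is positive
dropZeroVariance <- function(x) {
  x <- as.data.frame(x)
  sds <- vapply(x, stats::sd, numeric(1), na.rm = TRUE)
  keep <- !is.na(sds) & sds > 0
  x[, keep, drop = FALSE]
}

x <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(5, 5, 5, 5),       # constant column: zero variance
  c = c(0.1, 0.4, 0.2, 0.9)
)

xKept <- dropZeroVariance(x)
xScaled <- scale(xKept)    # center and scale the remaining columns

names(xKept)
```

Calling scale() on the unfiltered data would produce NaN for column b, which is why the filter runs first.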
Object of class grpreg and grpregOverlap.
predictLars, grpreg, grpregOverlap, cv.grpregOverlap
## Not run:
### model red-bellied lemurs
data(mad0)
data(lemurs)

# climate data
bios <- c(1, 5, 12, 15)
clim <- raster::getData('worldclim', var='bio', res=10)
clim <- raster::subset(clim, bios)
clim <- raster::crop(clim, mad0)

# occurrence data
occs <- lemurs[lemurs$species == 'Eulemur rubriventer', ]
occsEnv <- raster::extract(clim, occs[ , c('longitude', 'latitude')])

# background sites
bg <- 2000 # too few cells to locate 10000 background points
bgSites <- dismo::randomPoints(clim, bg)
bgEnv <- raster::extract(clim, bgSites)

# collate: response (presBg) in the first column, predictors after it
presBg <- rep(c(1, 0), c(nrow(occs), nrow(bgSites)))
env <- rbind(occsEnv, bgEnv)
env <- cbind(presBg, env)
env <- as.data.frame(env)

preds <- paste0('bio', bios)
al <- c(0.01, 0.5, 1)

# linear terms only
fit1 <- trainLars(data=env, resp='presBg', preds=preds, penalty='cMCP',
  family='binomial', nfolds=3, alphas=al, quadratic=FALSE, cubic=FALSE,
  interaction=FALSE, interQuad=FALSE, verbose=TRUE)

# linear and quadratic terms
fit2 <- trainLars(data=env, resp='presBg', preds=preds, penalty='cMCP',
  family='binomial', nfolds=3, alphas=al, quadratic=TRUE, cubic=FALSE,
  interaction=FALSE, interQuad=FALSE, verbose=TRUE)

# all terms
fit3 <- trainLars(data=env, resp='presBg', preds=preds, penalty='cMCP',
  family='binomial', nfolds=3, alphas=al, quadratic=TRUE, cubic=TRUE,
  interaction=TRUE, interQuad=TRUE, verbose=TRUE)

summary(fit1)
summary(fit2)
summary(fit3)

# predictions using all variables
pred1 <- predictLars(fit1, env, type='response')
pred2 <- predictLars(fit2, env, type='response')
pred3 <- predictLars(fit3, env, type='response')

# partial predictions examining effect of just bio1 (plus any interactions)
pred1bio1 <- predictLars(fit1, env, type='response', preds='bio1')
pred2bio1 <- predictLars(fit2, env, type='response', preds='bio1')
pred3bio1 <- predictLars(fit3, env, type='response', preds='bio1')

par(mfrow=c(1, 2))
plot(env$bio1, pred1bio1, ylim=c(0, 1))
points(env$bio1, pred2bio1, col='blue')
points(env$bio1, pred3bio1, col='red')
legend('topright', pch=1, col=c('black', 'blue', 'red'),
  legend=c('linear-only', 'linear + quadratic', 'all terms'))

# predictions using just bio1 and bio12
pred3bio1_12 <- predictLars(fit3, env, type='response', preds=c('bio1', 'bio12'))
plot(pred3, pred3bio1_12)
abline(0, 1)
## End(Not run)