generateModels: Generate Models
In henkelstone/NPEL.Classification: Classification and data-handling routines for NPEL Caribou Project


This function builds a collection of models from a single input dataset. It handles both classification and regression, that is, categorical or continuous response data.


generateModels(data, modelTypes, fx = NULL, x = NULL, y = NULL,
  grouping = NULL, echo = TRUE, rf.args = NULL, nn.args = NULL,
  gbm.args = NULL)

`data`	the input data frame, see `siteData` for more information and an example dataset.
`modelTypes`	a character vector of model types to generate; one or more of `suppModels`.
`fx`	(optional) a formula object specifying the variable relationships; will be generated from x and y if unspecified.
`x`	(optional) a vector of names of the 'predictor' variables to use; defaults to all columns other than y if fx is also not provided.
`y`	(optional) the name of the column of the 'response' variable; defaults to first column if fx is also not provided. It can be either categorical or continuous data, and it will attempt to coerce vectors of unknown types (e.g. boolean) into one of these two groups, albeit in a rather rudimentary fashion. If it cannot succeed it will complain.
`grouping`	(optional) a transformation vector for input classes; if not provided, no grouping will be used. See `ecoGroup` for more information about this technique.
`echo`	(optional) should the function report its progress? Defaults to TRUE; turning it off is useful for automation.
`rf.args`	(optional) a list of arguments to pass to random forest type models; defaults will be generated for unspecified values.
`nn.args`	(optional) a list of arguments to pass to nearest neighbour type models; defaults will be generated for unspecified values.
`gbm.args`	(optional) a list of arguments to pass to gbm; defaults will be generated for unspecified values.
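
The coercion of the response described under `y` is not spelled out in this help page. As a rough illustration only, the sketch below shows one way such a rule could look in base R; `coerceResponse` is a hypothetical helper, not a function of this package, and the actual rules used by generateModels may differ.

```r
# Hypothetical helper: NOT part of NPEL.Classification.
# Sorts a response vector into 'categorical' (factor) or 'continuous' (numeric).
coerceResponse <- function(y) {
  if (is.factor(y)) return(y)                                # already categorical
  if (is.character(y) || is.logical(y)) return(factor(y))    # treat as classes
  if (is.numeric(y)) return(as.numeric(y))                   # treat as continuous
  # Anything else (lists, functions, ...) cannot be classified: complain
  stop("cannot treat response of class '", class(y)[1],
       "' as categorical or continuous")
}

cat1 <- coerceResponse(c(TRUE, FALSE, TRUE))   # logical becomes a two-level factor
num1 <- coerceResponse(c(1.5, 2.0, 3.25))      # numeric stays continuous
```

Under this assumed rule a boolean response is modelled as a two-class classification problem, and an unsupported type stops with an error rather than being guessed at.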

In the most basic sense, this function is a loop wrapping the code to generate a model. However, it also standardizes the inputs for all the model packages and generates meaningful default arguments for all the supported packages. It is possible to pass the function either a formula object, or a list of x and y names from which to generate the models—it will compute whichever is not specified.
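
How the function converts between the two forms internally is not shown here, but the idea can be sketched in base R. The helpers `xyToFormula` and `formulaToXY` below are hypothetical names used for illustration; they assume a simple additive formula with a single response and no `.` shorthand.

```r
# Hypothetical helpers: NOT part of NPEL.Classification.

# Build a formula from the response name and the predictor names.
xyToFormula <- function(x, y) {
  stats::reformulate(termlabels = x, response = y)
}

# Recover the names from a formula; the first variable is the response.
formulaToXY <- function(fx) {
  vars <- all.vars(fx)
  list(y = vars[1], x = vars[-1])
}

fx <- xyToFormula(x = c("dem", "slp", "asp"), y = "ecoType")
print(fx)                 # ecoType ~ dem + slp + asp
xy <- formulaToXY(fx)     # list(y = "ecoType", x = c("dem", "slp", "asp"))
```

A formula written with `.` on the right-hand side would need the data frame's column names to be expanded first, which this sketch does not attempt.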

The various arguments are the most complex part of this function. Reasonably meaningful default values are generated within the function, but the user always has the option to override them. In most cases it is likely there will be at least a few arguments that will need to be provided. The argument lists are divided up by model type, not package:

Random Forest—currently: randomForest, and randomForestSRC.
- mtry = floor(sqrt(length(x))) the two implementations of random forests both state that they compute the number of variables to try at each node split in the same way, but they arrive at different values internally; that is, given their defaults, they do not generate the same output. Specifying the value here, using the same formula they document as the default, ensures that they are doing the same thing.
- importance = ‘permute’ one of the benefits of random forests is that it is relatively easy to compute a variable importance metric (VIMP). While only randomForestSRC currently allows multiple options for methods, these options can be specified here (including ‘none’), and the corresponding arguments for randomForest will be generated automatically.
- na.action = na.omit what to do when NA values are encountered.
- proximity = FALSE should proximity information be computed; see the package documentation for more help.
Nearest Neighbour—currently: FNN, class, and kknn.
- k = 2 the number of neighbours considered (for FNN and class).
- kernel = ‘rectangular’ the kknn package offers several kernel functions for weighting neighbours by distance; this specifies which to use. More than one kernel can be supplied, in which case kknn will optimize over them all.
- scale = TRUE should we scale the data before running the model fit.
GBM—currently: gbm
- n.trees = 1000 the maximum number of trees to grow. Note that this is not the optimal number of trees! The resulting model is overfit; use gbm.perf to find the optimal number of trees.
- keep.data = TRUE should the data be embedded in the model. The default is TRUE because other methods in this package need the data; it also prevents the data from potentially being stored twice.
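
The pattern described above, where defaults are generated and any user-supplied values override them, can be sketched in base R with `modifyList`. This is an illustration under assumptions, not the package's own code: the variable names `rfDefaults` and `userArgs` are hypothetical, and the predictor names are borrowed from the example dataset.

```r
# Hypothetical sketch of default handling: NOT the package's actual code.
x <- c("brtns", "grnns", "wetns", "dem", "slp", "asp", "hsd")

# Defaults as listed above for random forest type models
rfDefaults <- list(
  mtry       = floor(sqrt(length(x))),  # 7 predictors give floor(2.65) = 2
  importance = "permute",
  na.action  = na.omit,
  proximity  = FALSE
)

# The user overrides only what they care about
userArgs <- list(proximity = TRUE)

# Values in userArgs replace the matching defaults; the rest are kept
rf.args <- modifyList(rfDefaults, userArgs)
str(rf.args)
```

Computing mtry once from the predictor list and passing the same value to every random forest implementation is what keeps the packages comparable.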

A named list of models with attributes specifying the data, the function used, and the class.

See the package help NPEL.Classification for an overview of the analysis process.

For reading-in model data: readTile, readShapePoints, and extractPoints; or the raster package help for reading-in raster files directly.

For examples of computing derived raster variables (e.g. NDVI, slope), see the example code in egTile.

For examples of what to do with the generated models, see modelAccs, writeTile, and plotTile.

Also see any of the supported packages, currently: randomForest, randomForestSRC, FNN, class, kknn, and gbm.

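# Load the example dataset shipped with the package
data('siteData')
# Build every supported model type, predicting ecoType from seven site variables;
# classes are grouped by dominant species and some gbm defaults are overridden
modelRun <- generateModels(data = siteData,
                           modelTypes = suppModels,
                           x = c('brtns','grnns','wetns','dem','slp','asp','hsd'),
                           y = 'ecoType',
                           grouping = ecoGroup[['domSpecies','transform']],
                           gbm.args = list(interaction.depth=7, shrinkage=0.005, cv.folds=0))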

henkelstone/NPEL.Classification documentation built on May 17, 2019, 3:42 p.m.
