generateModels: Generate Models

Description

This function builds a collection of models from a single input dataset. It handles both classification and regression problems; that is, the response variable may be either categorical or continuous.

Usage

generateModels(data, modelTypes, fx = NULL, x = NULL, y = NULL,
  grouping = NULL, echo = TRUE, rf.args = NULL, nn.args = NULL,
  gbm.args = NULL)

Arguments

data

the input data frame; see siteData for more information and an example dataset.

modelTypes

a character vector of model types to generate; one or more of suppModels.

fx

(optional) a formula object specifying the variable relationships; will be generated from x and y if unspecified.

x

(optional) a vector of names of the 'predictor' variables to use; defaults to all columns other than y if fx is also not provided.

y

(optional) the name of the column holding the 'response' variable; defaults to the first column if fx is also not provided. The response can be either categorical or continuous. The function will attempt to coerce vectors of other types (e.g. boolean) into one of these two groups, albeit in a rather rudimentary fashion; if it cannot, it will raise an error.

grouping

(optional) a transformation vector for input classes; if not provided, no grouping will be used. See ecoGroup for more information about this technique.

echo

(optional) should the function report its progress? Defaults to TRUE; setting it to FALSE is useful for automation.

rf.args

(optional) a list of arguments to pass to random forest type models; defaults will be generated for unspecified values.

nn.args

(optional) a list of arguments to pass to nearest neighbour type models; defaults will be generated for unspecified values.

gbm.args

(optional) a list of arguments to pass to gbm; defaults will be generated for unspecified values.

Details

In the most basic sense, this function is a loop wrapping the code to generate a model. However, it also standardizes the inputs for all the model packages and generates meaningful default arguments for all the supported packages. It is possible to pass the function either a formula object (fx) or the x and y variable names from which to generate the models; it will compute whichever is not specified.
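
The relationship between a formula and the x and y names can be illustrated with base R alone. The following is a minimal sketch, not the package's internal code: it assumes the conversion behaves like reformulate() in one direction and terms() in the other, and the variable names xFromFx and yFromFx are hypothetical.

```r
# Predictor and response names, as they would be passed via x and y
x <- c("brtns", "grnns", "wetns")
y <- "ecoType"

# Build a formula from the names (base R; assumed equivalent, not the package's code)
fx <- reformulate(x, response = y)

# Recover the names from a two-sided formula without a '.' term
yFromFx <- all.vars(fx[[2]])               # left-hand side: the response
xFromFx <- attr(terms(fx), "term.labels")  # right-hand side: the predictors

print(fx)
```

Either representation therefore carries the same information, which is why only one of them needs to be supplied.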

The various arguments are the most complex part of this function. Reasonably meaningful default values are generated within the function, but the user always has the option to override them. In most cases it is likely that at least a few arguments will need to be provided. The argument lists are divided up by model type, not package: rf.args for random forest type models, nn.args for nearest neighbour type models, and gbm.args for gbm.
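
The idea of user-supplied values overriding generated defaults can be sketched with utils::modifyList. This is an illustration only: the default values below are hypothetical and are not the ones the package generates, and mergeArgs is a made-up helper name.

```r
# Hypothetical defaults; the values the package actually generates are not shown here
defaultGbmArgs <- list(n.trees = 1000, interaction.depth = 3, shrinkage = 0.01)

# Merge user overrides into the defaults; NULL means "use every default"
mergeArgs <- function(defaults, user = NULL) {
  if (is.null(user)) user <- list()
  utils::modifyList(defaults, user)
}

# The user overrides two values and leaves n.trees at its default
gbmArgs <- mergeArgs(defaultGbmArgs,
                     list(interaction.depth = 7, shrinkage = 0.005))

# Passing nothing keeps all defaults
gbmArgsDefault <- mergeArgs(defaultGbmArgs)

str(gbmArgs)
```

With this pattern only the arguments that differ from the defaults need to be listed in rf.args, nn.args, or gbm.args.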

Value

A named list of models with attributes specifying the data, the function used, and the class.
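
To make the shape of such a return value concrete, here is a small stand-alone sketch that loops over model types and collects the fits in a named list with an attribute attached. It uses lm and glm from base R on the built-in mtcars data as stand-ins for the supported packages; the fitters list and the attribute name "data" are assumptions for illustration, not the package's actual structure.

```r
# Stand-in fitting functions, one per hypothetical model type
fitters <- list(
  lm  = function(fx, data) lm(fx, data = data),
  glm = function(fx, data) glm(fx, data = data, family = gaussian())
)

fx <- mpg ~ wt + hp

# Loop over the model types; sapply(simplify = FALSE) keeps the type names
models <- sapply(names(fitters),
                 function(type) fitters[[type]](fx, mtcars),
                 simplify = FALSE)

# Attach the input data as an attribute (attribute name is an assumption)
attr(models, "data") <- mtcars

names(models)
```

Individual models can then be pulled out by name, e.g. models$lm, while the attributes travel with the list as a whole.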

See Also

See the package help NPEL.Classification for an overview of the analysis process.

For reading-in model data: readTile, readShapePoints, and extractPoints; or the raster package help for reading-in raster files directly.

For examples of computing derived raster variables (e.g. NDVI, slope, etc.), see the example code in egTile.

For examples of what to do with the generated models, see modelAccs, writeTile, and plotTile.

Also see any of the supported packages, currently: randomForest, randomForestSRC, FNN, class, kknn, and gbm.

Examples

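# Load the example dataset shipped with the package
data('siteData')

# Fit every supported model type, using the 'domSpecies' grouping from ecoGroup
modelRun <- generateModels(data = siteData,
                           modelTypes = suppModels,
                           x = c('brtns', 'grnns', 'wetns', 'dem', 'slp', 'asp', 'hsd'),
                           y = 'ecoType',
                           grouping = ecoGroup[['domSpecies', 'transform']],
                           gbm.args = list(interaction.depth = 7, shrinkage = 0.005, cv.folds = 0))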

henkelstone/NPEL.Classification documentation built on May 17, 2019, 3:42 p.m.