gradientForest: Create gradientforest objects

gradientForest    R Documentation

Create gradientforest objects

Description

gradientForest uses extendedForest, an extended version of the randomForest package (Liaw and Wiener 2002), which retains all of the split values and fit improvements for each of the response variables (species catches in our case) for further analysis. gradientForest collates the numerous split values along each gradient, together with their associated fit improvements for each species, as retained by extendedForest for each predictor in each tree and each forest. Details on the method are given in Ellis et al. (2012), and applications are described in Pitcher et al. (2012). https://rdrr.io/rforge/gradientForest/man/gradientForest.html

Usage

gradientForest(
  data,
  predictor.vars,
  response.vars,
  ntree = 10,
  mtry = NULL,
  transform = NULL,
  maxLevel = 0,
  corr.threshold = 0.5,
  compact = FALSE,
  nbin = 101,
  trace = FALSE
)

Arguments

data

data.frame in which rows identify sites and columns contain response variables (usually species catch, in numbers or weight) or predictor variables (such as physical or chemical variables). Column names identify the species or the specific predictor variable. If the species are numeric variables, a regression forest is calculated. If the species are factor variables, a classification forest is calculated.

predictor.vars

vector identifying which columns containing predictor variables (e.g., physical variables) are to be used in the randomForest analysis. This vector can contain column names (as character strings) or column numbers.

response.vars

vector identifying which species are to be used in the randomForest analysis. This vector can contain column names (as character strings) or column numbers.
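As a rough illustration of the data, predictor.vars and response.vars arguments, the sketch below builds a small site-by-variable data frame and selects columns by name and by number. The data are simulated and every column name is hypothetical. The final call assumes a package named gradientForest is installed and exports a function with the signature shown under Usage; it is skipped otherwise.

```r
# Hypothetical toy data: 40 sites (rows), 3 predictors and 2 species (columns).
set.seed(1)
n_sites <- 40
depth <- runif(n_sites, 10, 100)
sites <- data.frame(
  depth    = depth,
  temp     = runif(n_sites, 5, 25),
  salinity = runif(n_sites, 30, 36),
  sp_a     = rpois(n_sites, lambda = depth / 10),         # more common in deep water
  sp_b     = rpois(n_sites, lambda = (110 - depth) / 10)  # more common in shallow water
)

# Columns may be given by name or by number.
predictor.vars <- c("depth", "temp", "salinity")
response.vars  <- 4:5

# Resolve the column numbers to names to confirm what was selected.
response.names <- names(sites)[response.vars]
print(response.names)

# Numeric species columns imply a regression forest.
all_numeric <- all(vapply(sites[response.vars], is.numeric, logical(1)))

# Assumption: the 'gradientForest' package is installed; otherwise this is skipped.
if (requireNamespace("gradientForest", quietly = TRUE)) {
  gf <- gradientForest::gradientForest(
    data = sites,
    predictor.vars = predictor.vars,
    response.vars = response.names,
    ntree = 10
  )
}
```

Selecting by name tends to be more robust than selecting by number, since the selection survives reordering of the columns.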

ntree

number of bootstrapped trees to be generated by randomForest. Default set to 10 trees.

mtry

number of predictor variables randomly sampled as candidates at each split. Setting to NULL accepts default values. Note that the default values are different for classification (sqrt(p), where p is the number of predictor variables) and regression (p/3).
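The default rule can be sketched as a small helper. The function name default_mtry is hypothetical, and the rounding (floor, with a minimum of 1 for regression) follows the defaults used by randomForest; that extendedForest applies exactly the same rounding is an assumption.

```r
# Hypothetical helper mirroring randomForest's default mtry rule.
# p: number of predictor variables; classification: TRUE for factor responses.
default_mtry <- function(p, classification = FALSE) {
  if (classification) {
    floor(sqrt(p))        # classification: sqrt(p), rounded down
  } else {
    max(floor(p / 3), 1)  # regression: p/3, rounded down, at least 1
  }
}

mtry_regression     <- default_mtry(12)
mtry_classification <- default_mtry(12, classification = TRUE)
print(c(regression = mtry_regression, classification = mtry_classification))
```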

transform

a function defining a transformation to be applied to the species data. For example, a square-root transformation would be entered as transform=function(x)sqrt(x). Default set to no transformation.
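A transformation is an ordinary R function of one argument. The sketch below defines the square-root example from the text and a log(1 + x) alternative, then applies both to a made-up catch vector. The names sqrt_transform, log_transform and catches are hypothetical.

```r
# Candidate transformations; each takes a numeric vector and returns one
# of the same length.
sqrt_transform <- function(x) sqrt(x)
log_transform  <- function(x) log1p(x)  # log(1 + x), defined at zero catches

# Made-up species catches, including a zero.
catches <- c(0, 1, 4, 9, 100)

sqrt_catches <- sqrt_transform(catches)
log_catches  <- log_transform(catches)
print(sqrt_catches)
print(log_catches)
```

Either function could then be supplied as the transform argument, for example transform = sqrt_transform.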

maxLevel

if maxLevel == 0, compute importance from marginal permutation distribution of each variable (the default). If maxLevel > 0, compute importance from conditional permutation distribution of each variable, permuted within 2^maxLevel partitions of correlated variables.

corr.threshold

if maxLevel > 0, OOB permuting is conditioned on partitions of variables having absolute correlation > corr.threshold.
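To get a feel for how corr.threshold and maxLevel interact, the sketch below works on simulated predictors with hypothetical names. For each predictor it lists the other predictors whose absolute correlation exceeds the threshold, and it computes the number of partitions implied by maxLevel. It illustrates the two quantities only; it is not the permutation procedure itself.

```r
# Simulated predictors: 'light' is built to be strongly correlated with
# 'depth', while 'salinity' is independent of both.
set.seed(42)
n <- 200
depth <- runif(n, 10, 100)
env <- data.frame(
  depth    = depth,
  light    = -0.5 * depth + rnorm(n, sd = 2),
  salinity = runif(n, 30, 36)
)

corr.threshold <- 0.5
maxLevel <- 2

# Absolute pairwise correlations, ignoring each variable's correlation with itself.
abs_cor <- abs(cor(env))
diag(abs_cor) <- 0

# For each predictor, the other predictors it would be conditioned on.
partners <- lapply(
  setNames(names(env), names(env)),
  function(v) names(which(abs_cor[v, ] > corr.threshold))
)
print(partners)

# Number of partitions within which values are permuted when maxLevel > 0.
n_partitions <- 2^maxLevel
print(n_partitions)
```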

compact

logical value selecting the standard method or the compact method for aggregating importance measures across species. Set compact=TRUE when memory problems cause this function to crash. Still experimental.

nbin

number of bins for compact option. Default set to 101.

trace

if TRUE, show progress. Default FALSE.

check.names

if TRUE, ensure that all predictor and response variables are syntactically valid column names in R, and throw an error if they are not. gradientForest should still work with invalid column names, but is potentially more prone to bugs. Default TRUE.
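One way to check names in advance is to compare each name with the output of base R's make.names(), which rewrites any name that is not syntactically valid. The helper is_valid_name and the example names are hypothetical, and whether the function performs its check in exactly this way is an assumption.

```r
# A name is syntactically valid if make.names() leaves it unchanged.
is_valid_name <- function(x) x == make.names(x)

# Hypothetical column names: the last two are invalid (a leading digit, a space).
vars <- c("depth", "sea.temp", "2nd_species", "catch weight")

valid <- is_valid_name(vars)
print(setNames(valid, vars))

# Names that would need fixing, with a valid replacement for each.
fixed <- make.names(vars[!valid])
print(fixed)
```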

Value

A resistance surface


MVan35/resGF_tryout documentation built on May 10, 2022, 12:24 p.m.