selectDVforEV: Select parsimonious sets of derived variables.


View source: R/selectDVforEV.R

Description

For each explanatory variable (EV), selectDVforEV selects the parsimonious set of derived variables (DV) which best explains variation in a given response variable. The function uses a process of forward selection based on comparison of nested models using inference tests. A DV is selected for inclusion when, during nested model comparison, it accounts for a significant amount of remaining variation, under the alpha value specified by the user. See Halvorsen et al. (2015) for a more detailed explanation of the forward selection procedure.
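The forward selection idea described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the MIAmaxent implementation: `forward_select` is a hypothetical helper, it uses ordinary logistic regression rather than the "maxent" algorithm, it picks the candidate with the smallest P-value in each round, and it re-tests all remaining candidates every round (closer to retest = TRUE).

```r
# Hypothetical helper (not part of MIAmaxent): forward selection of the
# columns of `dvs` using Chi-squared tests between nested logistic regressions.
forward_select <- function(y, dvs, alpha = 0.01) {
  dat <- data.frame(y = y, dvs)
  selected <- character(0)
  remaining <- names(dvs)
  while (length(remaining) > 0) {
    # Current model: intercept only, or all DVs selected so far
    base_rhs <- if (length(selected) == 0) "1" else paste(selected, collapse = " + ")
    base <- glm(as.formula(paste("y ~", base_rhs)), data = dat, family = binomial)
    # P-value of the nested model comparison for each remaining candidate
    pvals <- sapply(remaining, function(v) {
      cand <- glm(as.formula(paste("y ~", base_rhs, "+", v)),
                  data = dat, family = binomial)
      anova(base, cand, test = "Chisq")[2, "Pr(>Chi)"]
    })
    # Stop when no candidate explains a significant amount of remaining variation
    if (all(is.na(pvals)) || min(pvals, na.rm = TRUE) >= alpha) break
    best <- names(pvals)[which.min(pvals)]
    selected <- c(selected, best)
    remaining <- setdiff(remaining, best)
  }
  selected
}

# Simulated data (assumption for illustration): only x1 affects the response
set.seed(1)
n <- 200
dvs <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- rbinom(n, 1, plogis(2 * dvs$x1))
forward_select(y, dvs, alpha = 0.01)
```

A lower alpha makes the stopping rule stricter, so fewer variables tend to be selected.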

Usage

selectDVforEV(
  dvdata,
  alpha = 0.01,
  retest = FALSE,
  test = "Chisq",
  algorithm = "maxent",
  write = FALSE,
  dir = NULL,
  quiet = FALSE
)

Arguments

dvdata

List containing first the response variable, followed by data frames of derived variables produced for each explanatory variable (e.g. the first item in the list returned by deriveVars).

alpha

Alpha-level used for inference testing in nested model comparison. Default is 0.01.

retest

Logical. Test variables that do not meet the alpha criterion in a given round in subsequent rounds? Default is FALSE.

test

Character string matching either "Chisq" or "F" to determine which inference test is used in nested model comparison. The Chi-squared test is implemented by stats::anova, while the F-test is implemented as described in Halvorsen (2013, 2015). Default is "Chisq".

algorithm

Character string matching either "maxent" or "LR", which determines the type of model used during forward selection. Default is "maxent".

write

Logical. Write the trail of forward selection for each EV to .csv file? Default is FALSE.

dir

Directory for file writing if write = TRUE. Defaults to the working directory.

quiet

Logical. Suppress progress bar? Default is FALSE.

Details

The F-test available in selectDVforEV is calculated using equation 59 in Halvorsen (2013).

If using binary-type derived variables from deriveVars, be aware that a model including all of these DVs will be considered equal to the closest nested model, due to perfect multicollinearity (i.e. the dummy variable trap).
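The dummy variable trap can be reproduced in base R. The sketch below uses made-up data (the factor `ev` and response `y` are assumptions for illustration) and ordinary logistic regression, so it shows the general phenomenon rather than MIAmaxent's own model fitting.

```r
set.seed(1)
# Hypothetical three-level categorical EV and an arbitrary binary response
ev <- factor(sample(c("a", "b", "c"), 60, replace = TRUE))
y <- rbinom(60, 1, 0.5)

# One 0/1 indicator column per level, with no reference level dropped
dummies <- as.data.frame(model.matrix(~ ev - 1))

# Model with all indicators versus the closest nested model
full <- glm(y ~ eva + evb + evc, data = dummies, family = binomial)
nested <- glm(y ~ eva + evb, data = dummies, family = binomial)

# The last indicator is a linear combination of the intercept and the other
# indicators, so its coefficient cannot be estimated and is reported as NA
coef(full)

# Both models fit the data identically
all.equal(deviance(full), deviance(nested))
```

Because the two fits are identical, a nested model comparison cannot assign any additional explained variation to the final binary DV.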

The maximum entropy algorithm ("maxent"), which is implemented in MIAmaxent as an infinitely-weighted logistic regression with presences added to the background, is conventionally used with presence-only occurrence data. In contrast, standard logistic regression (algorithm = "LR") is conventionally used with presence-absence occurrence data.

Explanatory variables should be uniquely named. Underscores ('_') and colons (':') are reserved to denote derived variables and interaction terms respectively, and selectDVforEV will replace these — along with other special characters — with periods ('.').
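To preview how names with reserved or special characters might be altered, base R's make.names can be used. This is only an approximation: whether selectDVforEV uses make.names internally is an assumption, and the variable names below are hypothetical.

```r
# Hypothetical EV names containing an underscore, a colon, and spaces
ev_names <- c("mean_temp", "soil:ph", "dist to road")

# allow_ = FALSE also converts underscores to periods
clean <- make.names(ev_names, allow_ = FALSE)
clean
# "mean.temp" "soil.ph" "dist.to.road"

# Names must still be unique after replacement
anyDuplicated(clean) == 0
```

Checking for duplicates after cleaning is useful because two distinct original names (for example "soil_ph" and "soil:ph") would collapse to the same cleaned name.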

Value

List of 2:

  1. dvdata: A list containing first the response variable, followed by data frames of selected DVs for each EV. EVs with zero selected DVs are dropped. This item is recommended as input for dvdata in selectEV.

  2. selection: A list of data frames, where each data frame shows the trail of forward selection of DVs for a given EV.

References

Halvorsen, R. (2013). A strict maximum likelihood explanation of MaxEnt, and some implications for distribution modelling. Sommerfeltia, 36, 1-132.

Halvorsen, R., Mazzoni, S., Bryn, A., & Bakkestuen, V. (2015). Opportunities for improved distribution modelling practice via a strict maximum likelihood interpretation of MaxEnt. Ecography, 38(2), 172-183.

Examples

toydata_seldvs <- selectDVforEV(toydata_dvs$dvdata, alpha = 0.4)

## Not run: 
# From vignette:
grasslandDVselect <- selectDVforEV(grasslandDVs$dvdata, alpha = 0.001)
summary(grasslandDVs$dvdata)
sum(sapply(grasslandDVs$dvdata[-1], length))
summary(grasslandDVselect$dvdata)
sum(sapply(grasslandDVselect$dvdata[-1], length))
grasslandDVselect$selection$terdem

## End(Not run)

MIAmaxent documentation built on Dec. 1, 2020, 5:08 p.m.