Module: Maxent

BACKGROUND

Maxent is a machine learning algorithm that estimates the species' response to the environment, constrained to be as close to uniform across the study region as possible given the input data at hand (Phillips et al. 2006, Elith et al. 2011). Maxent characterizes the background environment available in the study region by a random sample drawn from it, and hence is called a presence-background ENM technique. It has been shown to be among the highest-performing niche/distributional modeling techniques for a wide range of environments and species (Elith et al. 2006), including for small sample sizes (Hernandez et al. 2006).

As a machine learning technique, Maxent has the ability to make internal decisions about variable selection and model fit (James et al. 2013); nevertheless, various external decisions can greatly affect model complexity and geographic predictions. Importantly, the Maxent software leveraged here gives users the ability to increase or decrease the potential for model complexity through two key factors: feature classes and the regularization multiplier. First, various feature classes determine the shape of available modeled relationships in environmental space. More (and more complicated) feature classes lead to the potential for higher model complexity. The features offered by Wallace are linear (L), quadratic (Q), hinge (H), and product (P). Second, higher values for the regularization multiplier penalize complexity to a greater degree, and hence tend to lead to simpler models with fewer variables. These settings can hold especially strong influence on model output for Maxent (Warren and Seifert 2011, Radosavljevic & Anderson 2014). For these reasons, evaluating model performance and estimating optimal model complexity constitute important elements of a niche/distributional modeling study with Maxent (e.g., simultaneously varying the feature classes allowed and the regularization multiplers applied to each of them; finer step values for the regularization multiplier can lead to more precise estimation of the optimal regularization setting). See Phillips and Dudík (2008) for technical information, and both Elith et al. (2011) and Merow et al. (2013) for other explanations.

IMPLEMENTATION

This module uses the R packages ENMeval and dismo to build and evaluate Maxent niche/distributional models across a wide range of model settings for feature classes and regularization multipliers (Muscarella et al. 2014).

It automates two workflows: 1) building a suite of candidate models with differing constraints on complexity; and 2) quantifying their performance. Regarding the first, it makes models with various combinations of feature classes and regularization multipliers. The field remains far from any consensus regarding model evaluation and estimation of optimal model complexity (especially for presence-background datasets like those used in Maxent). Nevertheless, the particular evaluation metrics provided here (see Component Model guidance) can aid the user in selecting optimal settings (Radosavljevic & Anderson 2014). Users can download a .csv file of the evaluation statistics table. Users can also view the lambdas file information for each Maxent model (by selecting the model in the dropdown menu). That file shows the parameter name in the first column, the model coefficient (i.e., lambda value) in the second column, and the minimum and maximum values for that parameter in the third and fourth columns, respectively. Each parameter is a feature of one (or two, for product features) of the original variables, and thus more than one row may correspond to different features of the same variable. Note that parameters with lambda values of 0 were not included in the model. See the Maxent help documentation for details. Further, the evaluation results can be viewed graphically in Component Visualize Model Results with Module Maxent Evaluation Plots, and response curves for each variable can be viewed with Module Plot Response Curves.

Making predictions for the full study extent can be complicated by the need to extrapolate into environemental conditions not found in the training dataset. With Maxent, for raster predictions generated by the model, predicted suitability values for environmental conditions more extreme than the training values (i.e., non-analog conditions) can be set to (i.e., 'clamped' to) the suitability values associated with the minimum (at the low end) or maximum (at the high end) value of the variable in the training dataset. This commonly occurs for any projections made to regions and/or time periods different from the training extent (see Component Project), and even can happen for predictions within the training extent when the background sample did not include all pixels of the full study extent. If 'clamping' is not employed, the model's response is applied (unconstrained) to any pixels requiring environmental extrapolation.

NOTE: However, with the maxent.jar option, model predictions are currently always 'clamped' for grid cells with environmental values more extreme than those of the training dataset (background sample plus occurrence records). This is because, due to a bug in the predict() function in dismo, the "clamping" option is always on for model predictions (i.e., clamping cannot be turned off using dismo, to allow an unconstrained extrapolation beyond the environmental condtions of the training dataset).

TROUBLESHOOTING

If you receive this error in the R console:

Warning: Error in rJava::.jarray: java.lang.OutOfMemoryError: Java heap space

Start a new R session to ensure rJava is not loaded, then run the following in the R console, replacing the number "8000" with any arbitrarily high number if "8000" still results in an error. This will allocate more memory to Java and allow it to proceed.

options(java.parameters = "-Xmx8000m")

REFERENCES

Elith J., Graham C.H., Anderson R.P., Dudík M., Ferrier S., Guisan A., Hijmans R.J., Huettmann F., Leathwick J.R., Leahmann A., Li J., Lohmann L.G., Loiselle B.A., Manion G., Moritz C., Nakamura M., Nakazawa Y., Overton J.M., Peterson A.T., Phillips S.J., Richardson K.S., Scachetti-Pereira R., Schapire R.E., Soberón J., Williams S., Wisz M.S., Zimmermann N.E. 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography. 29: 129-151.

Elith, J., Phillips, S.J., Hastie, T., Dudík, M., Chee, Y.E., & Yates, C.J. (2011). A statistical explanation of MaxEnt for ecologists. Diversity and Distributions. 17: 43-57.

Hernandez, P.A., Graham, C. H., Master, L.L., & Albert, D.L. (2006). The effect of sample size and species characteristics on performance of different species distribution modeling methods. Ecography. 29: 773-785.

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning (Vol. 6). New York: Springer.

Merow C., Smith M.J., Silander J.A. (2013). A practical guide to MaxEnt for modeling species' distributions: What it does, and why inputs and settings matter. Ecography. 36: 1058-1069.

Muscarella, R., Galante, P. J., Soley-Guardia, M., Boria, R. A., Kass, J. M., Uriarte, M., & Anderson, R. P. (2014). ENMeval: An R package for conducting spatially independent evaluations and estimating optimal model complexity for Maxent ecological niche models. Methods in Ecology and Evolution. 5: 1198-1205.

Phillips, S.J., Anderson, R.P. & Schapire, R.E. (2006) Maximum entropy modeling of species geographic distributions. Ecological Modelling. 190: 231-259.

Phillips, S.J., & Dudík, M. (2008). Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation. Ecography. 31: 161-175.

Radosavljevic A., Anderson R.P. (2014). Making better Maxent models of species distributions: complexity, overfitting and evaluation. Journal of Biogeography. 41: 629-643.

Warren, D. L., & Seifert, S. N. (2011). Ecological niche modeling in Maxent : the importance of model complexity and the performance of model selection criteria. Ecological Applications. 21: 335-342.



chhetrid/rangemapR documentation built on May 13, 2019, 11:09 a.m.