Ensemble.Forecasting: Make ensemble forecasts for future projection of species'...
In BIOMOD: Ensemble platform for species distribution modeling

Description Usage Arguments Details Value Author(s) References See Also Examples

The ensemble forecasts approach avoids the selection of one particular technique and is an answer to inter-model variability. Comitee and weighted averages are calculated from the full projections produced by the models of BIOMOD, and means/medians are calculated across the binary projections of the models.

Ensemble.Forecasting(ANN = TRUE, CTA = TRUE, GAM = TRUE, GBM = TRUE, GLM = TRUE, MARS = TRUE, FDA = TRUE,
    RF = TRUE, SRE = TRUE, Proj.name, weight.method, decay = 1.6, PCA.median = TRUE, binary = TRUE,
    bin.method = "Roc", Test = FALSE, repetition.models=TRUE, final.model.out=FALSE, qual.th=0, compress="xz")

Ensemble.Forecasting.raster(ANN = TRUE, CTA = TRUE, GAM = TRUE, GBM = TRUE, GLM = TRUE, MARS = TRUE,
   FDA = TRUE, RF = TRUE, SRE = TRUE, Proj.name, weight.method, decay = 1.6, PCA.median = TRUE,
   binary = TRUE, bin.method = "Roc", Test = FALSE, repetition.models = TRUE, final.model.out = FALSE, qual.th=0, compress="xz")

`ANN`	type TRUE if you want this model to be used for building the ensemble forecasts (setting a model to False is usefull if you don't like or trust a model that has high evaluation scores)
`CTA`	type TRUE if you want this model to be used for building the ensemble forecasts
`GAM`	type TRUE if you want this model to be used for building the ensemble forecasts
`GBM`	type TRUE if you want this model to be used for building the ensemble forecasts
`GLM`	type TRUE if you want this model to be used for building the ensemble forecasts
`MARS`	type TRUE if you want this model to be used for building the ensemble forecasts
`FDA`	type TRUE if you want this model to be used for building the ensemble forecasts
`RF`	type TRUE if you want this model to be used for building the ensemble forecasts
`SRE`	type TRUE if you want this model to be used for building the ensemble forecasts
`Proj.name`	the same name used for rendering the projections with the Projection() function. It will be used for loading the projection result files. It is also the base under which the results will be stored.
`weight.method`	the evaluation method used to rank the models and then weight them according to their evaluation scores
`decay`	will define the relative importance of the weights. A high value will strongly discriminate the 'good' models from the 'bad' ones (see the details section). If the value of this parameter is set to 'proportional' (entered as a string), then the attributed weights are proportional to the chosen 'weight.method'
`PCA.median`	if True, a single model using a standard PCA will be selected to be considered as the best representation of the predictions' variability. See details below.
`binary`	set to True if you also want the results in binary (they are produced in probabilities by default)
`bin.method`	the method to convert the probabilities to binary values (if wanted)
`Test`	set to True if you want the predictive performance of the ensemble method to be evaluated on current data
`repetition.models`	set to True if you want the ensembles to be build with the repetition models taken into account
`final.model.out`	set to True if you want the total ensemble to be build without the final models taken into account
`qual.th`	enables to switch off the models under a certain evaluation score. This will be applied to all models on all species.
`compress`	logical or character string specifying whether saving to a named file is to use compression. FALSE corresponds to no compression, and character strings "gzip", or "xz" specify the type of compression. See ?save for more details. Default is "xz"

The ensemble forecasting function aims at summarising the inter-model variability. There are several ways to perform such a procedure. One might think at representing the ensemble of projections into a multivariate space and then analyse how does the variability stretches out along the first and subsequent axes. PCA.median = TRUE will thus run a Principal Component Analysis (PCA) over projections from the selected models. If the projections are all the same, the first axis will explain most of the variations. In average, the first axis will thus summarise the consensus projections (we will assume that if most projections are correlated, they are the ones most probably right). PCA.median will thus select the model/projection the most correlated to the first axis.

One might also want to credit some weights to the models that performed well in the calibration-evaluation phase. This is done by setting weight.method. For each model/projection, the function will calculate a weight according the evaluation metric selected (ROC, TSS, Kappa). This will be done by using the evaluation scores from either the cross-validation or the independent data (if none has been used, the evaluation score from the full model will be used)

The decay is the ratio between a weight and the following or prior one. The formula is : W = W(-1) * decay. For example, with the default value of 1.6 and 4 weights wanted, the relative importance of the weights will be 1 /1.6/2.56(=1.6*1.6)/4.096(=2.56*1.6) from the weakest to the strongest, and gives 0.11/0.17/0.275/0.445 considering that the sum of the weights is equal to one. The lowest the decay, the smoother the differences between the weights enhancing a weak discrimination between models (please read the BIOMOD manual for some examples).

The value 'proportional' is also possible for the decay: the weights are awarded for each method proportionally to their evaluation scores. The advantage is that the discrimination is more fair than with the decay. In the latter case, close scores can strongly diverge in the weights they are awarded, when the proportional method will consider them as being fairly similar in prediction quality and award them a similar weight.

Another alternative is to give no weight to any model (committee averaging). The function will transform all probability projections from all selected models into binary presence-absence data (using the appropriate threshold calculated during the Models phase using the evaluation metric selected). Then, all binary projections will be simply summed and rescaled between 0 and 1000. This is an interesting format because it gives both the average and variability. Indeed, the values for each pixel/site will not be the "probability of presence" but the "agreement between models to predict presence". A value close to 0 will mean that all models/projections agree to predict absence. A value close to 1000 will mean that all models/projections agree to predict presence. Intermediate value thus gives also the uncertainty. A value of 500 means that half the models predicts presence while the other half predicts an absence. There is no re-estimated threshold for this measure. We only considered the majority as a fair measure to transform the committee averaging into presence-absence. Thus the thresholds are automatically set to 500.

Compression : See help for the save function for more detailed explanation about compression options. Only gzip and xz are available here.

Some objects are stored directly on the hard disk in the same directory as for the projections.

A list is returned, named after the Proj.name argument (see examples). It contains the details of the ensemble forecast process for every species modelled and for every run. For each species, the information is as follows :

`stats`	contains a 2-columns matrix : if Test=TRUE, the first one gives the assumed predictive performance for each consensus method. These scores are obtained by applying the same ensemble computation on the current predictions as on the future forecasts, and compared with the data input for that species (using the Roc evaluation method). If binary=TRUE, the second column gives the threshold obtained by each method for converting the probabilities into presence-absence predictions (see the manual for more details).
`weights`	contains the weights awarded to each model selected for the ensemble forecasts (only concerns the weighted.mean).
`PCA.median`	defines the model selected by the median PCA consensus approach. Note that nothing else is produced in regards to this method.

Wilfried Thuiller, Bruno Lafourcade

Araujo, M. B. & New, M. 2007 Ensemble forecasting of species distributions. Trends in Ecology and Evolution 22, 42-47. Araujo, M. B., Whittaker, R. J., Ladle, R. & Erhard, M. 2005 Reducing uncertainty in projections of extinction risk from climate change. Global Ecology and Biogeography 14, 529-538. Marmion, M., Parviainen, M., Luoto, M., Heikkinen, R. K. & Thuiller, W. 2009 Evaluation of consensus methods in predictive species distribution modelling. Diversity and Distributions 15, 59-69. Thuiller, W. 2004 Patterns and uncertainties of species' range shifts under climate change. Global Change Biology 10, 2020-2027.

Models, Projection, ProjectionBestModel

## Not run: 
example(Projection)

#Run the ensemble forecasting function on this scenario
Ensemble.Forecasting(Proj.name= "Future1", weight.method='Roc', PCA.median=TRUE, binary=TRUE, bin.method='Roc', Test=TRUE, repetition.models=TRUE)

#The list returned contains some of the results (see the value section for details)
load("proj.Future1/consensus_Sp277_Future1")

#this next one contains the consensus projections accross all runs for each species and consensus method
load("proj.Future1/Total_consensus_Future1")
dim(Total_consensus_Future1)
dimnames(Total_consensus_Future1)[-1]

#lets plot all these consensus results and compare them to the original data used for calibrating the models
data(CoorXY)
data <- Total_consensus_Future1

par(mfrow=c(2,7))
par(mar=c(0.6,0.6,2,0.6))
level.plot(DataBIOMOD[,8], CoorXY, show.scale=FALSE, title='sp277_original_data', cex=0.5)
level.plot(data[,1,1], CoorXY, show.scale=FALSE, title='sp277_mean', cex=0.5)
level.plot(data[,1,2], CoorXY, show.scale=FALSE, title='sp277_weighted_mean', cex=0.5)
level.plot(data[,1,3], CoorXY, show.scale=FALSE, title='sp277_median', cex=0.5)
level.plot(data[,1,4], CoorXY, show.scale=FALSE, title='sp277_Roc_mean', cex=0.5)
level.plot(data[,1,5], CoorXY, show.scale=FALSE, title='sp277_Kappa_mean', cex=0.5)
level.plot(data[,1,6], CoorXY, show.scale=FALSE, title='sp277_Roc_mean', cex=0.5)

level.plot(DataBIOMOD[,9], CoorXY, show.scale=FALSE, title='sp164_original_data', cex=0.5)
level.plot(data[,2,1], CoorXY, show.scale=FALSE, title='sp164_mean', cex=0.5)
level.plot(data[,2,2], CoorXY, show.scale=FALSE, title='sp164_weighted_mean', cex=0.5)
level.plot(data[,2,3], CoorXY, show.scale=FALSE, title='sp164_median', cex=0.5)
level.plot(data[,2,4], CoorXY, show.scale=FALSE, title='sp164_Roc_mean', cex=0.5)
level.plot(data[,2,5], CoorXY, show.scale=FALSE, title='sp164_Kappa_mean', cex=0.5)
level.plot(data[,2,6], CoorXY, show.scale=FALSE, title='sp164_TSS_mean', cex=0.5)


## End(Not run)