sgdm.param: Estimates penalization parameters for SGDM model


Description

This function estimates the penalization parameters for the sparse canonical correlation analysis (SCCA) used within the SGDM model.

The SCCA parameters are chosen by a heuristic grid search: every pair of penalization values (one for the biological dataset, one for the predictor dataset) is tested, and each pair is scored by the performance (RMSE) of the resulting GDM in 5-fold cross-validation.
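The grid search described above can be sketched in base R as follows. This is only an illustration of the loop structure, not the sparsegdm implementation: `toy_cv_rmse` is a hypothetical stand-in for "fit SCCA and GDM, then return the 5-fold cross-validated RMSE", and the layout of the matrix (rows for predictor values, columns for biological values) is an assumption made for this example.

```r
# Hypothetical scoring function: stands in for the SCCA + GDM fit and its
# 5-fold cross-validated RMSE. The error surface is invented for illustration.
toy_cv_rmse <- function(pred_pen, bio_pen, n_folds = 5) {
  fold_rmse <- vapply(seq_len(n_folds), function(fold) {
    # invented per-fold error, smallest near pred_pen = 0.8 and bio_pen = 0.7
    sqrt((pred_pen - 0.8)^2 + (bio_pen - 0.7)^2 + 0.01 * fold)
  }, numeric(1))
  mean(fold_rmse)  # average RMSE over the folds
}

# Candidate penalization values (same as the documented defaults)
pred_penalization <- seq(0.6, 1, 0.1)
bio_penalization  <- seq(0.6, 1, 0.1)

# One cell per (predictor, biological) penalization pair
perf <- matrix(NA_real_,
               nrow = length(pred_penalization),
               ncol = length(bio_penalization),
               dimnames = list(as.character(pred_penalization),
                               as.character(bio_penalization)))

# Exhaustive search: score every pair
for (i in seq_along(pred_penalization)) {
  for (j in seq_along(bio_penalization)) {
    perf[i, j] <- toy_cv_rmse(pred_penalization[i], bio_penalization[j])
  }
}

print(round(perf, 3))
```

With the default vectors this evaluates 5 x 5 = 25 pairs, each cross-validated over 5 folds, so the run time grows with the product of the two vector lengths.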

The function takes a predictor dataset ("predData" format), a biological dataset ("bioData" format), the number of components to extract in the SCCA, the penalization values to test for each dataset, and a flag for optionally using geographical distance as a predictor variable in the GDM.

The current implementation only accepts biological data in format 1 with abundance values, as described in the gdm package.

For more details on the "bioData" and "predData" data formats, see the gdm package documentation.

Usage

sgdm.param(predData, bioData, k = 10, predPenalization = seq(0.6, 1, 0.1),
  bioPenalization = seq(0.6, 1, 0.1), geo = F)

Arguments

predData

Predictor dataset ("predData" format).

bioData

Biological dataset ("bioData" format).

k

Number of sparse canonical components to be extracted. Defaults to 10.

predPenalization

Vector of penalization values for the predictor data to be tested in the grid search (each between 0 and 1). Defaults to (0.6, 0.7, 0.8, 0.9, 1).

bioPenalization

Vector of penalization values for the biological data to be tested in the grid search (each between 0 and 1). Defaults to (0.6, 0.7, 0.8, 0.9, 1).

geo

Whether to use geographical distance as a predictor in the GDM model. Defaults to FALSE.

Value

Returns a performance matrix of RMSE values, one for each tested pair of penalization parameters.
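Since a lower RMSE indicates a better fit, the pair of interest is usually the cell with the smallest value. A minimal sketch of that lookup in base R is shown below. The matrix here is invented for illustration and merely stands in for the returned value; this page does not state which dimension corresponds to which dataset, so inspect the dimnames of the actual result before interpreting rows and columns.

```r
# Hypothetical RMSE matrix standing in for the value returned by sgdm.param;
# the numbers and the three penalization levels are invented for illustration.
perf <- matrix(c(0.30, 0.28, 0.27,
                 0.26, 0.22, 0.25,
                 0.29, 0.24, 0.31),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("0.6", "0.8", "1"),
                               c("0.6", "0.8", "1")))

# Row and column index of the smallest RMSE (first match if there are ties)
best <- which(perf == min(perf), arr.ind = TRUE)

# Penalization values attached to that cell via the dimnames
best_row <- rownames(perf)[best[1, 1]]
best_col <- colnames(perf)[best[1, 2]]

cat("Lowest RMSE:", perf[best[1, 1], best[1, 2]],
    "at row", best_row, "and column", best_col, "\n")
```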


steppebird/sparsegdm documentation built on May 16, 2019, 2:55 a.m.