Fitting Generalized Dissimilarity Models from Tabular Data

Description

gdm.fitfromtable fits a generalized dissimilarity model from a data table of site pairs. The table columns must appear in the following order:

Observed,Weights,X0,Y0,X1,Y1,Pred1SiteA,Pred2SiteA,...,PredNSiteA,Pred1SiteB,Pred2SiteB,...,PredNSiteB

The first column should be a ratio-based dissimilarity measure between Site A and Site B of each pair.

The second column defines the weighting to be applied to each site pair. Set it to 1.0 for every row if no weighting is wanted.

The third and fourth columns, X0 and Y0, hold the coordinates of the first site of a site pair. The fifth and sixth columns, X1 and Y1, hold the coordinates of the second site.

Note that the first six columns MUST be included, even if you do not intend to use geographic distance as a predictor. These columns can be filled with dummy data if the actual coordinates are unknown.

The next N*2 columns hold the N predictors for Site A, followed by the same N predictors, in the same order, for Site B.

The following is an example of a GDM input table header with three predictors:

Response,Weights,X0,Y0,X1,Y1,S1_Alk,S1_Avrain,S1_Bedrock,S2_Alk,S2_Avrain,S2_Bedrock
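The column layout described above can be assembled in base R along the following lines. This is only a sketch: the site data, the dissimilarity values, and the object names (sites, pairs, observed, sitepairs) are made up for illustration and are not part of the package.

```r
# Hypothetical site-level data: coordinates plus two predictors per site.
sites <- data.frame(
  x      = c(0, 3, 6),
  y      = c(0, 4, 8),
  Alk    = c(10, 20, 35),
  Avrain = c(800, 950, 1200)
)

# All unordered site pairs: (1,2), (1,3), (2,3), one pair per row.
pairs <- t(combn(nrow(sites), 2))

# Hypothetical observed dissimilarities, one per site pair.
observed <- c(0.2, 0.7, 0.5)

a <- sites[pairs[, 1], ]  # first site of each pair
b <- sites[pairs[, 2], ]  # second site of each pair

# Response, weights, coordinates, then Site A predictors, then Site B predictors.
sitepairs <- data.frame(
  Response  = observed,
  Weights   = 1.0,              # 1.0 everywhere means no weighting
  X0 = a$x, Y0 = a$y,
  X1 = b$x, Y1 = b$y,
  S1_Alk = a$Alk, S1_Avrain = a$Avrain,
  S2_Alk = b$Alk, S2_Avrain = b$Avrain
)

print(names(sitepairs))
```

With N = 2 predictors the table has 6 + 2*2 = 10 columns, and the Site B predictor columns repeat the Site A predictors in the same order.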

Usage

gdm.fitfromtable(data, geo = FALSE, splines = NULL, quantiles = NULL)

Arguments

data

A data frame holding the model data for a collection of site pairs. The observed response data must be in the first column and the weights in the second column. If geo is TRUE, the X0,Y0 and X1,Y1 columns are used to calculate the geographic distance between the two sites of each site pair, and that distance is included in the GDM as the geographic predictor term. If geo is FALSE (the default), the X0, Y0, X1 and Y1 columns are ignored. The predictor data for Site A, followed by the predictor data for Site B, occupy the remaining columns.

geo

Set to TRUE if geographical distance between sites is to be included as a model term (see the description of the data argument for how to format the input data). Set to FALSE if geographical distance is to be omitted from the model. Default is FALSE.

splines

An optional vector of I-Spline counts to be used in the fitting process. If supplied, it must have the same length as the number of predictors (including geographic distance if in use).

quantiles

An optional vector of quantiles to be used in the fitting process. If quantiles is supplied and splines=NULL, its length must equal the number of predictors * 3. If both quantiles and splines are supplied, the length of quantiles must equal the sum of the values in splines.
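The length rules for splines and quantiles can be checked before fitting with a small helper such as the one below. This is a sketch, not package code: check_gdm_args and demo are hypothetical names, and the helper assumes that geographic distance counts as one extra predictor for both rules when geo is TRUE.

```r
# Hypothetical pre-flight check of the splines/quantiles length rules.
check_gdm_args <- function(data, geo = FALSE, splines = NULL, quantiles = NULL) {
  # Six fixed columns, then N predictors for Site A and N for Site B.
  n_pred <- (ncol(data) - 6) / 2
  if (n_pred < 1 || n_pred != floor(n_pred)) {
    stop("expected 6 fixed columns followed by an even number of predictor columns")
  }
  # Assumption: geographic distance adds one predictor when geo is TRUE.
  if (geo) n_pred <- n_pred + 1

  if (!is.null(splines) && length(splines) != n_pred) {
    stop("splines must have one entry per predictor: expected ", n_pred)
  }
  if (!is.null(quantiles)) {
    expected <- if (is.null(splines)) n_pred * 3 else sum(splines)
    if (length(quantiles) != expected) {
      stop("quantiles has length ", length(quantiles), ", expected ", expected)
    }
  }
  invisible(n_pred)
}

# A dummy table with 12 columns, i.e. 3 predictors per site.
demo <- as.data.frame(matrix(0, nrow = 1, ncol = 12))
n <- check_gdm_args(demo, geo = TRUE, splines = c(3, 3, 3, 4))
print(n)
```

Running a check like this first turns a mismatched vector length into a readable error message before any fitting is attempted.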

Value

gdm.fitfromtable returns a gdm model object. The function summary (i.e., gdm.summaryfromtable) can be used to obtain or print a summary of the results. A gdm model object is a list containing at least the following components:

dataname

the name of the table used as the data argument to the model

geo

whether geographic distance was used in the model

gdmdeviance

the deviance of the gdm model

nulldeviance

the null deviance of the gdm model

explained

the percentage of deviance explained by the model

intercept

the intercept value that is added to the overall model

predictors

a list of the names of the predictors that were used to fit the model. Only predictors that have contributed to the fitted model are included.

coefficients

a list of the coefficients for each spline for all the predictors included in the data argument.

quantiles

a vector of the percentiles derived from the data argument (or user defined), for each predictor.

splines

a vector of I-Spline counts for each predictor

creationdate

the date and time of model creation.

observed

the observed response for each site pair (from the data column 1).

predicted

the predicted response after applying the GDM link function.

ecological

the predicted ecological distance from the GDM model.
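The relationships between some of these components can be illustrated with a few lines of R. This is a sketch under stated assumptions: mod is a hypothetical list with made-up numbers, not the output of a real fit. It assumes that explained follows the usual definition of percentage deviance explained, and that the link function is the standard GDM form, predicted = 1 - exp(-ecological).

```r
# Hypothetical component values shaped like the list described above.
mod <- list(
  gdmdeviance  = 120.5,
  nulldeviance = 200.0,
  ecological   = c(0.1, 0.8, 2.0)
)

# Assumed definition: percentage of the null deviance removed by the model.
explained <- 100 * (mod$nulldeviance - mod$gdmdeviance) / mod$nulldeviance

# Assumed link function: maps ecological distance (>= 0) into [0, 1).
predicted <- 1 - exp(-mod$ecological)

print(explained)
print(round(predicted, 3))
```

Under this link function, predicted dissimilarity rises monotonically with ecological distance and approaches 1 without reaching it.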

Examples

## Not run:
test.mod <- gdm.fitfromtable(test.df, geo=TRUE, splines = c(3,3,4))
## End(Not run)