GAM: R6 Class for GAM modeling of crop responses

GAM    R Documentation

R6 Class for GAM modeling of crop responses

Description

R6 class using Generalized Additive Models (GAM) to fit a crop response model with the experimental variable and remotely sensed covariate data. This class is initialized with a named list of training (named 'trn') and validation (named 'val') datasets, the response variable, the experimental variable, and the means of the centered data.

The initialization creates a data frame ('parm_df') containing the parameter names, the k value to use for the GAM, the mean and standard deviation, and whether each parameter meets the criteria to be omitted from the model, making it a 'bad_parm'. A parameter is flagged if more than 30% of its data for a given year is missing or if its standard deviation is zero, indicating singularity.
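The two omission criteria above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name 'flag_bad_parms', the 'year' column, and the output column names 'parms', 'means', and 'sd' are hypothetical; only 'bad_parms' and the 30% threshold come from the text.

```r
# Hypothetical helper: flag parameters that should be omitted from the model.
# A parameter is flagged when more than 30% of its values are missing in any
# single year, or when its standard deviation is zero.
flag_bad_parms <- function(df, parms, year_col = "year", max_missing = 0.3) {
  out <- data.frame(
    parms = parms,
    means = NA_real_,
    sd = NA_real_,
    bad_parms = FALSE,
    stringsAsFactors = FALSE
  )
  for (i in seq_along(parms)) {
    x <- df[[parms[i]]]
    out$means[i] <- mean(x, na.rm = TRUE)
    out$sd[i] <- sd(x, na.rm = TRUE)
    # Fraction of missing values within each year
    miss_by_year <- tapply(is.na(x), df[[year_col]], mean)
    too_sparse <- any(miss_by_year > max_missing)
    no_variation <- is.na(out$sd[i]) || out$sd[i] == 0
    out$bad_parms[i] <- too_sparse || no_variation
  }
  out
}

# Toy data (made up): 'ndvi' is half missing in 2020, 'slope' is constant.
toy <- data.frame(
  year  = rep(c(2019, 2020), each = 4),
  aa_n  = c(0, 40, 80, 120, 0, 40, 80, 120),
  ndvi  = c(0.5, 0.6, 0.7, 0.8, NA, NA, 0.6, 0.7),
  slope = rep(2, 8)
)
parm_df <- flag_bad_parms(toy, c("aa_n", "ndvi", "slope"))
print(parm_df)
```

In this toy case 'ndvi' and 'slope' are flagged and 'aa_n' is kept.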

The process then creates a formula for a final model with all parameters that are not considered 'bad'. This is used to fit the final model that is returned to the user for use in the simulation to predict the response under varying rates of the experimental variable.

The 'saveDiagnostics' method saves diagnostic plots of the fitted model, including residuals vs. fitted values and normal QQ plots. The fitting process also prepares data for validation plots in the 'ModClass' R6 class. This includes predicting observations in the validation dataset, making a unique ID from the year and field name, uncentering data, and identifying a field name to use for plotting that reflects all fields in the dataset.

Public fields

dat

Named list of training (named 'trn') and validation (named 'val') datasets with the response, experimental, and remotely sensed variables.

respvar

Character, the response variable of interest.

expvar

Character, the experimental variable of interest.

covars

Character vector of covariates to use for training the model.

m

Fitted GAM.

form

Final GAM formula.

parm_df

Data frame of parameter names, starting k for the GAM, and a column named 'bad_parms' to indicate whether to include in the model formula. Also includes columns for the mean and standard deviation of each parameter.

fieldname

Unique name for the field(s) analyzed. If multiple fields are used, their names are separated by an ampersand; otherwise the single field name is used. This is used for plotting.

mod_type

Name of the model of this class, used for plotting.

Methods

Public methods


Method new()

The initialization creates a data frame ('parm_df') containing the parameter names, the k value to use for the GAM, the mean and standard deviation, and whether each parameter meets the criteria to be omitted from the model, making it a 'bad_parm'. A parameter is flagged if more than 30% of its data for a given year is missing or if its standard deviation is zero, indicating singularity.

Usage
GAM$new(dat, respvar, expvar, covars, init_k = -1)
Arguments
dat

Named list of training (named 'trn') and validation (named 'val') datasets with the response, experimental, and remotely sensed variables.

respvar

Character, the response variable of interest.

expvar

Character, the experimental variable of interest.

covars

Character vector of covariates to use for training the model.

init_k

Optional initial 'k' value to use for the GAM. If no value is provided, 'k' is automatically set to 50. 'k' is the dimension of the basis used to represent the smooth term. Multiple k values will be tested, so treat this value as both the upper limit and the starting point.

Returns

An instantiated 'GAM' object.


Method fitMod()

Method for fitting the GAM to response variables using experimental and covariate data.

The fitting begins by taking the data frame ('parm_df') containing the parameter names, the k value to use for the GAM, and the mean and standard deviation, and identifying whether each parameter meets the criteria to be omitted from the model, making it a 'bad_parm'. A parameter is flagged if more than 30% of its data for a given year is missing or if its standard deviation is zero, indicating singularity.

This method also applies a second check for 'bad_parms' while choosing an initial k value to pass to the GAM. It sequentially tests a series of k values; if the model still does not converge once k has been reduced to 1, the parameter is considered a 'bad_parm' because it prevents model convergence.
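The sequential k search described above might look roughly like the following. This is a sketch under assumptions, not the package's code: 'find_k', the candidate k values, and the choice to treat warnings as failed fits are all hypothetical. The mgcv demonstration is guarded so the block still runs where mgcv is not installed.

```r
# Hypothetical helper: try candidate k values in order and keep the first
# one for which the fit succeeds. If none succeed, flag the parameter.
find_k <- function(fit_fun, k_values) {
  for (k in k_values) {
    fit <- tryCatch(
      fit_fun(k),
      error = function(e) NULL,
      warning = function(w) NULL  # assumption: warnings count as failures
    )
    if (!is.null(fit)) {
      return(list(k = k, bad_parm = FALSE))
    }
  }
  list(k = NA_integer_, bad_parm = TRUE)
}

# Demonstration with mgcv on made-up data with only 8 unique x values.
if (requireNamespace("mgcv", quietly = TRUE)) {
  set.seed(1)
  df <- data.frame(x = rep(1:8, 5))
  df$y <- sin(df$x) + rnorm(nrow(df), sd = 0.1)
  res <- find_k(
    function(k) mgcv::gam(y ~ s(x, k = k), data = df),
    k_values = c(50, 20, 10, 5, 3)
  )
  print(res)
}
```

Passing the fitting call in as a function keeps the search logic separate from the model, so the same loop could be reused for each covariate in turn.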

The process then creates a formula for a final model with all parameters that are not considered 'bad'. This is used to fit the final model that is returned to the user for use in the simulation to predict the response under varying rates of the experimental variable.
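Building the final formula from the retained parameters can be sketched as below. The helper 'build_gam_formula' and the 'parms' and 'k' column names are assumptions for illustration; the source only states that 'parm_df' holds the names, k values, and the 'bad_parms' indicator. The use of one 's()' smooth per parameter is likewise an assumption about the formula's shape.

```r
# Hypothetical helper: one smooth term per parameter not flagged as bad.
build_gam_formula <- function(parm_df, respvar) {
  keep <- parm_df[!parm_df$bad_parms, , drop = FALSE]
  if (nrow(keep) == 0) {
    # Nothing usable: fall back to an intercept-only model
    return(stats::as.formula(paste(respvar, "~ 1")))
  }
  terms <- sprintf("s(%s, k = %d)", keep$parms, as.integer(keep$k))
  stats::as.formula(paste(respvar, "~", paste(terms, collapse = " + ")))
}

# Made-up parameter table: 'slope' is flagged and therefore dropped.
parm_df <- data.frame(
  parms = c("aa_n", "ndvi", "slope"),
  k = c(10, 5, NA),
  bad_parms = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
form <- build_gam_formula(parm_df, "yld")
print(form)
```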

Finally, this method prepares the validation data for plotting by using the model to predict the response for each of the observations in the validation dataset, uncentering data if necessary, and identifying a unique field name from the data.
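A rough sketch of this validation preparation follows. Everything named here is hypothetical ('prep_validation_sketch', the 'year' and 'field' columns, the 'pred' and 'year_field' output columns, and passing the means as a named numeric vector); it only illustrates the three steps the text lists.

```r
# Hypothetical helper: attach predictions, build a year/field ID,
# uncenter selected columns, and derive a combined field name.
prep_validation_sketch <- function(val, preds, means = NULL,
                                   year_col = "year", field_col = "field") {
  val$pred <- preds
  # Unique ID built from the year and field name
  val$year_field <- paste(val[[year_col]], val[[field_col]], sep = ".")
  # Uncenter: add each stored mean back to its centered column
  for (v in names(means)) {
    val[[v]] <- val[[v]] + means[[v]]
  }
  # Multiple fields are joined with an ampersand
  fieldname <- paste(unique(val[[field_col]]), collapse = " & ")
  list(val = val, fieldname = fieldname)
}

# Made-up validation data with a centered experimental variable.
val <- data.frame(
  year = c(2019, 2019, 2020),
  field = c("A", "A", "B"),
  aa_n = c(-40, 0, 40),
  stringsAsFactors = FALSE
)
out <- prep_validation_sketch(val, preds = c(50, 60, 65), means = c(aa_n = 60))
print(out$fieldname)
print(out$val)
```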

Usage
GAM$fitMod()
Arguments
None

Parameters provided upon class instantiation.

Returns

A fitted GAM.


Method predResps()

Method for predicting response variables using data and a model.

Usage
GAM$predResps(dat, m)
Arguments
dat

Data for which to predict the response variable.

m

The fitted model to use for predicting the response variable for each observation in 'dat'.

Returns

Vector of predicted values for each location in 'dat'.
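As an illustration of what such a prediction step could look like, the sketch below wraps 'stats::predict()' and applies it to made-up data. The function name 'pred_resps_sketch', the variable names, and the simulated yield curve are assumptions; the fallback to 'lm' exists only so the example runs where mgcv is unavailable.

```r
# Hypothetical stand-in for predResps(): one prediction per row of 'dat'.
pred_resps_sketch <- function(dat, m) {
  as.numeric(stats::predict(m, newdata = dat))
}

# Made-up training data: yield response to an applied rate.
set.seed(42)
trn <- data.frame(aa_n = runif(100, 0, 150))
trn$yld <- 40 + 20 * (1 - exp(-0.03 * trn$aa_n)) + rnorm(100)
val <- data.frame(aa_n = c(0, 50, 100, 150))

if (requireNamespace("mgcv", quietly = TRUE)) {
  m <- mgcv::gam(yld ~ s(aa_n, k = 5), data = trn)
} else {
  m <- stats::lm(yld ~ aa_n, data = trn)  # fallback, not a GAM
}
preds <- pred_resps_sketch(val, m)
print(preds)
```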


Method saveDiagnostics()

Method for saving diagnostic plots of the fitted model. These include residuals vs. fitted values, normal QQ plots, etc.

Usage
GAM$saveDiagnostics(out_path, SAVE)
Arguments
out_path

The path to the folder in which to store and save outputs from the model fitting process.

SAVE

Whether to save diagnostic plots.

Returns

Diagnostic plots.
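The two plots named above can be produced with base graphics for any model that supports 'residuals()' and 'fitted()'. The sketch below is an assumption-laden illustration: the function name, the output file name 'gam_diagnostics.pdf', and the PDF format are hypothetical, and an 'lm' fit on the built-in 'cars' data stands in for a fitted GAM.

```r
# Hypothetical stand-in for saveDiagnostics(): write residuals vs. fitted
# and a normal QQ plot to a single PDF in 'out_path'.
save_diagnostics_sketch <- function(m, out_path, SAVE = TRUE) {
  if (!SAVE) {
    return(invisible(NULL))
  }
  dir.create(out_path, recursive = TRUE, showWarnings = FALSE)
  out_file <- file.path(out_path, "gam_diagnostics.pdf")
  grDevices::pdf(out_file, width = 8, height = 4)
  on.exit(grDevices::dev.off(), add = TRUE)  # close the device on exit
  graphics::par(mfrow = c(1, 2))
  res <- stats::residuals(m)
  fit <- stats::fitted(m)
  graphics::plot(fit, res, xlab = "Fitted values", ylab = "Residuals",
                 main = "Residuals vs. fitted")
  graphics::abline(h = 0, lty = 2)
  stats::qqnorm(res)
  stats::qqline(res)
  invisible(out_file)
}

m <- stats::lm(dist ~ speed, data = cars)  # stand-in model
f <- save_diagnostics_sketch(m, file.path(tempdir(), "diag"))
print(f)
```

For a model fitted with mgcv, 'mgcv::gam.check()' draws a comparable set of residual diagnostics.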


Method clone()

The objects of this class are cloneable with this method.

Usage
GAM$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

ModClass for the class that calls the ModClass interface; NonLinear_Logistic, RF, and BayesLinear for alternative model classes.


paulhegedus/OFPE documentation built on Nov. 23, 2022, 5:09 a.m.