GAM | R Documentation |
R6 class using Generalized Additive Models (GAM) to fit a crop response model with the experimental variable and remotely sensed covariate data. This class is initialized with a named list of training (named 'trn') and validation (named 'val') datasets, the response variable, the experimental variable, and the means of the centered data.
The initialization creates a data frame ('parm_df') containing the parameter names, the k value to use for the GAM, the mean and standard deviation, and whether each parameter meets the criteria to be omitted from the model, making it a 'bad_parm'. A parameter is omitted if more than 30% of its data are missing for a given year or if its standard deviation is zero, indicating singularity.
The process then creates a formula for a final model with all parameters that are not considered 'bad'. This is used to fit the final model that is returned to the user for use in the simulation to predict the response under varying rates of the experimental variable.
The 'saveDiagnostics' method saves diagnostic plots, including residuals vs. fitted values and normal QQ plots. The fitting process also prepares data for validation plots in the 'ModClass' R6 class. This includes predicting observations in the validation dataset, making a unique id using the year and fieldname, uncentering data, and identifying a field name to use for plotting that reflects all fields in the dataset.
dat
Named list of training (named 'trn') and validation (named 'val') datasets with the response, experimental, and remotely sensed variables.
respvar
Character, the response variable of interest.
expvar
Character, the experimental variable of interest.
covars
Character vector of covariates to use for training the model.
m
Fitted GAM.
form
Final GAM formula.
parm_df
Data frame of parameter names, starting k for the GAM, and a column named 'bad_parms' to indicate whether to include in the model formula. Also includes columns for the mean and standard deviation of each parameter.
fieldname
Unique name for the field(s) analyzed. If multiple fields are used, they are separated by an ampersand; otherwise the single field name is used. This is used for plotting.
mod_type
Name of the model of this class, used for plotting.
new()
The initialization creates a data frame ('parm_df') containing the parameter names, the k value to use for the GAM, the mean and standard deviation, and whether each parameter meets the criteria to be omitted from the model, making it a 'bad_parm'. A parameter is omitted if more than 30% of its data are missing for a given year or if its standard deviation is zero, indicating singularity.
GAM$new(dat, respvar, expvar, covars, init_k = -1)
dat
Named list of training (named 'trn') and validation (named 'val') datasets with the response, experimental, and remotely sensed variables.
respvar
Character, the response variable of interest.
expvar
Character, the experimental variable of interest.
covars
Character vector of covariates to use for training the model.
init_k
Optional initial 'k' value to use for the GAM. If none is supplied, 50 is used. k is the dimension of the basis used to represent the smooth term. Multiple k values are tested, so treat this value as the upper limit and starting point.
An instantiated 'GAM' object.
fitMod()
Method for fitting the GAM to response variables using experimental and covariate data.
The fitting begins by taking the data frame ('parm_df') containing the parameter names, the k value to use for the GAM, and the mean and standard deviation, and identifying whether each parameter meets the criteria to be omitted from the model, making it a 'bad_parm'. A parameter is omitted if more than 30% of its data are missing for a given year or if its standard deviation is zero, indicating singularity.
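The screening rule described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `flag_bad_parms`, the `year` column, the toy data, and the 'parm_df' column names are all hypothetical, and the k column is left out for brevity.

```r
# Hypothetical helper: flag parameters that should be left out of the formula.
# `dat` must contain a 'year' column; `parms` are the column names to screen.
flag_bad_parms <- function(dat, parms, max_missing = 0.30) {
  parm_df <- data.frame(
    parms = parms,
    means = NA_real_,
    sd = NA_real_,
    bad_parms = FALSE,
    stringsAsFactors = FALSE
  )
  for (i in seq_along(parms)) {
    x <- dat[[parms[i]]]
    parm_df$means[i] <- mean(x, na.rm = TRUE)
    parm_df$sd[i] <- sd(x, na.rm = TRUE)
    # Fraction of missing values within each year
    miss_by_year <- tapply(is.na(x), dat$year, mean)
    too_sparse <- any(miss_by_year > max_missing)
    # A zero (or undefined) standard deviation means the column is constant
    no_spread <- is.na(parm_df$sd[i]) || parm_df$sd[i] == 0
    parm_df$bad_parms[i] <- too_sparse || no_spread
  }
  parm_df
}

# Toy data: 'ndvi' is 40% missing in 2018 and 'elev' is constant
toy <- data.frame(
  year = rep(c(2018, 2020), each = 5),
  aa_n = c(0, 20, 40, 60, 80, 10, 30, 50, 70, 90),
  ndvi = c(0.5, NA, NA, 0.6, 0.7, 0.4, 0.5, 0.6, 0.7, 0.8),
  elev = rep(100, 10)
)
parm_df <- flag_bad_parms(toy, c("aa_n", "ndvi", "elev"))
```

In this toy case only 'aa_n' passes both checks, so it is the only column that would be carried into the model formula.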
The method then applies a second check for 'bad_parms' while selecting the k value to pass into the GAM. It sequentially tests a series of k values, and if the model still does not converge after setting k = 1, the parameter is considered a bad parm because it prevents model convergence.
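A sequential search of this kind might look like the following sketch, which uses 'mgcv::gam' and the 'converged' element of the fitted object. The helper name `find_k`, the grid of k values, and the toy data are assumptions made for illustration; the grid stops at 3 here, whereas the class is described as going down to k = 1.

```r
library(mgcv)

# Hypothetical helper: return the first k in `k_seq` for which a
# single-smooth GAM fits and converges, or NA if none does.
find_k <- function(dat, respvar, parm, k_seq = c(50, 25, 10, 5, 3)) {
  for (k in k_seq) {
    form <- as.formula(sprintf("%s ~ s(%s, k = %d)", respvar, parm, k))
    fit <- tryCatch(
      suppressWarnings(gam(form, data = dat)),
      error = function(e) NULL  # e.g. too few unique covariate values for this k
    )
    if (!is.null(fit) && isTRUE(fit$converged)) return(k)
  }
  NA_integer_
}

set.seed(1)
toy <- data.frame(aa_n = runif(200, 0, 150), flat = 1)
toy$yld <- 40 + 0.1 * toy$aa_n + rnorm(200)

k_good <- find_k(toy, "yld", "aa_n")  # a k from the grid
k_flat <- find_k(toy, "yld", "flat")  # NA: a constant column cannot support a smooth
```

A parameter for which the search returns NA would then be marked in 'parm_df' as a bad parm.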
The process then creates a formula for a final model with all parameters that are not considered 'bad'. This is used to fit the final model that is returned to the user for use in the simulation to predict the response under varying rates of the experimental variable.
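As a rough sketch of how such a formula could be assembled from 'parm_df', the example below wraps every retained parameter in a smooth term with its own k. The helper name `build_gam_formula`, the column names, and the response name 'yld' are hypothetical.

```r
# Hypothetical parm_df; the column names are assumptions
parm_df <- data.frame(
  parms = c("aa_n", "ndvi", "elev"),
  k = c(10, 5, 5),
  bad_parms = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Build 'respvar ~ s(p1, k = k1) + s(p2, k = k2) + ...' from the good parameters
build_gam_formula <- function(respvar, parm_df) {
  keep <- parm_df[!parm_df$bad_parms, ]
  smooth_terms <- sprintf("s(%s, k = %d)", keep$parms, keep$k)
  as.formula(paste(respvar, "~", paste(smooth_terms, collapse = " + ")))
}

form <- build_gam_formula("yld", parm_df)
```

Here 'elev' is dropped because it is flagged in 'bad_parms', leaving smooth terms for 'aa_n' and 'ndvi' only.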
Finally, this method prepares the validation data for plotting by using the model to predict the response for each of the observations in the validation dataset, uncentering data if necessary, and identifying a unique field name from the data.
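The bookkeeping steps in this preparation can be illustrated with base R. This is a sketch under assumptions: the column names, the stored mean, the id format, and the " & " separator are hypothetical choices, and the prediction step is omitted.

```r
# Hypothetical validation data with a centered experimental variable
val <- data.frame(
  year = c(2018, 2018, 2020),
  field = c("fieldA", "fieldA", "fieldB"),
  aa_n = c(-20, 0, 20),
  stringsAsFactors = FALSE
)
means <- c(aa_n = 60)  # assumed mean that was subtracted during centering

# Unique id built from the year and field name
val$year.field <- paste(val$year, val$field, sep = ".")

# Undo the centering by adding the mean back
val$aa_n <- val$aa_n + means[["aa_n"]]

# One label covering every field in the data, joined by an ampersand
fieldname <- paste(unique(val$field), collapse = " & ")
```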
GAM$fitMod()
None
Parameters provided upon class instantiation.
A fitted GAM.
predResps()
Method for predicting response variables using data and a model.
GAM$predResps(dat, m)
dat
Data for which to predict the response variable.
m
The fitted model to use for predicting the response variable for each observation in 'dat'.
Vector of predicted values for each location in 'dat'.
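For orientation, predicting from a fitted GAM with 'mgcv' generally takes the form below. This is a minimal sketch with simulated data and hypothetical column names; it shows the underlying 'predict' call rather than the body of 'predResps' itself.

```r
library(mgcv)

set.seed(42)
# Simulated training data: a saturating response to the experimental rate
trn <- data.frame(aa_n = runif(200, 0, 150))
trn$yld <- 40 + 20 * (1 - exp(-0.03 * trn$aa_n)) + rnorm(200, sd = 2)

m <- gam(yld ~ s(aa_n, k = 10), data = trn)

# Predict the response at a grid of experimental rates
new_dat <- data.frame(aa_n = seq(0, 150, by = 25))
pred <- as.numeric(predict(m, newdata = new_dat))
```

The result holds one predicted value per row of 'new_dat', matching the return value described above.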
saveDiagnostics()
Method for saving diagnostic plots of the fitted model. These include residual vs. fitted values, normal QQ plots, etc.
GAM$saveDiagnostics(out_path, SAVE)
out_path
The path to the folder in which to save outputs from the model fitting process.
SAVE
Whether to save diagnostic plots.
Diagnostic plots.
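One way to write such plots to disk is to open a graphics device and call 'mgcv::gam.check', which draws a QQ plot, residuals against the linear predictor, a residual histogram, and response against fitted values. The helper name `save_gam_diagnostics`, the file name, and the use of a PDF device are assumptions for this sketch, not the class's actual behavior.

```r
library(mgcv)

# Hypothetical helper: write the gam.check() plots for model `m` to a PDF
save_gam_diagnostics <- function(m, out_path, SAVE = TRUE) {
  if (!SAVE) return(invisible(NULL))
  out_file <- file.path(out_path, "gam_diagnostics.pdf")
  pdf(out_file, width = 8, height = 8)
  on.exit(dev.off(), add = TRUE)  # close the device even if plotting fails
  gam.check(m)                    # also prints convergence and basis checks
  invisible(out_file)
}

set.seed(7)
dat <- data.frame(x = runif(150))
dat$y <- sin(2 * pi * dat$x) + rnorm(150, sd = 0.3)
m <- gam(y ~ s(x, k = 10), data = dat)

out <- save_gam_diagnostics(m, tempdir())
skipped <- save_gam_diagnostics(m, tempdir(), SAVE = FALSE)
```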
clone()
The objects of this class are cloneable with this method.
GAM$clone(deep = FALSE)
deep
Whether to make a deep clone.
ModClass for the class that calls the ModClass interface, and NonLinear_Logistic, RF, and BayesLinear for alternative model classes.