addSpatialModelOnIC.asrtests: Uses information criteria to decide whether to add a spatial...

View source: R/spatial.funcs.v7.r

addSpatialModelOnIC.asrtestsR Documentation

Uses information criteria to decide whether to add a spatial model to account for local spatial variation.

Description

Adds either a correlation, two-dimensional tensor-product natural cubic smoothing spline (TPNCSS), or a two-dimensional tensor-product penalized P-spline model (TPPS) to account for the the local spatial variation exhibited by a response variable measured on a potentially irregular grid of rows and columns of the units. The data may be arranged in sections for each of which there is a grid and for which the model is to be fitted separately. Also, the rows and columns of a grid are not necessarily one observational unit wide. The spatial model is only added if the information criterion of the supplied model is decreased with the addition of the local spatial model. For TPPS models for which the order of differencing the penalty matrix is two, the improvement in the fit from rotating the eigenvectors of the penalty matrix can be investigated; if there is no improvement, the unrotated fit will be returned.

A row is added for each section to the test.summary data.frame of the asrtests.object stating whether or not the new model has been swapped for a model in which the spatial model has been add to the supplied model. Convergence and the occurrence of fixed correlations in fitting the model is checked and a note included in the action if there was not. All components of the asrtests.object are updated to exhibit the differences between the supplied and the new model, if a spatial model is added.

Usage

## S3 method for class 'asrtests'
addSpatialModelOnIC(asrtests.obj, spatial.model = "TPPS", 
                    sections = NULL, 
                    row.covar = "cRow", col.covar = "cCol", 
                    row.factor = "Row", col.factor = "Col", 
                    corr.funcs = c("ar1", "ar1"), 
                    row.corrFitfirst = TRUE, allow.corrsJointFit = TRUE, 
                    dropRowterm = NULL, dropColterm = NULL, 
                    nsegs = NULL, nestorder = c(1,1), 
                    degree = c(3,3), difforder = c(2,2), 
                    usRandLinCoeffs = TRUE, 
                    rotateX = FALSE, ngridangles = c(18, 18), 
                    which.rotacriterion = "AIC", nrotacores = 1, 
                    asreml.option = "mbf", tpps4mbf.obj = NULL, 
                    allow.unconverged = FALSE, allow.fixedcorrelation = FALSE,
                    checkboundaryonly = FALSE, update = FALSE, 
                    maxit = 30, IClikelihood = "full", which.IC = "AIC", ...)

Arguments

asrtests.obj

An asrtests.object containing the components (i) asreml.obj, (ii) wald.tab, and (iii) test.summary.

spatial.model

A single character string nominating the type of spatial model to fit. Possible values are corr, TPNCSS and TPPS.

sections

A single character string that specifies the name of the column in the data.frame that contains the factor that identifies different sections of the data to which separate spatial models are to be fitted. Note that, for other terms that involve sections in the random formula, there should be separate terms for each level of sections. For example, in a blocked experiment involving multiple sites, there should be a sum of separate term for the Blocks at each Site i.e. a formula that contains terms like at(Site, i):Block for each site and these are separated by '+'. Otherwise, the combined term (e.g. Site:Block) will impact on the fitting of the local spatial models for the different Sites. Similarly, if considerations other than the fitting of spatial models do not require a residual variance structure e.g. heterogeneous residual variances depending on treatments, there should be separate units terms for each of the sections. This will allow the fitting of a nugget variance for each of the sections. To achieve, this use the sum of dsum terms for each of the sections, each term being of the form dsum(~ units | Section, levels = list(i)) where i is a level of Section. It is important to specify separate terms, rather than compound terms obtained by specifying multiple levels in the list or leaving levels out altogether. Separate terms are needed so that terms for individual sections can be manipulated independently.

row.covar

A single character string nominating a numeric that contains the values of a centred covariate indexing the rows of a grid. The numeric must be a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

col.covar

A single character string nominating a numeric that contains the values of a centred covariate indexing the columns of a grid. The numeric must be a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

row.factor

A single character string nominating a factor that indexes the rows of a grid that are to be one dimension of a spatial correlation model. The factor must a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

col.factor

A single character string nominating a factor that indexes the columns of a grid that are to be one dimension of a spatial correlation model. The factor must a column in the data.frame stored in the asreml.obj that is a component of the supplied asrtests.obj.

corr.funcs

A single character string of length two that specifies the asreml one-dimensional correlation or variance model function for the row and column dimensions of a two-dimensional separable spatial correlation model to be fitted when spatial.model is corr; the two-dimensional model is fitted as a random term. If a correlation or variance model is not to be investigated for one of the dimensions, specify "" for that dimension.

row.corrFitfirst

A logical. If TRUE then, in fitting the model for spatial.model set to corr, the row correlation or variance function is fitted first, followed by the addition of the column correlation or variance function. If FALSE, the order of fitting is reversed.

allow.corrsJointFit

A logical which, if TRUE, will allow the simultaneous fitting of correlation functions for the two dimensions of the grid when separate fits have failed to fit any correlation functions. This argument is available for when a joint fit hangs the system.

dropRowterm

A single character string nominating a factor in the data.frame that has as many levels as there are unique values in row.covar. This argument is applicable for spatial.model set to TPNCSS or TPPS. It is used to remove a term corresponding to the dropRowterm and a random row deviations term based on row.covar will be included in the model. If the argument is NULL, it is assumed that such a term is not included in the fitted model stored in asrtests.obj.

dropColterm

A single character string nominating a factor in the data.frame that has as many levels as there are unique values in col.covar. This argument is applicable for spatial.model set to TPNCSS or TPPS. It is used to remove a term corresponding to the dropColterm and a random column deviations term based on col.covar will be included in the model. If the argument is NULL, it is assumed that such a term is not included in the fitted model stored in asrtests.obj.

nsegs

A pair of numeric values giving the number of segments into which the column and row ranges are to be split, respectively, for fitting a P-spline model (TPPS) (each value specifies the number of internal knots + 1). If not specified, then (number of unique values - 1) is used in each dimension; for a grid layout with equal spacing, this gives a knot at each data value. If sections is not NULL and the grid differs between the sections, then nsegs will differ between the sections.

nestorder

A numeric of length 2. The order of nesting for column and row dimensions, respectively, in fitting a P-spline model (TPPS). A value of 1 specifies no nesting, a value of 2 generates a spline with half the number of segments in that dimension, etc. The number of segments in each direction must be a multiple of the order of nesting.

degree

A numeric of length 2. The degree of polynomial spline to be used for column and row dimensions respectively, in fitting a P-spline (TPSS).

difforder

A numeric of length 2. The order of differencing for column and row dimensions, respectively, in fitting a P-spline (TPSS).

usRandLinCoeffs

A logical which, if TRUE, will attempt to fit an unstructured variance model to the constant and linear terms in the interactions for constant and linear terms in one grid dimension interacting with smoooth terms in the second grid dimension. The unstructured variance model can only be fitted if both the constant and linear interaction terms have been retained in the fitted model. This argument can be used to omit the attempt to fit an unstructured variance model when the attempt results in a stystem error.

rotateX

A logical indicating whether to rotate the eigenvectors of the penalty matrix, as described by Piepho, Boer and Williams (2022), when fitting a P-spline (TPSS). Setting rotateX to TRUE results in an optimized rotation produced by exploring a two-dimensional grid of rotation angle pairs, and for each pair analyzing the data under a model that omits the random interaction terms. The angle pair with the minimum deviance is used to apply an optimized rotation. Rotation of the eigenvectors is only relevant for difforder values greater than 1 and has only been implemented for difforder equal to 2.

ngridangles

A numeric of length 2. The numbers of angles between 0 and 90 degrees for each of the row and column dimensions to be used in determining the optimal pair of angles for rotating the eignevectors of the penalty matrix of a P-spline (TPSS). Specifying factors of 90 will result in integer-valued angles. The number of grid points, and hence re-analyses will be the product of the values of (ngridangles + 1).

which.rotacriterion

A single character string nominating which of the criteria out of the deviance, the likelihood, the AIC and the BIC in determining the optimal rotation of the eigenvectors of the penalty matrix. The deviance uses the REML value computed by asreml; the other criteria use the full likelihood, evaluated using the REML estimates, that is computed by infoCriteria.asreml.

nrotacores

A numeric specifying the number of cores to deploy for running the analyses required to search the two-diemsional grid of rotation angles when rotateX is TRUE. Parallel processing has been implemented for analyzing, for each column angle, the set of angles to be investigated for the row dimension. The default value of one means that parallel processing will not be used. The value chosen for nrotacores needs to balanced against the other processes that are using parallel processing at the same time.

asreml.option

A single character string specifying whether the grp or mbf methods are to be used to supply externally formed covariate matrices to asreml when fitting a P-spline (TPSS). Compared to the mbf method, the grp method is somewhat faster, but creates large asrtests.objects for which the time it takes to save them can exceed any gains in execution speed. The grp method adds columns to the data.frame containing the data. On the other hand, the mbf method adds only the fixed covariates to data and stores the random covariates in the environment of the internal function that calls the spline-fitting function; there are three smaller data.frames for each section that are not stored in the asreml.object resulting from the fitted model.

tpps4mbf.obj

An object made with makeTPPSplineMats.data.frame that contains the spline basis information for fitting P-splines. The argument tpps4mbf.obj only needs to be set when the mbf option of asreml.option is being used and it is desired to use mbf data.frames that have been created and stored prior to calling
addSpatialModelOnIC.asrtests. If tpps4mbf.obj is NULL, makeTPPSplineMats.data.frame will be called internally to produce the required mbf data.frames.

allow.unconverged

A logical indicating whether to accept a new model even when it does not converge. If FALSE and the fit of the new model does not converge, the supplied asrtests.obj is returned. Also, if FALSE and the fit of the new model has converged, but that of the old model has not, the new model will be accepted.

allow.fixedcorrelation

A logical indicating whether to accept a new model even when it contains correlations in the model whose values have been designated as fixed, bound or singular. If FALSE and the new model contains correlations whose values have not been able to be estimated, the supplied asrtests.obj is returned. The fit in the asreml.obj component of the supplied asrtests.obj will also be tested and a warning issued if both fixed correlations are found in it and allow.fixedcorrelation is FALSE.

checkboundaryonly

If TRUE then boundary and singular terms are not removed by rmboundary.asrtests; a warning is issued instead. Note that, for correlation models, the fitting of each dimension and the test for a nugget term are performed with checkboundaryonly set to TRUE and its supplied setting only honoured using a call to rmboundary.asrtests immediately prior to returning the final result of the fitting.

update

If TRUE, then newfit.asreml is called to fit the model to be tested, using the values of the variance parameters stored in the asreml.object, that is stored in asrtests.obj, as starting values. If FALSE, then newfit.asreml will not use the stored variance parameter values as starting values when fitting the new model, the only modifications being (i) to add the terms for the spatial models and (ii) those specified via ....

which.IC

A character specifying the information criterion to be used in selecting the best model. Possible values are AIC and BIC. The value of the criterion for supplied model must exceed that for changed model for the changed model to be returned. (For choosing the rotation angle of the eigenvectors of the penalty matrix, see which.rotacriterion.

maxit

A numeric specifying the maximum number of iterations that asreml should perform in fitting a model.

IClikelihood

A character specifying whether Restricted Maximum Likelihood (REML) or the full likelihood, evaluated using REML estimates, (full) are to be used in calculating the information criteria to be included in the test.summary of an asrtests.object or to be used in choosing the best model.

...

Further arguments passed to changeModelOnIC.asrtests, asreml and tpsmmb.

Details

A fitted spatial model is only returned if it improves the fit over and above that of achieved with the model fit supplied in the asrtests.obj. To fit the spatial model without any hypothoses testing or comparison of informtion criteria use addSpatialModel.asrtests. The model fit supplied in the asrtests.obj should not include terms that will be included in the local spatial model. All spatial model terms are fitted as fixed or random. Consequently, the residual model does not have to be iid. Note that the data must be in the order that correponds to the residual argument with a variable to the right of another variable changes levels in the data frame faster than those of the other variable e.g. Row:Column implies that all levels for Column in consecutive rows of the data.frame with a single Row level.

For the corr spatial model, the default model is an autocorrelation model of order one (ar1) for each dimension. However, any of the single dimension correlation/variance models from asreml can be specified for each dimension, as can no correlation model for a dimension; the models for the two dimensions can differ. Using a forward selection procedure, a series of models are tried, without removing boundary or singular terms, beginning with the addition of row correlation and followed by the addition of column correlation or, if the row.corrFitfirst is set to FALSE, the reverse order. If the fitting of the first-fitted correlation did not result in a model change because the fitting did not converge or correlations were fixed, but the fit of the second correlation was successful, then adding the first correlation will be retried. If one of the metric correlation functions is specified (e.g. exp), then the row.covar or col.covar will be used in the spatial model. However, because the correlations are fitted separately for the two dimensions, the row.factor and col.factor are needed for all models and is used for a dimension that does not involve a correlation/variance function for the fit being performed. Also, the correlation models are fitted as random terms and so the correlation model will include a variance parameter for the grid even when ar1 is used to specify the correlation model, i.e. the model fitted is a variance model and there is no difference between ar1 and ar1v in fitting the model. The variance parameter for this term represents the spatial variance and the fit necessarily includes a nugget term, this being the residual variance. If any correlation is retained in the model, for a section if sections is not NULL, then the need for a nuggest term is assessed by fixing the corresponding residual variance to one, unless there are multiple residual variances and these are not related to the sections. Once the fitting of the correlation model has been completed, the rmboundary function will be executed with the checkboundaryonly value supplied in the addSpatialModelOnIC.asrtests call. Finally, checking for bound and singular random terms associated with the correlation model and residual terms will be carried out when there are correlation terms in the model and checkboundaryonly has been set to FALSE; as many as possible will be removed from the fitted model, in some cases by fixing variance terms to one.

The tensor-product natural-cubic-smoothing-spline (TPNCSS) spatial models are as decribed by Verbyla et al. (2018). The tensor-product penalized-spline (TPPS) models are as described by (Piepho, Boer and Williams, 2022). For the TPPS model, the degree of the polynomial and the order of differencing can be varied. The defaults of 3 and 2, respectively, fit a cubic spline with second order differencing, which is similar to those of Rodriguez-Alvarez et al. (2018). Setting both the degree and order of differencing to 1 will fit a type of linear variance model. The fixed terms for the spline models are row.covar + col.covar + row.covar:col.covar and the random terms are spl(row.covar) + spl(col.covar) + dev(row.covar) + dev(col.covar) + spl(row.covar):col.covar +
row.covar:spl(col.covar) + spl(row.covar):spl(col.covar), except that spl(row.covar) + spl(col.covar) is not included in TPPS models. The supplied model should not include any of these terms. However, any fixed or random main-effect Row or Column term that has been included as an initial model for comparison with a spatial model can be removed prior to fitting the spatial model using dropRowterm or dropColterm. For TPPS with second-order differencing, the model matrices used to fit the random terms spl(row.covar):col.covar and row.covar:spl(col.covar) are transformed using the spectral decomposition of their penalty matrices, and unstructured variance models across the columns of the linear component of each of them will be tried. For TPPS with second-order differencing, it is also possible to investigate the rotation of the penalty matrix eigenvectors for the random terms spl(row.covar):col.covar and row.covar:spl(col.covar) (for more information see Piepho, Boer and Williams, 2022). The investigation takes the form of a search over a grid of rotation angles for a reduced model; the fit of the full model wuth rotation using the optimal rotation angles will only be returned if it improves on the fit of the full, unrotated model.

The TPPCS and TPP1LS models are fitted using functions from the R package TPSbits authored by Sue Welham (2022). There are two methods for supplying the spline basis information produced by tpsmmb to asreml. The grp method adds it to the data.frame supplied in the data argument of the asreml call. The mbf method creates smaller data.frames with the spline basis information in the same environment as the internal function that calls the spline-fitting function. If it is desired to use in a later session, an asreml function, or asrtests function that calls asreml, (e.g. predict.asreml, predictPlus.asreml, or changeTerms.asrtests) on an asreml.object created using mbf terms, then the mbf data.frames will need to be recreated using makeTPPSplineMats.data.frame in the new session, supplying, if there has been rotation of the penalty matrix eigenvectors, the theta values that are returned as the attribute theta.opt of the asreml.obj.

All models utlize the function changeModelOnIC.asrtests to assess the model fit, the information critera used in assessing the fit being calculated using infoCriteria. Any bound terms are remopved from the model. Arguments from tpsmmb and changeModelOnIC.asrtests can be supplied in calls to addSpatialModelOnIC.asrtests and will be passed on to the relevant function through the ellipses argument (...).

The data for experiment can be divided sections and the same spatial model fitted separately to each. The fit over all of the sections is assessed. For more detail see sections above.

Each combination of a row.coords and a col.coords does not have to specify a single observation; for example, to fit a local spatial model to the main units of a split-unit design, each combination would correspond to a main unit and all subunits of the main unit would would have the same combination.

Value

An asrtests.object containing the components (i) asreml.obj, possibly with attribute theta.opt, (ii) wald.tab, and (iii) test.summary for the model whose fit has the smallest information criterion between the supplied and spatial model. The values of the degrees of freedom and the information criteria in the test.summary are differences between those of the changed model and those of the model supplied to addSpatialModelOnIC. If the asrtests.object is the result of fitting a TPPCS model with an exploration of the rotation of the eigenvectors of the penalty matrix for the linear components, then the asreml.obj will have an attribute theta.opt that contains the optimal rotation angles of the eigenvectors.

Author(s)

Chris Brien

References

Piepho, H.-P., Boer, M. P., & Williams, E. R. (2022). Two-dimensional P-spline smoothing for spatial analysis of plant breeding trials. Biometrical Journal, 64, 835-857.

Rodriguez-Alvarez, M. X., Boer, M. P., van Eeuwijk, F. A., & Eilers, P. H. C. (2018). Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics, 23, 52-71.

Verbyla, A. P., De Faveri, J., Wilkie, J. D., & Lewis, T. (2018). Tensor Cubic Smoothing Splines in Designed Experiments Requiring Residual Modelling. Journal of Agricultural, Biological and Environmental Statistics, 23(4), 478-508.

Welham, S. J. (2022) TPSbits: Creates Structures to Enable Fitting and Examination of 2D Tensor-Product Splines using ASReml-R. Version 1.0.0 https://mmade.org/tpsbits/

See Also

as.asrtests, makeTPPSplineMats.data.frame, addSpatialModel.asrtests,
chooseSpatialModelOnIC.asrtests, changeModelOnIC.asrtests, changeTerms.asrtests,
rmboundary.asrtests, testranfix.asrtests, testresidual.asrtests, newfit.asreml,
reparamSigDevn.asrtests, changeTerms.asrtests, infoCriteria.asreml

Examples

## Not run: 

data(Wheat.dat)

#Add row and column covariates
Wheat.dat <- within(Wheat.dat, 
                    {
                      cColumn <- dae::as.numfac(Column)
                      cColumn <- cColumn  - mean(unique(cColumn))
                      cRow <- dae::as.numfac(Row)
                      cRow <- cRow - mean(unique(cRow))
                    })

#Fit initial model
current.asr <- asreml(yield ~ Rep + WithinColPairs + Variety, 
                      random = ~ Row + Column,
                      data=Wheat.dat)

#Create an asrtests object, removing boundary terms
current.asrt <- as.asrtests(current.asr, NULL, NULL, 
                            label = "Random Row and Column effects")
current.asrt <- rmboundary(current.asrt)

current.asrt <- addSpatialModelOnIC(current.asrt, spatial.model = "TPPS", 
                                    row.covar = "cRow", col.covar = "cColumn",
                                    dropRowterm = "Row", dropColterm = "Column",
                                    asreml.option = "grp")
infoCriteria(current.asrt$asreml.obj)

## End(Not run)

asremlPlus documentation built on Nov. 5, 2023, 5:07 p.m.