fit.ssm: Fit a smooth supersaturated model
In SSM: Fit and Analyze Smooth Supersaturated Models


fit.ssm fits a smooth supersaturated model to given data. By default the model smooths over [-1, 1]^d and uses a basis of Legendre polynomials. Many model parameters, such as the basis size and the smoothing criterion, can be user-defined. Optionally, sensitivity indices and a Gaussian process estimate of the metamodel error can be computed.

fit.ssm(design, response, ssm, basis, basis_size, K, P, design_model_matrix,
  SA = FALSE, GP = FALSE, type = "exp", validation = FALSE,
  exclude = list(), distance_type = "distance")

`design`	A matrix containing the design. Each design point is a row in the matrix. Accepts a vector for a design in one variable. If the default options for `P` and `K` are used then it is recommended that the design is transformed to lie within [-1, 1]^d. The function `transform11` is useful in this regard.
`response`	A vector containing the responses at design points. The length must correspond with the number of rows in `design`.
`ssm`	(optional) A pre-existing SSM class object. If this argument is supplied then `basis`, `basis_size`, `K`, and `P` will be carried over rather than re-computed. This is useful for simulation studies where the model structure remains the same and only the design and responses change.
`basis`	(optional) A matrix where each row is an exponent vector of a monomial. This is used in conjunction with `P` to construct the model basis. If not supplied, a hierarchical basis will be used.
`basis_size`	(optional) A number. Specifies the desired number of polynomials in the model basis. If not supplied, `basis_size` is set to 20 * d + n.
`K`	(optional) A positive semi-definite matrix specifying the weighting criterion of basis terms. If not supplied, default behaviour is to use the Frobenius norm of the term Hessian integrated over [-1, 1]^d with respect to a basis of Legendre polynomials.
`P`	(optional) A matrix defining the polynomials used to construct the model basis. Each column corresponds to one polynomial. If not supplied, a Legendre polynomial basis is used.
`design_model_matrix`	(optional) Specify a design model matrix. If provided the new model will be fit to the basis and design implied by the design model matrix regardless of the values of `basis`, `P` and `design`.
`SA`	(optional) Logical. If TRUE then Sobol indices, Total indices and Total interaction indices will be computed.
`GP`	(optional) Logical. If TRUE then a GP metamodel error estimate will be computed.
`type`	(optional) Character. One of "exp" or "matern32". Specifies the correlation function used to compute the GP covariance matrix. Irrelevant if `GP` is FALSE. For further details see `compute.covariance`.
`validation`	(optional) Logical. If TRUE then the Leave-One-Out errors are computed for each design point and the standardised root mean square error is calculated. The RMSE is standardised against the variance of the ybar estimator. If `GP` is TRUE then these will be calculated regardless of the value of `validation`.
`exclude`	(optional) A list of vectors of integers. These indicate that terms in the listed variables should be omitted from the model, e.g. `exclude = list(1)` removes all terms dependent on the first variable only. `exclude = list(1, c(1, 2))` removes terms in the first variable only and interactions between the first and second variables only. To remove a variable and all of its higher order interactions, it is better to remove the appropriate column from the design; otherwise the algorithm will generate many basis vectors that are then excluded, wasting computation.
`distance_type`	(optional) Character. Selects the distance function used for the GP metamodel error estimate correlation function. One of "distance", "line", "product", "area", "proddiff", or "smoothdiff". Uses "distance", the standard Euclidean distance between points by default. For further details see `new.distance`. Not needed if `GP` is FALSE.
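
The `basis` and `exclude` arguments are easier to picture with a small example. The sketch below uses base R only: it builds an exponent matrix of the form described for `basis` and marks the rows that the documented `exclude` semantics would drop. The helper `is_excluded` is hypothetical and written for illustration; it is not part of SSM and is not claimed to match the package's internal code.

```r
# All exponent vectors in two variables with total degree <= 2.
# Each row is one monomial, e.g. c(1, 1) stands for x1 * x2.
grid  <- as.matrix(expand.grid(x1 = 0:2, x2 = 0:2))
basis <- grid[rowSums(grid) <= 2, , drop = FALSE]

# Hypothetical helper (not part of SSM): TRUE for rows whose set of active
# variables is exactly one of the sets listed in `exclude`.
is_excluded <- function(basis, exclude) {
  apply(basis, 1, function(row) {
    active <- which(row > 0)
    any(vapply(exclude, function(v) setequal(active, v), logical(1)))
  })
}

# Drop terms in x1 only, and interactions between x1 and x2 only.
exclude <- list(1, c(1, 2))
kept <- basis[!is_excluded(basis, exclude), , drop = FALSE]
kept
```

With two variables this leaves only the constant term and the terms in the second variable, which is why the text suggests dropping the design column instead when a variable is to be removed entirely.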

Returns an SSM object containing the fitted smooth supersaturated model. The only required arguments are design and response. This will result in a model using Legendre polynomials and smoothing over [-1, 1]^d. All other arguments will be assigned automatically if not specified.

If the model cannot be fit due to numerical instability, a warning will be given and the returned SSM object will have a zero vector of appropriate length in the theta slot. The basis_size parameter is often useful in this situation. Try reducing the basis_size until a model fit is successful. Ideally the basis size should be as large as possible without causing instability in the predictions (see the example below).
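
The advice about reducing basis_size can be sketched as a simple loop. This assumes the SSM package is installed and that the object is an S4 object whose theta slot is read with `@`; the candidate sizes are arbitrary values chosen for illustration.

```r
library(SSM)

X <- seq(-1, 1, 0.5)      # design used in the examples on this page
Y <- c(0, 1, 0, 0.5, 0)   # response used in the examples on this page

# Try progressively smaller bases until theta is no longer the all-zero
# vector that signals a failed fit. A failed attempt also raises a warning.
fit_ok <- FALSE
for (size in c(80, 70, 40, 25)) {   # illustrative candidate sizes
  s <- fit.ssm(X, Y, basis_size = size)
  if (any(s@theta != 0)) {
    fit_ok <- TRUE
    break
  }
}
if (fit_ok) message("fit succeeded with basis_size = ", size)
```

A fit that succeeds can still give unstable predictions, so it is worth inspecting plot(s) before settling on a size.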

If SA is TRUE then sensitivity analysis will be performed on the model. Sobol indices for main effects, Total indices, and Total interaction indices for second order interactions will be computed and held in the slots main_sobol, total_sobol and total_int respectively. If the number of factors is < 11 then Sobol indices will be computed for all order interactions and stored in int_sobol. Default behaviour is to assume each input is uniformly distributed over [-1, 1]. If P has been user-defined then the polynomials defined by P are assumed to be an orthonormal system with respect to some measure on the reals. See update.sensitivity for more details.
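
A short sketch of reading the sensitivity results back from the fitted object, assuming the SSM package is installed and that the slots named above are read with the S4 `@` operator. The random design and the seed are illustrative only.

```r
library(SSM)

set.seed(1)                               # illustrative seed
X <- matrix(runif(20, -1, 1), ncol = 2)   # 10 design points in 2 variables
Y <- runif(10)

s <- fit.ssm(X, Y, SA = TRUE)

s@main_sobol    # Sobol indices for main effects
s@total_sobol   # Total indices
s@total_int     # Total interaction indices for second order interactions
s@int_sobol     # all-order interaction indices (fewer than 11 factors here)
```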

If GP is TRUE (default behaviour is FALSE due to computational cost) then the metamodel error is estimated using a zero-mean Gaussian process with a constant trend. Scaling and length parameters are estimated by maximum likelihood from the Leave-One-Out model errors and stored in the sigma and r slots respectively. Model predictions using predict.SSM will then include credible intervals. The distance between points is defined by the distance_type argument and is set to the standard Euclidean distance by default. See new.distance for other options, although they are experimental and subject to erratic behaviour. The default correlation function is the squared exponential, although this can be changed to a Matern 3/2 function by setting the type argument to "matern32".
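
As a sketch of the options described above, the call below requests the metamodel error estimate with the Matern 3/2 correlation function and then reads the fitted parameters. It assumes the SSM package is installed and that the sigma and r slots are read with `@`.

```r
library(SSM)

X <- seq(-1, 1, 0.5)      # design used in the examples on this page
Y <- c(0, 1, 0, 0.5, 0)   # response used in the examples on this page

# GP error estimate with the Matern 3/2 correlation and the default
# Euclidean distance between points.
s <- fit.ssm(X, Y, GP = TRUE, type = "matern32")

s@sigma           # estimated scaling parameter
s@r               # estimated length parameter
predict(s, 0.3)   # prediction now accompanied by a credible interval
```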

If validation is TRUE then the Leave-One-Out error at each design point will be computed and stored in the residuals slot, and the LOO RMSE computed and stored in the LOO_RMSE slot. Note that if GP is TRUE then these values will be computed regardless of the value of validation as they are required to fit the metamodel error estimate GP.
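
A minimal sketch of retrieving the validation results, assuming the SSM package is installed and that the residuals and LOO_RMSE slots are read with `@`:

```r
library(SSM)

X <- seq(-1, 1, 0.5)      # design used in the examples on this page
Y <- c(0, 1, 0, 0.5, 0)   # response used in the examples on this page

s <- fit.ssm(X, Y, validation = TRUE)

s@residuals   # Leave-One-Out error at each design point
s@LOO_RMSE    # standardised Leave-One-Out RMSE
```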

An SSM object.

predict.SSM for model predictions for SSM, and plot.SSM for plotting main effects of SSM. transform11 is useful for transforming data to [-1, 1]^d.
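
To show what a transformation to [-1, 1]^d involves, here is a base R sketch of a column-wise linear rescaling. The helper `rescale_to_unit_cube` and the sample data are hypothetical and written for illustration; the helper is not part of SSM and may not match what transform11 does in every detail.

```r
# Hypothetical helper (not part of SSM): map each column linearly onto [-1, 1].
# Columns with a single repeated value would divide by zero and are not handled.
rescale_to_unit_cube <- function(design) {
  design <- as.matrix(design)
  lo <- apply(design, 2, min)
  hi <- apply(design, 2, max)
  centred <- sweep(design, 2, (hi + lo) / 2, "-")   # shift midpoint to 0
  sweep(centred, 2, (hi - lo) / 2, "/")             # scale half-range to 1
}

# Made-up raw design in two variables on different scales.
raw <- cbind(temperature = c(20, 35, 50, 65, 80),
             pressure    = c(1.0, 1.5, 3.0, 2.0, 2.5))

scaled <- rescale_to_unit_cube(raw)
apply(scaled, 2, range)   # each column now spans -1 to 1
```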

# A simple one factor example
X <- seq(-1,1,0.5) # design
Y <- c(0,1,0,0.5,0) # response
s <- fit.ssm(X,Y)
s
plot(s)
predict(s,0.3)

# user defined basis sizes

# A model that is too large to fit
## Not run: 
s <- fit.ssm(X, Y, basis_size=80)

## End(Not run)
# A large model that can be fit but is unstable
s <- fit.ssm(X, Y, basis_size=70)
plot(s)
# A model larger than default that is not unstable
s <- fit.ssm(X, Y, basis_size=40)
plot(s)

# with metamodel error estimate

s <- fit.ssm(X, Y, GP=TRUE)
plot(s)
predict(s,0.3)

# Sensitivity analysis and main effect plotting

# A design of 10 points over [-1, 1]^2
X <- matrix(runif(20, -1, 1), ncol = 2)
Y <- runif(10)
s <- fit.ssm(X, Y, SA = TRUE)
s
sensitivity.plot(s)
plot(s)