fitGP | R Documentation |
Fits a GP model with separable length scales and automatic relevance determination (ARD).
Also fits a hierarchical version of the GP model if distinct populations are indicated
using pop
.
There are several ways to specify the training data:
A. supply data frame data
, plus column names or indices for y
, x
, pop
, and time
.
B. supply vectors for y
, pop
, and time
, and a matrix or vector for x
.
For each of the above 2 options, there are 3 options for specifying the predictors.
supplying y
and x
(omitting E
and tau
) will use the columns of x
as predictors. This allows
for the most customization.
supplying y
, E
, and tau
(omitting x
) will use E
lags of y
(with spacing tau
) as predictors.
This is equivalent to option 3 with x
=y
.
supplying y
, x
, E
, and tau
will use E
lags of each column of x
(with spacing tau
) as predictors.
Do not use this option if x
already contains lags, in that case use option 1.
Arguments pop
and time
are optional in all of the above cases.
If pop
and time
are supplied, they should NOT contain missing values.
This function will also, optionally, produce predictions.
See Details for more information.
fitGP(
data = NULL,
y,
x = NULL,
pop = NULL,
time = NULL,
E = NULL,
tau = NULL,
scaling = c("global", "local", "none"),
initpars = NULL,
modeprior = 1,
fixedpars = NULL,
rhofixed = NULL,
rhomatrix = NULL,
augdata = NULL,
predictmethod = NULL,
newdata = NULL,
xnew = NULL,
popnew = NULL,
timenew = NULL,
ynew = NULL,
returnGPgrad = FALSE,
exclradius = 0,
linprior = c("none", "local", "global")
)
data |
A data frame, or matrix with named columns. |
y |
The response variable (required). If |
x |
Predictor variables. If |
pop |
Identifies separate populations (optional, if not supplied, defaults to 1
population). Population values can be either numeric, character, or factor.
If |
time |
A time index (optional, if not supplied, defaults to a numeric index).
Important: The time index is not used for model fitting (timesteps are
assumed to be evenly spaced) but supplying |
E |
Embedding dimension. If supplied, will be used to constuct lags of |
tau |
Time delay. If supplied, will be used to constuct lags of |
scaling |
How the variables should be scaled (see Details). Scaling can be |
initpars |
Starting values for hyperparameters (see Details) in the order
|
modeprior |
This value is used by the phi prior and sets the expected number of modes over the unit interval. Defaults to 1. |
fixedpars |
Fixes values of the hyperparameters phi, ve, and sigma2 (if desired). Should be a numeric
vector with length |
rhofixed |
Fixes the rho parameter, if desired (see Details). |
rhomatrix |
Symmetrical square matrix of fixed pairwise rho values to use, with 1's on the diagonal.
The rows and columns should be named with the population identifier.
The output of |
augdata |
A data frame with augmentation data (see Details). |
predictmethod |
Using the training data, |
newdata |
Data frame containing the same columns supplied in the original model. |
xnew |
New predictor matrix or vector. Not required if |
popnew |
New population vector. Not required if |
timenew |
New time vector. Not required if |
ynew |
New response vector. Optional, unless |
returnGPgrad |
Return the gradient (derivative) of the GP model at each time point with
respect to each input. This is only computed for out-of-sample predictions using |
exclradius |
For |
linprior |
Fit GP model to the residuals of a linear relationship
between |
The data are assumed to be in time order. This is particularly important if E
and tau
are supplied or predictmethod="sequential"
is used.
Missing values:
Missing values for y
and x
are allowed. Any rows containing
missing values will be excluded from the model fitting procedure (reinserted
as NAs in the output). If pop
and time
are supplied, they
should NOT contain missing values.
Hyperparameters:
The model uses a squared exponential covariance function. See the "Extended introduction" vignette for mathematical details.
There is one inverse length scale phi
estimated for each predictor
variable (input dimension). Values near 0 indicate that the predictor has
little influence on the response variable, and it was likely dropped by ARD.
Higher values of phi
indicate greater nonlinearity.
The estimated variance terms are ve
(process variance) and
sigma2
(pointwise prior variance).
Hyperparameter rho
is the dynamic correlation in the hierarchical GP
model, indicating similarity of dynamics across populations. If there is only
1 population (e.g. if pop
is not supplied), rho
is irrelevant
(not used by the model) and will revert to the mode of its prior (~0.5). The
value of rho
can be fixed to any value from 0.0001 to 0.9999 using
rhofixed
, otherwise it is estimated from the data. Alternatively, a matrix
of fixed pairwise rho values can be supplied using rhomatrix
. In this case,
the single rho parameter will also not be used and will revert to the mode of the
prior (~0.5). A pairwise rho matrix can be generated using getrhomatrix
,
or you can create a custom one (e.g. based on geographical distance).
Scaling:
The model priors assume the response variable and all predictor variables
have approximately zero mean and unit variance, therefore it is important to
scale the data properly when fitting these models. For convenience, this
function will scale the input data automatically by default, and
backtransform output to the original scale. Automatic scaling can be
"global"
(default), or "local"
. The latter will scale variables
within (as opposed to across) populations, so is obviously only applicable if
there is more than 1 population. You can also scale the data yourself
beforehand in whatever manner you wish and set scaling="none"
. In this
case, you will obviously have to do the backtransformation yourself.
Predictions:
There are several options for producing predictions:
Use predictmethod="loo"
for leave-one-out prediction using the training data.
Use predictmethod="lto"
for leave-timepoint-out prediction using the training data. This will leave
out values with the same time index across multiple populations, rather than each individual datapoint.
If there is only one population, "lto"
will be equivalent to "loo"
.
Use predictmethod="sequential"
for sequential (leave-future-out) prediction using the training data.
If data frame data
was supplied, supply data frame newdata
containing same column names.
Column for y
is optional, unless E
and tau
were supplied in lieu of x
.
If vectors/matrices were supplied for y
, x
, etc, equivalent vector/matrices xnew
,
popnew
(if pop
was supplied), and timenew
(optional).
ynew
is optional, unless E
and tau
were supplied in lieu of x
.
After fitting, predictions can also be made using predict.GP
.
Predictions can plotted using plot.GPpred
.
Get conditional responses using getconditionals
.
It should be noted that "loo"
is not a "true" leave-one-out, for although each prediction is
made by removing one of the training points, the hyperparameters are fit using all of the training data.
The same goes for "sequential"
.
For the out-of-sample prediction methods "loo"
, "lto"
, and
newdata
, the partial derivatives of the fitted GP function at each
time point with respect to each input can be obtained by setting
returnGPgrad = T
. If you want the in-sample gradient, pass the
original (training) data as back in as newdata
. It automatic scaling
was used, the gradient will also be backtransformed to the original units.
Fit statistics:
The R2 and other fit statistics are always computed for y
in the units in which it was
supplied to the function. The R2 is specifically computed as:
R^2=1-SS_{err}/SS_{y}
which is equivalent to
R^2=1-MSE/Var_{y}
As a result, the returned R-squared may be negative, particularly for out-of-sample predictions, which are not guaranteed to pass through the mean. A negative R2 indicates that the model predictions are worse than just using the mean (MSE is larger than the variance).
If there are multiple populations, there will be a total R2, which is the R2 across populations, as well as within-population R2 values. For the total R2, the denominator is the variance across all datapoints, ignoring population. Note that if the populations have very different local means, the total R2 can be potentially misleading because the across-population variance will be much larger than the within population variance, increasing R2 even if MSE is constant. Put another way, simply getting the local means right can explain a lot of the across-population variance even if there is little prediction accuracy within a population. Very different local variances can cause similar issues. In this case, we would recommend looking at the within-population R2 values, or at one of the alternative R2 values provided in the fit stats: R2centered (total R2 with local means removed) or R2scaled (total R2 with local centering and scaling), which may be more accurate measures of performance than total R2 if the populations have very different local means and/or variances.
A list (class GP and GPpred) with the following elements:
pars |
Posterior mode of hyperparameters (named vector). |
parst |
Posterior mode of hyperparameters (named vector), transformed to real line (used for optimization). |
grad |
Likelihood gradients of hyperparameters at posterior mode. Can be used to assess convergence. |
covm |
List containing various covariance matrices and the inverse covariance matrix. |
iter |
Number of iterations until convergence was reached. Currently fixed at a max of 200. |
inputs |
Inputs and scaled inputs. Note that if you use |
scaling |
Scaling information. |
linprior |
If |
insampresults |
Data frame with in-sample predictions. |
insampfitstats |
Fit statistics for in-sample predictions. Includes R2, rmse, ln_post (log posterior likelihood evaluated at the mode), lnL_LOO (generalized cross-validation approximate leave-one-out negative log likelihood), and df (estimated degrees of freedom, equal to the trace of the smoother matrix). lnL_LOO does not include the prior, so is not directly comparable to ln_post. For diagnostics, also includes likelihood components ln_prior, SS, logdet. If there are multiple populations, also includes R2centered and R2scaled. |
insampfitstatspop |
If >1 population, fit statistics (R2 and rmse) for in-sample predictions by population. |
outsampresults |
Data frame with out-of-sample predictions (if requested). |
outsampfitstats |
Fit statistics for out-of-sample predictions (if requested).
Only computed if using |
outsampfitstatspop |
If >1 population, fit statistics for out-of-sample predictions (if requested) by population. |
GPgrad |
If |
call |
Function call. Allows use of |
Munch, S. B., Poynor, V., and Arriaza, J. L. 2017. Circumventing structural uncertainty: a Bayesian perspective on nonlinear forecasting for ecology. Ecological Complexity, 32: 134.
predict.GP
, plot.GPpred
, getconditionals
, getrhomatrix
yrand=rnorm(20)
testgp=fitGP(y=yrand,E=2,tau=1,predictmethod = "loo")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.