SDE: R6 class for stochastic differential equation

SDE    R Documentation

R6 class for stochastic differential equation

Description

R6 class for stochastic differential equation

Details

Contains the model formulas and data.

The setup method creates an attribute tmb_obj, which can be used to evaluate the negative log-likelihood function.

The check_post method applies check_fn to the observed data (returned by the data() method) to obtain observed statistics. It then repeatedly simulates a realisation from the fitted SDE (based on observed covariates), and applies check_fn to the simulated data. The simulations use the posterior = TRUE option of the simulate method, i.e., the parameters of the model used for simulation are drawn from their posterior distribution.
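The posterior predictive check described above can be sketched in base R without the package. This is only an illustration under stated assumptions: sim_bm is a hypothetical stand-in for the simulate method (a Brownian motion simulator), the "observed" data are themselves simulated, and the estimate and standard error of log(sigma) are made-up values standing in for a model fit.

```r
set.seed(1)

# Hypothetical check function: takes a data frame with a column z and
# returns a named vector of summary statistics of the increments
check_fn <- function(data) {
  dz <- diff(data$z)
  c(sd_incr = sd(dz), max_abs_incr = max(abs(dz)))
}

# Stand-in simulator for a fitted Brownian motion (not the package's method)
sim_bm <- function(time, sigma, z0 = 0) {
  dt <- diff(time)
  z <- z0 + c(0, cumsum(rnorm(length(dt), mean = 0, sd = sigma * sqrt(dt))))
  data.frame(time = time, z = z)
}

# "Observed" data, simulated here for illustration only
time <- seq(0, 10, by = 0.1)
obs <- sim_bm(time, sigma = 2)

# Assumed estimate and standard error of log(sigma)
log_sigma_hat <- log(2)
log_sigma_se <- 0.07

# Statistics for the observed data
obs_stat <- check_fn(obs)

# Statistics for simulated data sets (one row per statistic, one column per simulation)
n_sims <- 100
stats <- sapply(seq_len(n_sims), function(i) {
  # Draw the parameter from its approximate posterior, then simulate
  sigma_i <- exp(rnorm(1, log_sigma_hat, log_sigma_se))
  check_fn(sim_bm(time, sigma = sigma_i))
})

# Proportion of simulated statistics at least as large as the observed one
p_upper <- rowMeans(stats >= obs_stat)
```

Proportions close to 0 or 1 indicate that the observed statistic lies in the tails of the simulated distribution.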

Methods

Public methods


Method new()

Create an SDE object

Usage
SDE$new(
  formulas = NULL,
  data,
  type,
  response,
  par0 = NULL,
  fixpar = NULL,
  other_data = NULL
)
Arguments
formulas

List of formulas for model parameters, with one element for each SDE parameter. Formulas can use standard R syntax, as well as mgcv-style syntax for splines and random effects.

data

Data frame with covariates, response variable, time, and ID

type

Type of SDE. Options are "BM" (Brownian motion), "OU" (Ornstein-Uhlenbeck process), "CTCRW" (continuous-time correlated random walk, a.k.a. integrated Ornstein-Uhlenbeck process), "CIR" (Cox-Ingersoll-Ross process), "BM_SSM" (BM with measurement error), "OU_SSM" (OU with measurement error), "BM_t" (BM with Student's t-distributed increments).

response

Name of response variable, corresponding to a column name in data. Can be a vector of names if there are multiple response variables.

par0

Vector of initial values for SDE parameters, with one value for each SDE parameter. If not provided, parameters are initialised to zero on the link scale.

fixpar

Vector of names of fixed SDE parameters

other_data

Named list of data objects to pass to likelihood, only required for special models

Returns

A new SDE object


Method formulas()

Formulas of SDE object

Usage
SDE$formulas()

Method data()

Data of SDE object

Usage
SDE$data()

Method type()

Type of SDE object

Usage
SDE$type()

Method response()

Name(s) of response variable(s)

Usage
SDE$response()

Method fixpar()

Name(s) of fixed parameter(s)

Usage
SDE$fixpar()

Method mats()

List of model matrices (X_fe, X_re, and S)

Usage
SDE$mats()

Method other_data()

Named list of additional data objects

Usage
SDE$other_data()

Method link()

Link functions

Usage
SDE$link()

Method invlink()

Inverse link functions

Usage
SDE$invlink()

Method coeff_fe()

Fixed effect parameters

Usage
SDE$coeff_fe()

Method coeff_re()

Random effect parameters

Usage
SDE$coeff_re()

Method lambda()

Smoothness parameters

Usage
SDE$lambda()

Method sdev()

Standard deviations of smooth terms

This function transforms the smoothness parameter of each smooth term into a standard deviation, given by SD = 1/sqrt(lambda). It is particularly helpful to get the standard deviations of independent normal random effects.

Usage
SDE$sdev()
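
As a small worked example of the transformation SD = 1/sqrt(lambda), the following base R lines use made-up smoothness parameters (not output from a fitted model). For independent normal random effects, a larger lambda corresponds to a smaller standard deviation, i.e., stronger shrinkage towards zero.

```r
# Hypothetical smoothness parameters for three smooth terms
lambda <- c(0.25, 4, 100)

# Standard deviation implied by each smoothness parameter
sdev <- 1 / sqrt(lambda)
sdev
```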

Method rho()

Decay parameter

Usage
SDE$rho()

Method terms()

Terms of model formulas

Usage
SDE$terms()

Method out()

Output of optimiser after model fitting

Usage
SDE$out()

Method tmb_obj()

Model object created by TMB. This is the output of the TMB function MakeADFun, and it is a list including elements

  • fn: Objective function

  • gr: Gradient function of fn

  • par: Vector of initial parameters on working scale

Usage
SDE$tmb_obj()

Method tmb_obj_joint()

Model object created by TMB for the joint likelihood of the fixed and random effects. This is the output of the TMB function MakeADFun, and it is a list including elements

  • fn: Objective function

  • gr: Gradient function of fn

  • par: Vector of initial parameters on working scale

Usage
SDE$tmb_obj_joint()

Method tmb_rep()

Output of the TMB function sdreport, which includes estimates and standard errors for all model parameters.

Usage
SDE$tmb_rep()

Method obs()

Data frame of observations (subset response variables out of full data frame)

Usage
SDE$obs()

Method eqn()

Print equation for this model

Usage
SDE$eqn()

Method X_re_decay()

Get design matrix for random effects in decay model

The design matrix is obtained by taking X_re (returned by make_mat), and multiplying the relevant columns by something like exp(-rho * time) to force the splines to decay to zero with a rate determined by rho.

Usage
SDE$X_re_decay()
Returns

Design matrix
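
The column scaling described above can be illustrated with a toy matrix in base R. This is a sketch only: the matrix, the value of rho, and the choice of which columns decay (decay_cols) are hypothetical, and the exact decay function used by the package may differ from exp(-rho * time).

```r
# Illustrative random-effect design matrix: 5 time points, 3 basis columns
time <- c(0, 1, 2, 3, 4)
X_re <- matrix(1, nrow = length(time), ncol = 3)

rho <- 0.5             # hypothetical decay parameter
decay_cols <- c(2, 3)  # hypothetical columns subject to decay

X_re_decay <- X_re
# Multiply row i of the chosen columns by exp(-rho * time[i]);
# the vector is recycled down each column
X_re_decay[, decay_cols] <- X_re[, decay_cols] * exp(-rho * time)
```

Columns not listed in decay_cols are left unchanged, and the scaled columns shrink towards zero as time increases.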


Method update_coeff_fe()

Update fixed effect coefficients

Usage
SDE$update_coeff_fe(new_coeff)
Arguments
new_coeff

New coefficient vector


Method update_coeff_re()

Update random effect coefficients

Usage
SDE$update_coeff_re(new_coeff)
Arguments
new_coeff

New coefficient vector


Method update_lambda()

Update smoothness parameters

Usage
SDE$update_lambda(new_lambda)
Arguments
new_lambda

New smoothness parameter vector


Method update_rho()

Update decay parameter

Usage
SDE$update_rho(new_rho)
Arguments
new_rho

New decay parameter vector


Method make_mat()

Create model matrices

Usage
SDE$make_mat(new_data = NULL)
Arguments
new_data

Optional new data set, including covariates for which the design matrices should be created.

Returns

A list of

  • X_fe Design matrix for fixed effects

  • X_re Design matrix for random effects

  • S Smoothness matrix

  • ncol_fe Number of columns for X_fe for each parameter

  • ncol_re Number of columns of X_re and S for each random effect


Method make_mat_grid()

Design matrices for grid of covariates

Usage
SDE$make_mat_grid(var, covs = NULL)
Arguments
var

Name of variable

covs

Optional data frame with a single row and one column for each covariate, giving the values that should be used. If this is not specified, the mean value is used for numeric variables, and the first level for factor variables.

Returns

A list with the same elements as the output of make_mat, plus a data frame of covariates values.


Method setup()

TMB setup

Usage
SDE$setup(silent = TRUE, map = NULL)
Arguments
silent

Logical. If TRUE, all tracing outputs are hidden (default).

map

List passed to MakeADFun to fix parameters. (See TMB documentation.)


Method fit()

Model fitting

The negative log-likelihood of the model is minimised using the function optim. TMB uses the Laplace approximation to integrate the random effects out of the likelihood.

After the model has been fitted, the output of optim can be accessed using the method out.

Usage
SDE$fit(silent = TRUE, map = NULL)
Arguments
silent

Logical. If TRUE, all tracing outputs are hidden (default).

map

List passed to MakeADFun to fix parameters. See TMB documentation.
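
To illustrate the idea of minimising a negative log-likelihood with optim on the working (link) scale, here is a minimal base R sketch for a Brownian motion with constant diffusion parameter. It does not use TMB and has no random effects, so the Laplace approximation is not involved; the data are simulated and the function nll is a hypothetical stand-in for the objective function stored in tmb_obj.

```r
set.seed(42)

# Simulated Brownian motion increments with true sigma = 1.5 (illustration only)
dt <- rep(0.1, 500)
dz <- rnorm(length(dt), mean = 0, sd = 1.5 * sqrt(dt))

# Negative log-likelihood as a function of the working parameter log(sigma)
nll <- function(log_sigma) {
  sigma <- exp(log_sigma)
  -sum(dnorm(dz, mean = 0, sd = sigma * sqrt(dt), log = TRUE))
}

# Start at zero on the link scale, i.e., sigma = 1
fit <- optim(par = 0, fn = nll, method = "BFGS")

# Transform the estimate back to the natural scale
sigma_hat <- exp(fit$par)
```

Optimising log(sigma) rather than sigma keeps the search unconstrained while guaranteeing a positive estimate.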


Method linear_predictor()

Get linear predictor for SDE parameters

Usage
SDE$linear_predictor(
  new_data = NULL,
  t = "all",
  X_fe = NULL,
  X_re = NULL,
  coeff_fe = NULL,
  coeff_re = NULL,
  term = NULL
)
Arguments
new_data

Optional data set of covariates. If new_data, X_fe and X_re are not provided, then the observed covariates are used.

t

Time points for which the parameters should be returned. If "all", returns parameters for all time steps (default).

X_fe

Optional design matrix for fixed effects, as returned by make_mat. If new_data, X_fe and X_re are not provided, then the observed covariates are used.

X_re

Optional design matrix for random effects, as returned by make_mat. If new_data, X_fe and X_re are not provided, then the observed covariates are used.

coeff_fe

Optional vector of fixed effect parameters

coeff_re

Optional vector of random effect parameters

term

Name of model term as character string, e.g., "time", or "s(time)". Use $coeff_fe() and $coeff_re() methods to find names of model terms. This uses fairly naive substring matching, and may not work if one covariate's name is a substring of another one.

Returns

Matrix of linear predictor (X_fe %*% coeff_fe + X_re %*% coeff_re), with one row for each time step and one column for each SDE parameter
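
The following base R sketch shows one way such a matrix can be assembled, assuming the linear predictor is the sum of the fixed and random effect contributions and that the design matrices stack one block of rows per SDE parameter. The block layout, covariate, grouping variable, and coefficient values are all hypothetical and are not taken from the package.

```r
n <- 4                 # number of time steps
x <- c(-1, 0, 1, 2)    # hypothetical covariate
id <- c(1, 1, 2, 2)    # hypothetical grouping variable

# Fixed effects: intercept + slope for parameter 1, intercept only for parameter 2
X1 <- cbind(1, x)
X2 <- matrix(1, nrow = n, ncol = 1)
X_fe <- rbind(cbind(X1, matrix(0, n, 1)),
              cbind(matrix(0, n, 2), X2))
coeff_fe <- c(0.5, 0.2, -1)

# Random effects: random intercept by group, acting on parameter 1 only
Z <- cbind(as.numeric(id == 1), as.numeric(id == 2))
X_re <- rbind(Z, matrix(0, n, 2))
coeff_re <- c(0.3, -0.3)

# Stacked linear predictor, reshaped to one column per SDE parameter
lp_vec <- X_fe %*% coeff_fe + X_re %*% coeff_re
lp <- matrix(lp_vec, nrow = n, ncol = 2)
```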


Method par()

Get SDE parameters

Usage
SDE$par(
  t = NULL,
  new_data = NULL,
  X_fe = NULL,
  X_re = NULL,
  coeff_fe = NULL,
  coeff_re = NULL,
  resp = TRUE,
  term = NULL
)
Arguments
t

Time points for which the parameters should be returned. If "all", returns parameters for all time steps. Default: 1.

new_data

Optional data set of covariates. If new_data, X_fe and X_re are not provided, then the observed covariates are used.

X_fe

Optional design matrix for fixed effects, as returned by make_mat. By default, uses design matrix from data.

X_re

Optional design matrix for random effects, as returned by make_mat. By default, uses design matrix from data.

coeff_fe

Optional vector of fixed effect parameters

coeff_re

Optional vector of random effect parameters

resp

Logical (default: TRUE). Should the output be on the response scale? If FALSE, the output is on the linear predictor scale.

term

Name of model term as character string, e.g., "time", or "s(time)". Use $coeff_fe() and $coeff_re() methods to find names of model terms. This uses fairly naive substring matching, and may not work if one covariate's name is a substring of another one.

Returns

Matrix with one row for each time point in t, and one column for each SDE parameter
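
The difference between the linear predictor scale and the response scale (argument resp) can be sketched as follows. The parameter names mu and sigma and their inverse link functions (identity and exp) are assumptions made for illustration; the actual links depend on the model type and are returned by the link and invlink methods.

```r
# Linear predictor for two hypothetical SDE parameters at three time points
lp <- cbind(mu = c(-0.2, 0.0, 0.3), sigma = c(-1, 0, 1))

# Assumed inverse links: identity for mu, exp for sigma (log link)
invlink <- list(mu = identity, sigma = exp)

# Response scale: apply each inverse link to the corresponding column
par_resp <- sapply(colnames(lp), function(nm) invlink[[nm]](lp[, nm]))
```

Here par_resp has the same dimensions as lp, with the sigma column guaranteed to be positive.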


Method post_coeff()

Posterior draws (coefficients)

Usage
SDE$post_coeff(n_post)
Arguments
n_post

Number of posterior draws

Returns

Matrix with one column for each coefficient and one row for each posterior draw


Method post_par()

Posterior draws of SDE parameters (for uncertainty quantification)

Usage
SDE$post_par(X_fe, X_re, n_post = 100, resp = TRUE, term = NULL)
Arguments
X_fe

Design matrix (fixed effects)

X_re

Design matrix (random effects)

n_post

Number of posterior draws (default: 100)

resp

Logical (default: TRUE). Should the output be on the response scale? If FALSE, the output is on the linear predictor scale.

term

Name of model term as character string, e.g., "time", or "s(time)". Use $coeff_fe() and $coeff_re() methods to find names of model terms. This uses fairly naive substring matching, and may not work if one covariate's name is a substring of another one.

Returns

Array with one row for each time step, one column for each SDE parameter, and one layer for each posterior draw


Method CI_pointwise()

Pointwise confidence intervals for SDE parameters

Usage
SDE$CI_pointwise(
  t = NULL,
  new_data = NULL,
  X_fe = NULL,
  X_re = NULL,
  level = 0.95,
  n_post = 1000,
  resp = TRUE,
  term = NULL
)
Arguments
t

Time points for which the parameters should be returned. If "all", returns parameters for all time steps. Defaults to 1 if new data not provided, or "all" if new data provided.

new_data

Optional data frame containing covariate values for which the CIs should be computed

X_fe

Optional design matrix for fixed effects, as returned by make_mat. By default, uses design matrix from data.

X_re

Optional design matrix for random effects, as returned by make_mat. By default, uses design matrix from data.

level

Confidence level (default: 0.95 for 95% confidence intervals)

n_post

Number of posterior samples from which the confidence intervals are calculated. Larger values will reduce approximation error, but increase computation time. Defaults to 1000.

resp

Logical (default: TRUE). Should the output be on the response scale? If FALSE, the output is on the linear predictor scale.

term

Name of model term as character string, e.g., "time", or "s(time)". Use $coeff_fe() and $coeff_re() methods to find names of model terms. This uses fairly naive substring matching, and may not work if one covariate's name is a substring of another one.

This method generates pointwise confidence intervals by simulation. That is, it generates n_post posterior samples of the estimated parameters from a multivariate normal distribution, where the mean is the vector of estimates and the covariance matrix is provided by TMB (using post_par). Then, the SDE parameters are derived for each set of posterior parameter values, and pointwise confidence intervals are obtained as quantiles of the posterior simulated SDE parameters.

Returns

List with elements:

  • low: Matrix of lower bounds of confidence intervals.

  • upp: Matrix of upper bounds of confidence intervals.
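
The simulation-based procedure described for this method can be sketched in base R. This is not the package's implementation: the estimates est, the covariance matrix V, the design matrix X, and the log link are all assumptions chosen for illustration, and the multivariate normal draws are generated with a Cholesky factor instead of post_coeff.

```r
set.seed(123)

# Hypothetical estimates and covariance matrix of two coefficients
# (intercept and slope on the link scale)
est <- c(0.5, 0.2)
V <- matrix(c(0.010, 0.002,
              0.002, 0.004), nrow = 2)

# Design matrix over a grid of covariate values
x <- seq(-1, 1, length.out = 5)
X <- cbind(1, x)

# Draw n_post coefficient vectors from N(est, V) using the Cholesky factor
n_post <- 1000
L <- t(chol(V))
draws <- est + L %*% matrix(rnorm(2 * n_post), nrow = 2)  # 2 x n_post

# SDE parameter on the response scale for each draw (log link assumed)
par_draws <- exp(X %*% draws)                             # 5 x n_post

# Pointwise interval: quantiles across draws, separately for each grid point
level <- 0.95
alpha <- (1 - level) / 2
low <- apply(par_draws, 1, quantile, probs = alpha)
upp <- apply(par_draws, 1, quantile, probs = 1 - alpha)
```

Increasing n_post reduces the Monte Carlo error in the quantiles, at the cost of more computation.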


Method CI_simultaneous()

Simultaneous confidence intervals for SDE parameters

Usage
SDE$CI_simultaneous(
  t = NULL,
  new_data = NULL,
  X_fe = NULL,
  X_re = NULL,
  level = 0.95,
  n_post = 1000,
  resp = TRUE,
  term = NULL
)
Arguments
t

Time points for which the parameters should be returned. If "all", returns parameters for all time steps. Defaults to 1 if new data not provided, or "all" if new data provided.

new_data

Optional data frame containing covariate values for which the CIs should be computed

X_fe

Optional design matrix for fixed effects, as returned by make_mat. By default, uses design matrix from data.

X_re

Optional design matrix for random effects, as returned by make_mat. By default, uses design matrix from data.

level

Confidence level (default: 0.95 for 95% confidence intervals)

n_post

Number of posterior samples from which the confidence intervals are calculated. Larger values will reduce approximation error, but increase computation time. Defaults to 1000.

resp

Logical (default: TRUE). Should the output be on the response scale? If FALSE, the output is on the linear predictor scale.

term

Name of model term as character string, e.g., "time", or "s(time)". Use $coeff_fe() and $coeff_re() methods to find names of model terms. This uses fairly naive substring matching, and may not work if one covariate's name is a substring of another one.

This method closely follows the approach suggested by Gavin Simpson at fromthebottomoftheheap.net/2016/12/15/simultaneous-interval-revisited/, itself based on Section 6.5 of Ruppert et al. (2003).

Returns

List with elements:

  • low: Matrix of lower bounds of confidence intervals.

  • upp: Matrix of upper bounds of confidence intervals.
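
A simultaneous interval in the spirit of the approach cited above can be sketched in base R as follows: simulate deviations of the coefficients around their estimates, record the largest standardised deviation of the linear predictor over the grid for each draw, and use a quantile of these maxima as the critical value. The estimates, covariance matrix, design matrix, and log link are hypothetical, and details may differ from the package's implementation.

```r
set.seed(123)

# Hypothetical estimates and covariance matrix (link scale)
est <- c(0.5, 0.2)
V <- matrix(c(0.010, 0.002,
              0.002, 0.004), nrow = 2)
x <- seq(-1, 1, length.out = 5)
X <- cbind(1, x)

# Linear predictor and its standard error at each grid point
eta <- as.vector(X %*% est)
se <- sqrt(rowSums((X %*% V) * X))   # square root of diag(X V X')

# Simulated coefficient deviations from N(0, V)
n_post <- 1000
dev <- t(chol(V)) %*% matrix(rnorm(2 * n_post), nrow = 2)

# Absolute standardised deviation at each grid point (rows) for each draw (columns)
abs_std_dev <- abs(X %*% dev) / se

# Critical value: quantile of the maximum deviation over the grid
max_dev <- apply(abs_std_dev, 2, max)
crit <- quantile(max_dev, probs = 0.95, names = FALSE)

# Simultaneous band on the response scale (log link assumed)
low <- exp(eta - crit * se)
upp <- exp(eta + crit * se)
```

The critical value exceeds the usual pointwise normal quantile, so the band is wider than the pointwise intervals at every grid point.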


Method residuals()

Model residuals

Usage
SDE$residuals()

Method check_post()

Posterior predictive checks

Usage
SDE$check_post(check_fn, n_sims = 100, silent = FALSE)
Arguments
check_fn

Goodness-of-fit function which accepts "data" as input and returns a statistic or a vector of statistics, to be compared between observed data and simulations.

n_sims

Number of simulations to perform

silent

Logical. If FALSE, simulation progress is shown. (Default: FALSE)

Returns

List with elements:

  • obs_stat: Vector of values of goodness-of-fit statistics for the observed data

  • stats: Matrix of values of goodness-of-fit statistics for the simulated data sets (one row for each statistic, and one column for each simulation)

  • plot: ggplot object showing, for each statistic returned by check_fn, a histogram of simulated values and a vertical line for the observed value. An observed value in the tails of the simulated distribution suggests lack of fit.


Method AIC_conditional()

Conditional Akaike Information Criterion

The conditional AIC is defined, for example, by Wood (2017) as AIC = -2L + 2k, where L is the maximum joint log-likelihood (of fixed and random effects), and k is the number of effective degrees of freedom of the model (accounting for flexibility in non-parametric terms implied by smoothing).

Usage
SDE$AIC_conditional()
Returns

Conditional AIC


Method AIC_marginal()

Marginal Akaike Information Criterion

The marginal AIC is defined, for example, by Wood (2017) as AIC = -2L + 2k, where L is the maximum marginal log-likelihood (of fixed effects), and k is the number of degrees of freedom of the fixed effect component of the model.

Usage
SDE$AIC_marginal()
Returns

Marginal AIC
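
As a worked example of the formula AIC = -2L + 2k for both criteria, the lines below use made-up log-likelihood values and degrees of freedom; none of these numbers come from a fitted model, and the helper aic is hypothetical.

```r
# Hypothetical quantities, standing in for values from a fitted model
llk_joint <- -1234.5     # maximum joint log-likelihood (fixed and random effects)
edf <- 7.8               # effective degrees of freedom
llk_marginal <- -1250.2  # maximum marginal log-likelihood
n_fixed <- 4             # degrees of freedom of the fixed effect component

# AIC = -2L + 2k
aic <- function(llk, k) -2 * llk + 2 * k

AIC_cond <- aic(llk_joint, edf)         # 2469.0 + 15.6 = 2484.6
AIC_marg <- aic(llk_marginal, n_fixed)  # 2500.4 + 8.0 = 2508.4
```

The two criteria differ both in the likelihood used and in how model complexity is counted, so they are not directly comparable with each other.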


Method edf_conditional()

Effective degrees of freedom

Usage
SDE$edf_conditional()
Returns

Number of effective degrees of freedom (accounting for flexibility in non-parametric terms implied by smoothing)


Method simulate()

Simulate from SDE model

Usage
SDE$simulate(data, z0 = 0, posterior = FALSE)
Arguments
data

Data frame for input data. Should have at least one column 'time' for times of observations, and columns for covariates if necessary.

z0

Optional value for first observation of simulated time series. Default: 0.

posterior

Logical. If TRUE, the parameters used for the simulation are drawn from their posterior distribution using SDE$post_coeff, therefore accounting for uncertainty.

Returns

Input data frame with extra column for simulated time series
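
To show the input and output shape described above, here is a self-contained base R sketch that simulates an Ornstein-Uhlenbeck process at the times in a data frame, using its exact Gaussian transition density. The function simulate_ou and the parameterisation dZ = beta (mu - Z) dt + sigma dW are assumptions for illustration; the package may parameterise the process differently and also handles covariate-dependent parameters, which this sketch does not.

```r
set.seed(10)

# Hypothetical simulator: returns the input data frame with an extra column z
simulate_ou <- function(data, mu, beta, sigma, z0 = 0) {
  n <- nrow(data)
  dt <- diff(data$time)
  z <- numeric(n)
  z[1] <- z0
  for (i in seq_len(n - 1)) {
    # Exact conditional mean and variance of the OU transition over dt[i]
    m <- mu + exp(-beta * dt[i]) * (z[i] - mu)
    v <- sigma^2 / (2 * beta) * (1 - exp(-2 * beta * dt[i]))
    z[i + 1] <- rnorm(1, mean = m, sd = sqrt(v))
  }
  data$z <- z
  data
}

# Input data: only a time column is needed in this constant-parameter example
data <- data.frame(time = seq(0, 20, by = 0.1))
sim <- simulate_ou(data, mu = 3, beta = 1, sigma = 0.5, z0 = 0)
```

Because the transition density is exact, the simulation remains valid for irregular time intervals.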


Method plot_par()

Plot SDE parameters

Usage
SDE$plot_par(
  var,
  par_names = NULL,
  covs = NULL,
  n_post = 100,
  show_CI = "none",
  resp = TRUE,
  term = NULL
)
Arguments
var

Name of covariate as a function of which the parameters should be plotted

par_names

Optional vector for the names of SDE parameters to plot. If not specified, all parameters are plotted.

covs

Optional data frame with a single row and one column for each covariate, giving the values that should be used. If this is not specified, the mean value is used for numeric variables, and the first level for factor variables.

n_post

Number of posterior draws to plot. Default: 100.

show_CI

Should confidence bands be plotted rather than posterior draws? Can take values 'none' (default; no confidence bands), 'pointwise' (show pointwise confidence bands obtained with CI_pointwise), or 'simultaneous' (show simultaneous confidence bands obtained with CI_simultaneous).

resp

Logical (default: TRUE). Should the output be on the response scale? If FALSE, the output is on the linear predictor scale.

term

Name of model term as character string, e.g., "time", or "s(time)". Use $coeff_fe() and $coeff_re() methods to find names of model terms. This uses fairly naive substring matching, and may not work if one covariate's name is a substring of another one.

Returns

A ggplot object


Method ind_fixcoeff()

Indices of fixed coefficients in coeff_fe

Usage
SDE$ind_fixcoeff()

Method message()

Print SDE and parameter formulas

Usage
SDE$message()

Method print_par()

Print parameter values for t = 1

Usage
SDE$print_par()

Method print()

Print SDE object

Usage
SDE$print()

Method clone()

The objects of this class are cloneable with this method.

Usage
SDE$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


TheoMichelot/smoothSDE documentation built on Jan. 26, 2024, 6:41 p.m.