opt: Create 'opt' (optimisation) object
In episode: Estimation with Penalisation in Systems of Ordinary Differential Equations

Description Usage Arguments Details Value See Also Examples

This function creates an object of class opt, which holds data, weights, tuning parameters and control list for optimisation. This is basically the control panel for the loss function optimised in rodeo and aim.

1 2	opt(y, weights = NULL, nlambda = 25, lambda_min_ratio = NULL, lambda_decrease = TRUE, lambda = NULL, tol_l = 1e-05)

`y`	Numeric matrix (nx(d+1)). First column is time and remaining columns hold observations from different species, see details. Missing values (except in first column) are allowed and are marked with `NA` or `NaN`.
`weights`	Numeric matrix (nxd) of observation weights (optional).
`nlambda`	Number of lambda values.
`lambda_min_ratio`	Ratio between smallest and largest value of lambda (latter derived using data). If (n - s) * d > p, the default is 0.0001, else 0.01.
`lambda_decrease`	Logical indicating if automatically generated lambda sequence should be decreasing (default is `TRUE`). Consider switching to `FALSE` if nonzero initialisations are given to `rodeo.ode`.
`lambda`	A custom lambda sequence. If not `NULL`, `nlambda`, `lambda_min_ratio` and `decrease.lambda` are ignored. The optimisation reuses last optimal parameter value for next step, so `lambda` should preferably be monotonic.
`tol_l`	Positive numeric tolerance level used for stopping criterion (max-norm of gradients).

Data format

Whenever time (first column of y) decreases, the system restarts. Hence the data is assumed generated from s different contexts, where s - 1 is the number of decreases in the time vector.

Each context has its own initial condition and parameter vector specified through contexts (see reg for details).

Loss function

The loss function optimised in rodeo is:

RSS/(2 * (n - s)) + λ*∑_{parameter \ argument}λ_{factor}*∑_{j=1}^p v_jpen(param_j)

where λ belongs to the lambda-sequence and v is penalty_factor. Moreover, the residual sum of squares, RSS, is given as:

RSS = ∑^{n}_{i=1}||(y(t_i) - x(t_i, {x_0}_l, context_l * param))*√{w(t_i)}||_2^2

where param has been (internally) scaled with scales, and w(t_i) and y(t_i) refers to the i'th row of weights and y (with first column removed), respectively. The solution to the ODE system is the x()-function. The subscript 'l' refers to the context, i.e., the columns of contexts in reg-object and x0 in rodeo-functions (x0 is the initial state of the system at the first time point after a decrease in the time-vector). All products are Hadamard products.

Tuning parameter

The lambda sequence can either be fully customised through lambda or automatically generated. In the former case, a monotonic lambda-sequence is highly recommended. Throughout all optimisations, each optimisation step re-uses the old optimum, when sweeping through the lambda-sequence.

If lambda is NULL, an automatically generated sequence is used. A maximal value of lambda (the smallest at which 0 is a optimum in the rate parameter) is calculated and log-equi-distant sequence of length nlambda is generated, with the ratio between the smallest and largest lambda value being lambda_min_ratio. Note: when the opt-object is passed to rodeo.ode, one may want to initialise the optimisation at a non-zero parameter and run the optimisation on the reversed lambda sequence. This is indicated by setting decrease.lambda = FALSE. If, however, the opt-object is passed to aim, glmnet ultimately decides on the lambda sequence, and may cut it short.

Optimisation

A proximal-gradient type of optimisation is employed. The step length is denoted τ. The convergence criteria is Δ η < 10^3 * \max(|(Δ param \neq 0)| + |(Δ x0 \neq 0)|, 1) AND Δ loss / Δ η < tol_l, where

Δ η = ||Δ param||/tol_{param} + ||Δ x0|| / tol_{x_0}

An object with S3 class "opt".

aim, rodeo, rodeo.aim, rodeo.ode

# Generate some data
set.seed(123)
A <- matrix(c(1, 0, 1,
              1, 1, 0), byrow = TRUE, nrow = 2)
p <- plk(A)
x0 <- c(10, 4, 1)
theta <- matrix(c(0, -0.25,
                  0.75, 0,
                  0, -0.1), byrow = TRUE, nrow = 3)
Time <- seq(0, 1, by = .025)
y <- numsolve(p, Time, x0, theta)
y[, -1] <- y[, -1] + matrix(rnorm(prod(dim(y[, -1])), sd = .1), nrow = nrow(y))

# Minimally, you need to supply data:
op <- opt(y)

# More weight on early observations
w <- outer(1 / seq_len(nrow(y)), rep(1, length(x0)))
op <- opt(y, weights = w)

# Less weight on first coordinate
w <- outer(rep(1, nrow(y)), c(1, 2, 2))
op <- opt(y, weights = w)

# Adjust tuning parameter sequence
op <- opt(y, nlambda = 10, lambda_min_ratio = 0.05)