opt: Create 'opt' (optimisation) object

Description Usage Arguments Details Value See Also Examples

Description

This function creates an object of class opt, which holds data, weights, tuning parameters and control list for optimisation. This is basically the control panel for the loss function optimised in rodeo and aim.

Usage

1
2
opt(y, weights = NULL, nlambda = 25, lambda_min_ratio = NULL,
  lambda_decrease = TRUE, lambda = NULL, tol_l = 1e-05)

Arguments

y

Numeric matrix (nx(d+1)). First column is time and remaining columns hold observations from different species, see details. Missing values (except in first column) are allowed and are marked with NA or NaN.

weights

Numeric matrix (nxd) of observation weights (optional).

nlambda

Number of lambda values.

lambda_min_ratio

Ratio between smallest and largest value of lambda (latter derived using data). If (n - s) * d > p, the default is 0.0001, else 0.01.

lambda_decrease

Logical indicating if automatically generated lambda sequence should be decreasing (default is TRUE). Consider switching to FALSE if nonzero initialisations are given to rodeo.ode.

lambda

A custom lambda sequence. If not NULL, nlambda, lambda_min_ratio and decrease.lambda are ignored. The optimisation reuses last optimal parameter value for next step, so lambda should preferably be monotonic.

tol_l

Positive numeric tolerance level used for stopping criterion (max-norm of gradients).

Details

Data format

Whenever time (first column of y) decreases, the system restarts. Hence the data is assumed generated from s different contexts, where s - 1 is the number of decreases in the time vector.

Each context has its own initial condition and parameter vector specified through contexts (see reg for details).

Loss function

The loss function optimised in rodeo is:

RSS/(2 * (n - s)) + λ*∑_{parameter \ argument}λ_{factor}*∑_{j=1}^p v_jpen(param_j)

where λ belongs to the lambda-sequence and v is penalty_factor. Moreover, the residual sum of squares, RSS, is given as:

RSS = ∑^{n}_{i=1}||(y(t_i) - x(t_i, {x_0}_l, context_l * param))*√{w(t_i)}||_2^2

where param has been (internally) scaled with scales, and w(t_i) and y(t_i) refers to the i'th row of weights and y (with first column removed), respectively. The solution to the ODE system is the x()-function. The subscript 'l' refers to the context, i.e., the columns of contexts in reg-object and x0 in rodeo-functions (x0 is the initial state of the system at the first time point after a decrease in the time-vector). All products are Hadamard products.

Tuning parameter

The lambda sequence can either be fully customised through lambda or automatically generated. In the former case, a monotonic lambda-sequence is highly recommended. Throughout all optimisations, each optimisation step re-uses the old optimum, when sweeping through the lambda-sequence.

If lambda is NULL, an automatically generated sequence is used. A maximal value of lambda (the smallest at which 0 is a optimum in the rate parameter) is calculated and log-equi-distant sequence of length nlambda is generated, with the ratio between the smallest and largest lambda value being lambda_min_ratio. Note: when the opt-object is passed to rodeo.ode, one may want to initialise the optimisation at a non-zero parameter and run the optimisation on the reversed lambda sequence. This is indicated by setting decrease.lambda = FALSE. If, however, the opt-object is passed to aim, glmnet ultimately decides on the lambda sequence, and may cut it short.

Optimisation

A proximal-gradient type of optimisation is employed. The step length is denoted τ. The convergence criteria is Δ η < 10^3 * \max(|(Δ param \neq 0)| + |(Δ x0 \neq 0)|, 1) AND Δ loss / Δ η < tol_l, where

Δ η = ||Δ param||/tol_{param} + ||Δ x0|| / tol_{x_0}

Value

An object with S3 class "opt".

See Also

aim, rodeo, rodeo.aim, rodeo.ode

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
# Generate some data
set.seed(123)
A <- matrix(c(1, 0, 1,
              1, 1, 0), byrow = TRUE, nrow = 2)
p <- plk(A)
x0 <- c(10, 4, 1)
theta <- matrix(c(0, -0.25,
                  0.75, 0,
                  0, -0.1), byrow = TRUE, nrow = 3)
Time <- seq(0, 1, by = .025)
y <- numsolve(p, Time, x0, theta)
y[, -1] <- y[, -1] + matrix(rnorm(prod(dim(y[, -1])), sd = .1), nrow = nrow(y))

# Minimally, you need to supply data:
op <- opt(y)

# More weight on early observations
w <- outer(1 / seq_len(nrow(y)), rep(1, length(x0)))
op <- opt(y, weights = w)

# Less weight on first coordinate
w <- outer(rep(1, nrow(y)), c(1, 2, 2))
op <- opt(y, weights = w)

# Adjust tuning parameter sequence
op <- opt(y, nlambda = 10, lambda_min_ratio = 0.05)

episode documentation built on May 1, 2019, 11:17 p.m.