netcox: fit a (time-dependent) Cox model with structured variable...

View source: R/netcox.R

netcox    R Documentation

Fit a (time-dependent) Cox model with structured variable selection

Description

Fit a (time-dependent) Cox model via penalized maximum likelihood, where the penalty is a weighted sum of the infinity norms of (possibly overlapping) groups of coefficients. The regularization path is computed over a grid of values of the regularization parameter lambda.
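As a rough illustration of the penalty described above, the sketch below evaluates lambda times a weighted sum of infinity norms over overlapping groups. The helper name group_linf_penalty and the toy inputs are hypothetical and not part of the package; any additional scaling netcox applies internally is not shown.

```r
# Hypothetical helper (not part of netcox): lambda times a weighted sum of
# infinity norms over possibly overlapping groups of coefficients.
group_linf_penalty <- function(beta, groups,
                               weights = rep(1, length(groups)),
                               lambda = 1) {
  # Infinity norm of a group = largest absolute coefficient in that group
  linf <- vapply(groups, function(idx) max(abs(beta[idx])), numeric(1))
  lambda * sum(weights * linf)
}

# Toy coefficients and two overlapping groups (illustrative values only)
beta <- c(0.5, -2, 0, 1)
groups <- list(c(1, 2, 3), c(3, 4))
pen <- group_linf_penalty(beta, groups, lambda = 0.1)
print(pen)
```

Because the third coefficient belongs to both groups, it contributes to both norms, which is what "overlapping" means here.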

Usage

netcox(
  x,
  ID,
  time,
  time2,
  event,
  lambda,
  group,
  group_variable,
  penalty_weights,
  par_init,
  stepsize_init = 1,
  stepsize_shrink = 0.8,
  tol = 1e-05,
  maxit = 1000L,
  verbose = FALSE
)

Arguments

x

Predictor matrix with dimension nm * p, where n is the number of subjects, m is the maximum observation time, and p is the number of predictors. See Details.

ID

The ID of each subject. Each subject has one ID, so many rows in x may share the same ID.

time

Represents the start of each time interval.

time2

Represents the stop of each time interval.

event

Event indicator: event = 1 when the event occurs and event = 0 otherwise.

lambda

Sequence of regularization coefficients λ's.

group

G * G matrix describing the relationship between the groups of variables, where G represents the number of groups. Denote the i-th group of variables by g_i. The (i,j) entry is 1 if and only if i\neq j and g_i is a child group (subset) of g_j, and is 0 otherwise. See Examples and Details.

group_variable

p * G matrix describing the relationship between the groups and the variables. The (i,j) entry is 1 if and only if variable i is in group g_j, but not in any child group of g_j, and is 0 otherwise. See Examples and Details.

penalty_weights

Optional, vector of length G specifying the group-specific penalty weights. If not specified, the default value is \mathbf{1}_G. Modify with caution.

par_init

Optional, vector of initial values of the optimization algorithm. Default initial value is zero for all p variables.

stepsize_init

Initial value of the stepsize of the optimization algorithm. Default is 1.

stepsize_shrink

Factor in (0,1) by which the stepsize shrinks in the backtracking linesearch. Default is 0.8.

tol

Convergence criterion. Algorithm stops when the l_2 norm of the difference between two consecutive updates is smaller than tol.

maxit

Maximum number of iterations allowed.

verbose

Logical, whether progress is printed.
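To make the roles of stepsize_init, stepsize_shrink, tol and maxit concrete, here is a generic backtracking gradient descent on a smooth toy objective. It is only a sketch under stated assumptions: the function backtracking_gd is hypothetical, it ignores the penalty term, and it is not the optimizer implemented in netcox.

```r
# Generic backtracking gradient descent (illustration only; NOT the netcox
# optimizer). Argument names mirror the netcox arguments for readability.
backtracking_gd <- function(f, grad, par_init, stepsize_init = 1,
                            stepsize_shrink = 0.8, tol = 1e-5, maxit = 1000L) {
  par <- par_init
  for (it in seq_len(maxit)) {
    g <- grad(par)
    step <- stepsize_init
    # Shrink the step until a sufficient-decrease condition holds
    while (f(par - step * g) > f(par) - 0.5 * step * sum(g^2)) {
      step <- step * stepsize_shrink
    }
    par_new <- par - step * g
    # Stop when the l_2 norm of the update is smaller than tol
    if (sqrt(sum((par_new - par)^2)) < tol) {
      return(list(par = par_new, iterations = it))
    }
    par <- par_new
  }
  list(par = par, iterations = maxit)
}

# Toy objective: f(b) = 0.5 * ||b - target||^2, minimized at b = target
target <- c(1, -2)
f <- function(b) 0.5 * sum((b - target)^2)
grad <- function(b) b - target
res <- backtracking_gd(f, grad, par_init = c(0, 0))
```

A smaller stepsize_shrink reaches an acceptable step in fewer trials but may take overly short steps, while a looser tol or smaller maxit trades accuracy for speed.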

Details

The predictor matrix should be of dimension nm * p. Each row records the covariate values of one subject over one time interval, for example, the values on the day from time (Start) to time2 (Stop). An example dataset sim is provided; it has the same format as data produced by the R package PermAlgo.

The arguments group and group_variable specify the grouping structure in the same way as the R package spams; see http://thoth.inrialpes.fr/people/mairal/spams/doc-R/html/doc_spams006.html#sec27.
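The long (start, stop] layout described above can be sketched with a small hand-made data frame. All values are hypothetical; the column names Id, Start, Stop and Event follow those used for sim in the Examples, while X1 and X2 are made-up covariates.

```r
# Toy long-format data (hypothetical values): 3 subjects followed over
# unit-length intervals, with p = 2 time-dependent covariates X1 and X2.
toy <- data.frame(
  Id    = c(1, 1, 1, 2, 2, 3),
  Start = c(0, 1, 2, 0, 1, 0),
  Stop  = c(1, 2, 3, 1, 2, 1),
  Event = c(0, 0, 1, 0, 0, 1),
  X1    = c(0, 1, 1, 0, 0, 1),
  X2    = c(1, 1, 0, 0, 1, 0)
)

# One row per subject per interval; the predictor matrix keeps covariates only
x_toy <- as.matrix(toy[, c("X1", "X2")])

# Basic consistency checks on the layout
stopifnot(all(toy$Start < toy$Stop))          # each interval has positive length
events_per_id <- tapply(toy$Event, toy$Id, sum)
stopifnot(all(events_per_id <= 1))            # at most one event per subject
print(dim(x_toy))
```

In this toy data, subjects who have the event or leave the study early simply contribute fewer rows.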

In the Examples below, p = 9 and G = 5, and the group structure is:

g_1 = \{A_{1}, A_{2}, A_{1}B, A_{2}B\},

g_2 = \{B, A_{1}B, A_{2}B, C_{1}B, C_{2}B\},

g_3 = \{A_{1}B, A_{2}B\},

g_4 = \{C_1, C_2, C_{1}B, C_{2}B\},

g_5 = \{C_{1}B, C_{2}B\}.

Here g_3 is a subset of both g_1 and g_2, and g_5 is a subset of both g_2 and g_4.
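Rather than typing the two indicator matrices by hand, they can be derived from the group memberships listed above. The following is a minimal sketch that applies the definitions of group and group_variable given under Arguments; the object names vars, groups, grp and grp.var are chosen for illustration.

```r
# Variables in the column order used for x in the Examples
vars <- c("A1", "A2", "C1", "C2", "B", "A1B", "A2B", "C1B", "C2B")

# Full membership of each group, as listed in Details
groups <- list(
  g1 = c("A1", "A2", "A1B", "A2B"),
  g2 = c("B", "A1B", "A2B", "C1B", "C2B"),
  g3 = c("A1B", "A2B"),
  g4 = c("C1", "C2", "C1B", "C2B"),
  g5 = c("C1B", "C2B")
)
G <- length(groups)

# group: entry (i, j) is 1 iff i != j and g_i is a subset of g_j
grp <- matrix(0, G, G)
for (i in seq_len(G)) {
  for (j in seq_len(G)) {
    if (i != j && all(groups[[i]] %in% groups[[j]])) grp[i, j] <- 1
  }
}

# group_variable: entry (v, j) is 1 iff variable v is in g_j
# but not in any child group of g_j
grp.var <- matrix(0, length(vars), G, dimnames = list(vars, names(groups)))
for (j in seq_len(G)) {
  children <- which(grp[, j] == 1)
  own <- setdiff(groups[[j]], unlist(groups[children]))
  grp.var[own, j] <- 1
}
```

Each variable ends up in exactly one column of grp.var, namely its smallest enclosing group, which matches the hand-written matrices in the Examples.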

Value

A list with the following three elements.

lambdas

The user-specified regularization coefficients lambda sorted in decreasing order.

estimates

A matrix, with each column corresponding to the coefficient estimates at each λ in lambdas.

iterations

A vector of the number of iterations taken to converge at each λ in lambdas.
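The returned list can be summarized along the regularization path, for instance by counting the non-zero coefficients at each λ. The sketch below uses a stand-in object fit_like with the documented structure; its numbers are placeholders, not output from netcox.

```r
# Stand-in object with the documented structure (placeholder numbers,
# NOT output from netcox); in practice use the list returned by netcox().
fit_like <- list(
  lambdas = c(1, 0.1, 0.01),                      # decreasing order
  estimates = matrix(c(0,   0,   0,               # column 1: lambda = 1
                       0.4, 0,   0,               # column 2: lambda = 0.1
                       0.6, 0.2, -0.1),           # column 3: lambda = 0.01
                     nrow = 3),
  iterations = c(5, 20, 60)
)

# Number of non-zero coefficients at each lambda (one column per lambda)
n_selected <- colSums(fit_like$estimates != 0)
names(n_selected) <- fit_like$lambdas
print(n_selected)
```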

Examples

library(netcox)  # provides netcox() and the example dataset sim

# g3 in g1 -> grp_31 = 1
# g3 in g2 -> grp_32 = 1
# g5 in g2 -> grp_52 = 1
# g5 in g4 -> grp_54 = 1
grp <- matrix(c(0, 0, 0, 0, 0,
                0, 0, 0, 0, 0,
                1, 1, 0, 0, 0,
                0, 0, 0, 0, 0,
                0, 1, 0, 1, 0),
              ncol = 5, byrow = TRUE)

# Variable A1 is in g1 only: grp.var_11 = 1
# Variable A1B is in g1 and g3, but g3 is a child group of g1,
# so grp.var_63 = 1 while grp.var_61 = 0.
grp.var <- matrix(c(1, 0, 0, 0, 0, #A1
                    1, 0, 0, 0, 0, #A2
                    0, 0, 0, 1, 0, #C1
                    0, 0, 0, 1, 0, #C2
                    0, 1, 0, 0, 0, #B
                    0, 0, 1, 0, 0, #A1B
                    0, 0, 1, 0, 0, #A2B
                    0, 0, 0, 0, 1, #C1B
                    0, 0, 0, 0, 1  #C2B
                   ), ncol = 5, byrow = TRUE)
eta_g <- rep(1, 5)
x <- as.matrix(sim[, c("A1","A2","C1","C2","B",
                       "A1B","A2B","C1B","C2B")])
lam.seq <- 10^seq(0, -2, by = -0.2)

fit <- netcox(x = x,
              ID = sim$Id,
              time = sim$Start,
              time2 = sim$Stop,
              event = sim$Event,
              lambda = lam.seq,
              group = grp,
              group_variable = grp.var,
              penalty_weights = eta_g,
              tol = 1e-4,
              maxit = 1e3,
              verbose = FALSE)

netcox documentation built on March 7, 2023, 6:15 p.m.