Cosmonet: Survival prediction via screening-network Cox methods


View source: R/Cosmonet.R

Description

This function performs two main steps: (i) penalized Cox regression to select a subset of potential biomarkers using the training set T; (ii) validation of the selected biomarkers using the testing set D.

Usage

Cosmonet(
  k,
  x1,
  y1,
  x2,
  y2,
  screenVars,
  family = "Cox",
  penalty = "Net",
  Omega,
  alpha = 0.5,
  lambda = NULL,
  nlambda = 50,
  nfolds = 5,
  foldid = NULL,
  selOptLambda = c("min", "1se"),
  optCutpoint = c("minPValue", "median", "survCutpoint")
)

Arguments

k

times to loop through cross validation.

x1

input training matrix n1xp.

y1

response variable, y1 should be a two-column matrix with columns named time and status. The latter is a binary variable, with 1 indicating event, and 0 indicating right censored.

x2

input testing matrix n2xp. Each row represents an observation, while each column a variable.

y2

response variable, y2 should be a two-column data frame with columns named time and status. The latter is a binary variable, with 1 indicating event, and 0 indicating right censored. The rownames indicate the sample names ordered as the samples in the input testing matrix.

screenVars

screened variables obtained from BMD-, DAD-, or BMD+DAD-screening.

family

Cox proportional hazards regression model. Default is family = "Cox".

penalty

penalty type. Can choose Net, for which the Omega matrix is required. For penalty = Net, the penalty is defined as λ*{α*||β||_1+(1-α)/2*(β^{T}Lβ)}, where L is a Laplacian matrix calculated from Omega.

Omega

adjacency matrix with zero diagonal and non-negative off-diagonal entries, used to calculate the Laplacian matrix.

alpha

ratio between the L1 and Laplacian terms for Net. Default is alpha = 0.5.

lambda

a user-supplied decreasing sequence. If lambda = NULL, a random sequence of lambda is generated. For more details, see for instance the APML0 package.

nlambda

number of lambda values. Default is 50.

nfolds

number of folds performed for tuning optimal parameters over runs. Default is nfolds = 5.

foldid

an optional vector of values between 1 and nfolds specifying which fold each observation is in.

selOptLambda

a character string for selecting the lambda parameter. Options are "min", which uses the regularisation procedure implemented in the APML0 package, or "1se", which selects the lambda parameter within one standard error of the optimal value.

optCutpoint

a character string for choosing the optimal cutpoint on training set T based on prognostic index PI^{T}. Can choose minPValue, median and survCutpoint.
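
The Net penalty described under penalty, Omega and alpha can be illustrated with a few lines of base R. This is only a sketch: the helper name net_penalty is hypothetical, and it assumes the unnormalized Laplacian L = D - Omega (D being the diagonal degree matrix). The package may compute L differently, for example as a normalized Laplacian.

```r
# Hypothetical helper (not part of COSMONET): evaluates
# lambda * { alpha * ||beta||_1 + (1 - alpha) / 2 * (beta^T L beta) }.
# Assumption: L is the unnormalized Laplacian D - Omega.
net_penalty <- function(beta, Omega, lambda, alpha = 0.5) {
  stopifnot(nrow(Omega) == ncol(Omega), length(beta) == nrow(Omega))
  L <- diag(rowSums(Omega), nrow = nrow(Omega)) - Omega  # degree matrix minus adjacency
  l1 <- sum(abs(beta))                                   # L1 (lasso) term
  quad <- drop(t(beta) %*% L %*% beta)                   # network smoothness term
  lambda * (alpha * l1 + (1 - alpha) / 2 * quad)
}

# Toy network on three variables: edges 1-2 and 2-3
Omega <- matrix(c(0, 1, 0,
                  1, 0, 1,
                  0, 1, 0), 3, 3, byrow = TRUE)
beta <- c(1, -1, 0)

pen <- net_penalty(beta, Omega, lambda = 0.1, alpha = 0.5)
print(pen)  # 0.225
```

With alpha = 1 the quadratic term vanishes and only the L1 term remains, while smaller values of alpha give more weight to the smoothness of β over the network encoded in Omega.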

Details

The first step is the variable screening of the data, which aims to reduce the number of variables from a large to a moderate scale. To this purpose, we assume that only a small number of the p variables affects the survival outcome, and we filter out variables that are considered not relevant for the disease under investigation. We consider three different types of variable screening: biomedical screening (BMD-screening), data-driven screening (DAD-screening) and the fusion of biomedical and data-driven screening (BMD+DAD-screening).

The second step is the application of penalized methods using the subset of screened variables \{x_j,j \in \mathcal{I}\} (where \mathcal{I} depends on the type of screening performed) as the new feature space, to further remove non-significant variables from the model. To assess the stability of the survival prediction, we repeat the cross-validation k times and take as estimate the average value of λ and the corresponding α. These two parameters are used to fit the corresponding penalized Cox model and obtain the parameter estimate of β_\mathcal{I}.

Survival analysis is performed using Kaplan-Meier curves after dividing the patients into two risk groups (high-risk and low-risk) on the basis of the prognostic index PI computed with the gene signature. The p-value, used to test the null hypothesis that the survival curves are identical against the alternative that the two groups have different survival, is calculated using the log-rank test.
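
The risk-group step can be sketched with the survival package on simulated data. The sketch assumes that the prognostic index is the linear predictor PI = xβ and uses the median of PI as cutpoint; the coefficients, variable names and data below are made up for illustration and are not outputs of Cosmonet.

```r
library(survival)

set.seed(42)
n <- 100
x <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("g1", "g2", "g3")))
beta <- c(g1 = 0.8, g2 = -0.5, g3 = 0.3)  # hypothetical coefficients of a gene signature

# Prognostic index, assumed here to be the linear predictor x %*% beta
PI <- drop(x %*% beta)

# Simulated outcome: hazard increases with PI; status 1 = event, 0 = right censored
time <- rexp(n, rate = exp(PI))
status <- rbinom(n, 1, 0.8)

# Split patients into low- and high-risk groups at the median of PI
cutpoint <- median(PI)
risk <- factor(ifelse(PI > cutpoint, "high", "low"), levels = c("low", "high"))

# Kaplan-Meier curves per group and log-rank test of equal survival
km <- survfit(Surv(time, status) ~ risk)
lr <- survdiff(Surv(time, status) ~ risk)
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

print(km)
print(p_value)
```

In the workflow described above, the cutpoint is chosen on the training set T and then used to classify the patients of the testing set D; the sketch uses a single data set only to keep the example short.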

Value

An object of class COSMONET is returned, composed of:

fitTrain

see CosmonetTraining function outputs.

fitTest

see CosmonetTesting function outputs.
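
As a rough illustration of the input shapes described under Arguments, the following sketch builds simulated inputs in base R. Everything here is made up: the gene names, the chain-shaped network, and in particular the assumption that screenVars is a character vector of column names. The call itself is left commented out because it has not been checked against the package on such toy data.

```r
set.seed(1)
n1 <- 60; n2 <- 30; p <- 8
genes <- paste0("g", seq_len(p))

# Training and testing matrices: rows are observations, columns are variables
x1 <- matrix(rnorm(n1 * p), n1, p,
             dimnames = list(paste0("train", seq_len(n1)), genes))
x2 <- matrix(rnorm(n2 * p), n2, p,
             dimnames = list(paste0("test", seq_len(n2)), genes))

# y1: two-column matrix; y2: two-column data frame with sample names as rownames
y1 <- cbind(time = rexp(n1), status = rbinom(n1, 1, 0.7))
y2 <- data.frame(time = rexp(n2), status = rbinom(n2, 1, 0.7),
                 row.names = rownames(x2))

# Adjacency matrix with zero diagonal: a simple chain g1 - g2 - ... - g8
Omega <- matrix(0, p, p, dimnames = list(genes, genes))
Omega[cbind(1:(p - 1), 2:p)] <- 1
Omega <- Omega + t(Omega)

# Balanced fold assignment with values between 1 and nfolds
nfolds <- 5
foldid <- sample(rep(seq_len(nfolds), length.out = n1))

args <- list(k = 3, x1 = x1, y1 = y1, x2 = x2, y2 = y2,
             screenVars = genes[1:5],  # assumption: character vector of variable names
             family = "Cox", penalty = "Net", Omega = Omega, alpha = 0.5,
             nlambda = 50, nfolds = nfolds, foldid = foldid,
             selOptLambda = "min", optCutpoint = "median")

# Untested on this toy data; uncomment with the COSMONET package installed:
# fit <- do.call(COSMONET::Cosmonet, args)
```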

References

Iuliano, A., Occhipinti, A., Angelini, C., De Feis, I., and Liò, P. (2018). Combining Pathway Identification and Breast Cancer Survival Prediction via Screening-Network Methods.
Frontiers in Genetics, 9, 206.

Iuliano, A., Occhipinti, A., Angelini, C., De Feis, I., and Liò, P. (2016). Cancer markers selection using network-based Cox regression: A methodological and computational practice.
Frontiers in Physiology, 7, 208.


cosmonet-package/COSMONET documentation built on Dec. 24, 2021, 9:12 p.m.