getTrainTest: getTrainTest
In Coxmos: Cox MultiBlock Survival

View source: R/Coxmos_common_functions.R

Description

Splits input data (X and Y) into training and test sets for survival analysis while keeping the proportion of events balanced between the two sets. Supports a single split or multiple repeated splits for cross-validation, and accepts multiblock data in the X parameter.

Usage

getTrainTest(X, Y, p = 0.8, times = 1, seed = 123)

Arguments

`X`	Numeric matrix, data.frame or list of matrices or data.frames. Predictor variables (features). Rows are samples, columns are variables.
`Y`	Numeric matrix or data.frame. Response variables. The object must have two columns named "time" and "event". Accepted values for the event column are 0/1 or FALSE/TRUE, denoting censored and event observations respectively.
`p`	Numeric (0 < p < 1). Proportion of samples to allocate to the training set (default: 0.8).
`times`	Integer. Number of repeated train/test splits to generate (default: 1).
`seed`	Integer. Random seed for reproducibility (default: 123).

Details

This function uses caret::createDataPartition() to partition the data while preserving the proportion of events (e.g., deaths) in both training and test sets. It is designed for survival data where Y must contain an event column (binary: 1=event, 0=censored).
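The idea of partitioning within each event class can be illustrated with a minimal base-R sketch. This is not the Coxmos or caret implementation: the helper name stratified_split and the simulated data are hypothetical, and only single-block X (a matrix or data.frame) is handled.

```r
# Hypothetical helper: stratified train/test split on the event indicator.
# Illustrative base-R sketch only; not the Coxmos or caret implementation.
stratified_split <- function(X, Y, p = 0.8, seed = 123) {
  stopifnot(nrow(X) == nrow(Y), p > 0, p < 1)
  set.seed(seed)
  event <- as.integer(Y[, "event"])
  # Sample a fraction p of the row indices separately within each event class
  train_idx <- unlist(lapply(split(seq_len(nrow(Y)), event), function(idx) {
    idx[sample.int(length(idx), size = round(p * length(idx)))]
  }))
  train_idx <- sort(unname(train_idx))
  test_idx <- setdiff(seq_len(nrow(Y)), train_idx)
  list(
    X_train = X[train_idx, , drop = FALSE],
    Y_train = Y[train_idx, , drop = FALSE],
    X_test  = X[test_idx, , drop = FALSE],
    Y_test  = Y[test_idx, , drop = FALSE]
  )
}

# Simulated data (assumption): 100 samples, 5 features, 30 events
set.seed(1)
X <- matrix(rnorm(500), nrow = 100, dimnames = list(NULL, paste0("var", 1:5)))
Y <- cbind(time = rexp(100), event = rep(c(1, 0), times = c(30, 70)))

res <- stratified_split(X, Y, p = 0.8)
mean(res$Y_train[, "event"])  # event rate in the training set
mean(res$Y_test[, "event"])   # event rate in the test set
```

With 30 events among 100 samples and p = 0.8, the sketch places 24 of 80 training samples and 6 of 20 test samples in the event class, so both sets keep the 30% event rate of the full data.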

Value

If times = 1: A list with:
X_train: Training features.
Y_train: Training survival data.
X_test: Test features.
Y_test: Test survival data.
If times > 1: A named list of length times, each element containing the above structure.
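As a sketch of how the returned structure might be inspected, the snippet below summarises sample counts and event rates for one split. The helper summarise_split is hypothetical, and the mock object is built by hand to mimic the documented four-element list under the assumption of single-block X; with real data the helper would be applied to the output of getTrainTest() instead.

```r
# Hypothetical helper: summarise one split with the documented structure
# (X_train, Y_train, X_test, Y_test).
summarise_split <- function(split) {
  data.frame(
    set        = c("train", "test"),
    n_samples  = c(nrow(split$X_train), nrow(split$X_test)),
    event_rate = c(mean(split$Y_train[, "event"]),
                   mean(split$Y_test[, "event"]))
  )
}

# Mock object mimicking the documented structure for times = 1
# (assumption: single-block X, so X_train and X_test are matrices)
mock_split <- list(
  X_train = matrix(0, nrow = 8, ncol = 3),
  Y_train = cbind(time = 1:8, event = c(1, 1, 0, 0, 0, 0, 0, 0)),
  X_test  = matrix(0, nrow = 4, ncol = 3),
  Y_test  = cbind(time = 1:4, event = c(1, 0, 0, 0))
)
summarise_split(mock_split)

# For repeated splits (times > 1), apply the helper to each element by position
mock_repeats <- list(mock_split, mock_split)
lapply(mock_repeats, summarise_split)
```

Indexing the repeated splits by position avoids relying on the element names of the returned list, which this page does not specify.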

Author(s)

Pedro Salguero Garcia. Maintainer: pedsalga@upv.edu.es

Examples

# Load the package and the example proteomic data
library(Coxmos)
data(X_proteomic, Y_proteomic)

# Single split (80% training, 20% test)
lst <- getTrainTest(X_proteomic, Y_proteomic, p = 0.8)

# Repeated splits (3 splits, 70% training each)
lst_repeats <- getTrainTest(X_proteomic, Y_proteomic, p = 0.7, times = 3)

Coxmos documentation built on June 8, 2025, 10:30 a.m.