load_and_prepare_data_pro | R Documentation |
Loads a CSV file containing patient data, extracts features, outcome, and time columns, and prepares them into a format suitable for survival analysis models. Handles basic data cleaning like NA removal and column type conversion.
load_and_prepare_data_pro(
data_path,
outcome_col_name,
time_col_name,
time_unit = c("day", "month", "year")
)
data_path | A character string, the file path to the input CSV data. The first column is assumed to be a sample ID. |
outcome_col_name | A character string, the name of the column containing event status (0 for censored, 1 for event). |
time_col_name | A character string, the name of the column containing event or censoring time. |
time_unit | A character string, the unit of time in the time_col_name column; one of "day", "month", or "year". |
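The vector default for time_unit in the usage above follows the common R idiom in which match.arg() resolves the argument to a single allowed value. The sketch below illustrates how such an argument could be resolved and how times might then be converted to days. The helper name to_days and the conversion factors (30.44 days per month, 365.25 days per year) are assumptions made for illustration; this page does not state which factors the function actually uses.

```r
# Hypothetical helper, not part of the documented package.
to_days <- function(time, time_unit = c("day", "month", "year")) {
  # match.arg() returns "day" when the argument is left at its vector default
  # and raises an error for any value outside the allowed set.
  time_unit <- match.arg(time_unit)
  # Assumed conversion factors (average month and year lengths in days).
  days_per_unit <- c(day = 1, month = 30.44, year = 365.25)
  as.numeric(time) * days_per_unit[[time_unit]]
}

to_days(c(1, 12), "month")
to_days(2, "year")
to_days(5)  # unit left at its default, so treated as days
```

With this idiom, an unsupported unit such as "week" fails early with an error instead of silently producing wrong times.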
A list containing:
X: A data frame of features (all columns except ID, outcome, and time).
Y_surv: A survival::Surv object created from time and outcome.
sample_ids: A vector of sample IDs (the first column of the input data).
outcome_numeric: A numeric vector of outcome status.
time_numeric: A numeric vector of time, converted to days.
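To make the shape of the returned list concrete, here is a minimal sketch of how such a list could be assembled from a data frame that has already been read in. The function name prepare_sketch, the use of complete.cases() for NA removal, and the toy column names are all assumptions for illustration; the internals of load_and_prepare_data_pro are not shown on this page. Times are assumed to be in days already, so no unit conversion is applied.

```r
# Hypothetical sketch, not the package's implementation.
prepare_sketch <- function(df, outcome_col_name, time_col_name) {
  # Assumption: NA removal drops every row with any missing value.
  df <- df[stats::complete.cases(df), , drop = FALSE]
  outcome <- as.numeric(df[[outcome_col_name]])
  time <- as.numeric(df[[time_col_name]])
  # Features: every column except the ID (first column), outcome, and time.
  drop_cols <- c(names(df)[1], outcome_col_name, time_col_name)
  X <- df[, setdiff(names(df), drop_cols), drop = FALSE]
  result <- list(
    X = X,
    sample_ids = df[[1]],
    outcome_numeric = outcome,
    time_numeric = time
  )
  # Build the Surv object only when the survival package is installed.
  if (requireNamespace("survival", quietly = TRUE)) {
    result$Y_surv <- survival::Surv(time = time, event = outcome)
  }
  result
}

toy <- data.frame(
  ID = c("P1", "P2", "P3"),
  FeatureA = c(0.5, NA, 1.2),
  Outcome_Status = c(1, 0, 0),
  Followup_Days = c(120, 300, 450),
  stringsAsFactors = FALSE
)
res <- prepare_sketch(toy, "Outcome_Status", "Followup_Days")
names(res$X)    # only the feature column remains
res$sample_ids  # the row containing NA has been dropped
```

Keeping sample_ids, outcome_numeric, and time_numeric alongside X means every element stays row-aligned after NA removal, which is what downstream survival models need.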
# Create a dummy CSV file for demonstration
set.seed(1)  # make the simulated data reproducible
temp_csv_path <- tempfile(fileext = ".csv")
dummy_data <- data.frame(
  ID = paste0("Patient", 1:50),
  FeatureA = rnorm(50),
  FeatureB = runif(50, 0, 100),
  CategoricalFeature = sample(c("A", "B", "C"), 50, replace = TRUE),
  Outcome_Status = sample(c(0, 1), 50, replace = TRUE),
  Followup_Time_Months = runif(50, 10, 60)
)
write.csv(dummy_data, temp_csv_path, row.names = FALSE)
# Load and prepare data
prepared_data <- load_and_prepare_data_pro(
  data_path = temp_csv_path,
  outcome_col_name = "Outcome_Status",
  time_col_name = "Followup_Time_Months",
  time_unit = "month"
)
# Check prepared data structure
str(prepared_data$X)
print(prepared_data$Y_surv[1:5])
# Clean up dummy file
unlink(temp_csv_path)