ppfpca: Main function implementing the PPFPCA algorithm


View source: R/PPFPCA-package.R

Description

This function builds an algorithm to identify the occurrence of an event outcome from the trajectories of several predictors.

Arguments

datadir_org

a path to the directory where the original data files are saved. If NULL is specified (default), the directory "./data_org" will be used.
Note that the following 6 original data files must be saved under exactly these file names:
1. TrainSurv.csv: baseline survival data for training (labeled); 1st column: patient id, 2nd column: event indicator (1=event, 0=censoring), 3rd column: event time, 4th column: follow-up time, 5th column onward: covariates.
2. ValidSurv.csv: baseline survival data for validation; 1st column: patient id, 2nd column: event indicator (1=event, 0=censoring), 3rd column: event time, 4th column: follow-up time, 5th column onward: covariates.
3. TrainCode.csv: 1st column: patient id, 2nd column: follow-up time, 3rd column: time label (month), 4th column onward: predictors.
4. ValidCode.csv: 1st column: patient id, 2nd column: follow-up time, 3rd column: time label (month), 4th column onward: predictors.
5. TrainN.csv: 1st column: patient id, 2nd column onward: total number of counts for each predictor.
6. ValidN.csv: 1st column: patient id, 2nd column onward: total number of counts for each predictor.
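Because the six file names are fixed, it can help to confirm they are all present before running the function. The following is a minimal sketch in base R; the helper `missing_input_files` is hypothetical and not part of the PPFPCA package.

```r
# Hypothetical helper (not part of PPFPCA): list required input files
# that are absent from the given directory.
required_files <- c("TrainSurv.csv", "ValidSurv.csv", "TrainCode.csv",
                    "ValidCode.csv", "TrainN.csv", "ValidN.csv")

missing_input_files <- function(datadir_org = "./data_org") {
  present <- file.exists(file.path(datadir_org, required_files))
  required_files[!present]
}

# Example: a temporary directory that holds only two of the six files.
tmp <- tempfile("data_org_")
dir.create(tmp)
invisible(file.create(file.path(tmp, c("TrainSurv.csv", "TrainN.csv"))))
missing_files <- missing_input_files(tmp)
print(missing_files)
```

An empty character vector means all six files were found.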

datadir_base_func

a path to the directory where the base function data will be saved. If NULL is specified (default), a directory named "./data_base_func" will be automatically created under the current working directory.

outdir

a path to the directory where output files will be saved. If NULL is specified (default), a directory named "./outdir" will be automatically created under the current working directory.

read_base_func

a logical indicating whether to read base function data files you have already created (TRUE) or to create the base function data (FALSE). Default is TRUE.

n.grid

an integer giving the number of grid points used in estimating the covariance function g. Default is 401.

PPIC_K

a logical indicating how the number of principal components K is chosen in the (hidden) PP_FPCA_CPP_Parallel function. If TRUE, the Pseudo-Poisson Information Criterion is used (K.select="PPIC"); if FALSE, the proportion-of-variation criterion is used (K.select="PropVar"). Default is FALSE.

cov_group

a vector of consecutive integers describing the grouping of the covariates only. When NULL is specified (default), each covariate is placed in its own group.

propvar

the proportion of variation used to select the number of FPCs. Default is 0.85.
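To illustrate what a proportion-of-variation rule typically does, here is a generic sketch: it picks the smallest K whose leading eigenvalues account for at least `propvar` of the total. The function `select_k` and the eigenvalues are illustrative assumptions; the exact rule used inside the package is not documented here and may differ.

```r
# Generic proportion-of-variation rule (an assumption, not the package's code):
# return the smallest K whose cumulative share of variation reaches propvar.
select_k <- function(eigenvalues, propvar = 0.85) {
  cum_prop <- cumsum(eigenvalues) / sum(eigenvalues)
  which(cum_prop >= propvar)[1]
}

# Made-up eigenvalues; cumulative shares are 0.50, 0.80, 0.90, 0.95, 1.00.
ev <- c(5, 3, 1, 0.5, 0.5)
k_default <- select_k(ev)          # uses propvar = 0.85
k_strict  <- select_k(ev, 0.99)
print(c(k_default, k_strict))
```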

n_core

an integer specifying the number of cores used for parallel computing. Default is 4.

StdFollowUp

a logical indicating whether to use standardized follow-up time. Standardized follow-up time is calculated as TrainCode$month/TrainCode$analysisfu.
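The standardization above is a simple element-wise ratio. A small sketch with made-up data follows; the column names `month` and `analysisfu` come from the formula above, while the `id` column name and all values are assumptions for illustration.

```r
# Toy stand-in for TrainCode.csv (values are made up).
TrainCode <- data.frame(
  id         = c(1, 1, 2),
  analysisfu = c(24, 24, 12),  # follow-up time per patient
  month      = c(6, 18, 3)     # time label of each record
)

# Standardized follow-up time: time label as a fraction of follow-up.
std_time <- TrainCode$month / TrainCode$analysisfu
print(std_time)
```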

thresh

a threshold on the proportion of patients with no occurrence of a code. Default is 0.7, which means that for codes where more than 70% of patients have no occurrences, only the first code time is used.
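As a rough illustration of the threshold, the sketch below flags codes for which more than `thresh` of patients have a zero total count. The data and code names are made up, and this only shows the flagging step as read from the description; how the package then handles such codes is not shown.

```r
# Toy stand-in for TrainN.csv: total counts per patient and code (made up).
TrainN <- data.frame(
  id    = 1:5,
  codeA = c(0, 0, 0, 0, 2),  # 80% of patients have no codeA
  codeB = c(1, 0, 3, 2, 0)   # 40% of patients have no codeB
)

thresh <- 0.7

# Proportion of patients with zero counts, per code.
prop_zero <- colMeans(TrainN[, -1, drop = FALSE] == 0)

# Codes exceeding the threshold would be treated as sparse.
sparse_codes <- names(prop_zero)[prop_zero > thresh]
print(sparse_codes)
```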

PCAthresh

a threshold value for PCA. Default is 0.9.

seed

random seed used for the sampling. Default is 1234.

seed2

random seed used for the sampling. Default is 100.

Details

For more details, please contact the package maintainer.

Value

A list with components:

bgbbest_FromChengInit_BFGS

Details of the fitted model

Cstat_BrierSc_ChengInit_BFGS

Performance of the derived algorithm (C-statistics, etc.)

group

A vector of consecutive integers describing the grouping of the coefficients

Several output files will be saved in the outdir directory.
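A call might look like the sketch below. The argument names and default values are those listed in the Arguments section; the value for `StdFollowUp` is an arbitrary choice because no default is documented, and it is an assumption that `ppfpca` is exported and can be called as `PPFPCA::ppfpca`. The call is guarded so that it runs only when the package is installed and the input directory exists.

```r
# Argument values mirror the documented defaults, except StdFollowUp,
# whose default is not documented (TRUE here is an arbitrary choice).
args <- list(
  datadir_org       = NULL,
  datadir_base_func = NULL,
  outdir            = NULL,
  read_base_func    = TRUE,
  n.grid            = 401,
  PPIC_K            = FALSE,
  cov_group         = NULL,
  propvar           = 0.85,
  n_core            = 4,
  StdFollowUp       = TRUE,
  thresh            = 0.7,
  PCAthresh         = 0.9,
  seed              = 1234,
  seed2             = 100
)

# Assumes PPFPCA is installed and the six input files are in ./data_org.
if (requireNamespace("PPFPCA", quietly = TRUE) && dir.exists("./data_org")) {
  fit <- do.call(PPFPCA::ppfpca, args)
  str(fit$bgbbest_FromChengInit_BFGS)    # details of the fitted model
  str(fit$Cstat_BrierSc_ChengInit_BFGS)  # performance measures
  print(fit$group)                       # grouping of coefficients
}
```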

References

Wu, S., Müller, H., & Zhang, Z. (2013). Functional data analysis for point processes with rare events. Statistica Sinica, 23(1), 1-23.


celehs/PPFPCA documentation built on March 10, 2020, 10:16 a.m.