runPlp | R Documentation |
This provides a general framework for training patient-level prediction models. The user can select from various default feature selection methods or incorporate their own. The user can also select from a range of default classifiers or incorporate their own. There are three types of evaluation for the model: patient (randomly splits people into train/validation sets), year (randomly splits data into train/validation sets based on index year: older in training, newer in validation), or both (same as year splitting, but checks that no patients overlap between the training set and the validation set; any overlaps are removed from the validation set).
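The three evaluation types described above can be illustrated with a small base R sketch. This is not the package's implementation: the toy cohort, the column names (`subjectId`, `indexYear`), the 25% validation fraction and the cut-off year are all assumptions made for illustration only.

```r
# Toy cohort: one row per index event. All names and values are hypothetical.
set.seed(123)
cohort <- data.frame(
  subjectId = c(1, 1, 2, 2, 3, 4, 5, 6, 7, 8),
  indexYear = c(2010, 2015, 2011, 2012, 2016, 2010, 2013, 2016, 2012, 2015)
)

# "patient": randomly assign 25% of people (not rows) to validation
subjects <- unique(cohort$subjectId)
validationSubjects <- sample(subjects, size = round(0.25 * length(subjects)))
patientSplit <- ifelse(cohort$subjectId %in% validationSubjects,
                       "validation", "train")

# "year": older index years go to training, newer ones to validation
cutoffYear <- 2014  # arbitrary cut-off chosen for this sketch
yearSplit <- ifelse(cohort$indexYear > cutoffYear, "validation", "train")

# "both": start from the year split, then drop validation rows that belong
# to patients who also appear in the training set
trainSubjects <- unique(cohort$subjectId[yearSplit == "train"])
overlap <- yearSplit == "validation" & cohort$subjectId %in% trainSubjects
bothSplit <- yearSplit
bothSplit[overlap] <- NA  # NA marks rows removed from the validation set

print(cbind(cohort, patientSplit, yearSplit, bothSplit))
```

In this toy data, subject 1 has index events on both sides of the cut-off, so the later event is dropped under "both" but kept under "year".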
runPlp(
  plpData,
  outcomeId = plpData$metaData$databaseDetails$outcomeIds[1],
  analysisId = paste(Sys.Date(), outcomeId, sep = "-"),
  analysisName = "Study details",
  populationSettings = createStudyPopulationSettings(),
  splitSettings = createDefaultSplitSetting(type = "stratified", testFraction = 0.25,
    trainFraction = 0.75, splitSeed = 123, nfold = 3),
  sampleSettings = createSampleSettings(type = "none"),
  featureEngineeringSettings = createFeatureEngineeringSettings(type = "none"),
  preprocessSettings = createPreprocessSettings(minFraction = 0.001, normalize = TRUE),
  modelSettings = setLassoLogisticRegression(),
  logSettings = createLogSettings(verbosity = "DEBUG", timeStamp = TRUE,
    logName = "runPlp Log"),
  executeSettings = createDefaultExecuteSettings(),
  saveDirectory = NULL
)
plpData |
An object of type |
outcomeId |
(integer) The ID of the outcome. |
analysisId |
(integer) Identifier for the analysis. It is used to create, e.g., the result folder. Default is the current date combined with the outcomeId, i.e. paste(Sys.Date(), outcomeId, sep = "-"). |
analysisName |
(character) Name for the analysis |
populationSettings |
An object of type |
splitSettings |
An object of type |
sampleSettings |
An object of type |
featureEngineeringSettings |
An object of |
preprocessSettings |
An object of |
modelSettings |
An object of class
|
logSettings |
An object of |
executeSettings |
An object of |
saveDirectory |
The path to the directory where the results will be saved (if NULL uses working directory) |
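As a rough illustration of how analysisId and saveDirectory interact, the base R sketch below evaluates the default analysisId expression from the usage section and builds the path where the log would be expected. The folder layout (a subfolder named after analysisId containing plpLog.txt) is an assumption inferred from the log path mentioned in the example's comments, not a guarantee of the package's behaviour.

```r
outcomeId <- 3

# Same expression as the default value of analysisId in the usage section
analysisId <- paste(Sys.Date(), outcomeId, sep = "-")

# Hypothetical save location, mirroring the one used in the example
saveDirectory <- file.path(tempdir(), "runPlp")

# Assumed layout: <saveDirectory>/<analysisId>/plpLog.txt
resultFolder <- file.path(saveDirectory, analysisId)
logFile <- file.path(resultFolder, "plpLog.txt")

print(analysisId)
print(logFile)
```

Because the default identifier contains the date, rerunning on a different day would point at a different folder under this assumption; passing an explicit analysisId keeps the location stable.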
This function takes as input the plpData extracted from an OMOP CDM database and follows the specified settings to develop and internally validate a model for the specified outcomeId.
A plpResults object containing the following:
model The developed model of class plpModel
executionSummary A list containing the hardware details, R package details and execution time
performanceEvaluation Various internal performance metrics in sparse format
prediction The plpData cohort table with the predicted risks added as a column (named value)
covariateSummary A characterization of the features for patients with and without the outcome during the time at risk
analysisRef A list with details about the analysis
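The elements listed above can be inspected once a model has been developed. The sketch below reuses only the calls shown elsewhere on this page and is guarded so that it does nothing when the PatientLevelPrediction package is not installed. The element names come from the list above; the folder name "runPlpValue" is a hypothetical choice, and the exact structure of each element may differ between package versions.

```r
if (requireNamespace("PatientLevelPrediction", quietly = TRUE)) {
  library(PatientLevelPrediction)

  # Simulate data and develop a model with the default settings
  data("simulationProfile")
  plpData <- simulatePlpData(simulationProfile, n = 1000)
  saveLoc <- file.path(tempdir(), "runPlpValue")  # hypothetical folder name
  results <- runPlp(plpData = plpData, outcomeId = 3, analysisId = 1,
                    saveDirectory = saveLoc)

  # Top-level elements of the returned object
  print(names(results))

  # The developed model is documented to be of class plpModel
  print(class(results$model))

  # Predicted risks are documented to be stored in the column named "value"
  print(summary(results$prediction$value))

  # Clean up
  unlink(saveLoc, recursive = TRUE)
}
```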
# simulate some data
data('simulationProfile')
plpData <- simulatePlpData(simulationProfile, n = 1000)
# develop a model with the default settings
saveLoc <- file.path(tempdir(), "runPlp")
results <- runPlp(plpData = plpData, outcomeId = 3, analysisId = 1,
saveDirectory = saveLoc)
# to check the results you can view the log file at saveLoc/1/plpLog.txt
# or view with shiny app using viewPlp(results)
# clean up
unlink(saveLoc, recursive = TRUE)