prepareData | R Documentation |
Input data can be a matrix, a MultiAssayExperiment, or a DataFrame. This function prepares a DataFrame of features and a vector of outcomes from it, and helps to exclude nuisance features, such as dates or unique sample identifiers, from subsequent modelling.
## S4 method for signature 'matrix'
prepareData(measurements, outcome, ...)
## S4 method for signature 'data.frame'
prepareData(measurements, outcome, ...)
## S4 method for signature 'DataFrame'
prepareData(
  measurements,
  outcome,
  useFeatures = NULL,
  maxMissingProp = 0,
  maxSimilarity = 1,
  topNvariance = NULL
)
## S4 method for signature 'MultiAssayExperiment'
prepareData(measurements, outcomeColumns = NULL, useFeatures = NULL, ...)
## S4 method for signature 'list'
prepareData(measurements, outcome = NULL, useFeatures = NULL, ...)
measurements |
Either a |
... |
Variables not used by the |
outcome |
Either a factor vector of classes, a |
useFeatures |
Default: |
maxMissingProp |
Default: 0.0. A proportion less than 1 which is the maximum tolerated proportion of missingness for a feature to be retained for modelling. |
maxSimilarity |
Default: 1. A number between 0 and 1 which is the maximum similarity permitted between a pair of variables for both to be kept in the data set. For numerical variables, the Pearson correlation is used and for categorical variables, the Chi-squared test p-value is used. For a pair that is too similar, the second variable will be excluded from the data set. |
topNvariance |
Default: NULL. If |
outcomeColumns |
If |
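The effect of maxMissingProp and maxSimilarity on numeric features can be sketched in base R. This is only an illustration of the filtering rules described above, not the package's implementation. The toy data and the helpers dropMissing and dropSimilar are hypothetical, and two details are assumptions: that a feature is kept when its missing proportion is at most the threshold, and that similarity is measured by the absolute Pearson correlation.

```r
# Hypothetical toy data: 6 samples (rows) by 4 numeric features (columns).
features <- data.frame(
  geneA = c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
  geneB = c(2.1, 3.9, 6.2, 8.0, 9.9, 12.1),  # almost a multiple of geneA
  geneC = c(5.0, NA, 1.0, NA, 4.0, 2.0),     # one third of values missing
  geneD = c(3.0, 1.0, 4.0, 1.0, 5.0, 9.0)
)

# Illustrative helper (not a package function): keep features whose
# proportion of missing values is at most maxMissingProp.
dropMissing <- function(data, maxMissingProp = 0) {
  missingProp <- colMeans(is.na(data))
  data[, missingProp <= maxMissingProp, drop = FALSE]
}

# Illustrative helper (not a package function): scan features left to right
# and drop a feature if its absolute Pearson correlation with any earlier,
# retained feature exceeds maxSimilarity. Using the absolute value is an
# assumption made for this sketch.
dropSimilar <- function(data, maxSimilarity = 1) {
  correlations <- abs(cor(data, use = "pairwise.complete.obs"))
  keep <- rep(TRUE, ncol(data))
  for (j in seq_len(ncol(data))[-1]) {
    earlier <- which(keep[seq_len(j - 1)])
    if (any(correlations[earlier, j] > maxSimilarity, na.rm = TRUE)) {
      keep[j] <- FALSE
    }
  }
  data[, keep, drop = FALSE]
}

filtered <- dropMissing(features, maxMissingProp = 0.2)  # removes geneC
filtered <- dropSimilar(filtered, maxSimilarity = 0.9)   # removes geneB
colnames(filtered)
```

With the default values (maxMissingProp = 0 and maxSimilarity = 1), the first helper removes any feature with at least one missing value and the second removes nothing, because a correlation cannot exceed 1.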
A list of length two. The first element is a DataFrame of features and the second element is the outcomes to use for modelling.
Dario Strbenac