View source: R/tidyDiscreteBinaryMI.R
calculateDiscretePresentValuesMI | R Documentation |
Calculates the mutual information between a discrete outcome and the presence of a feature, for features that are not present in every sample.
calculateDiscretePresentValuesMI(df, discreteVars, sampleVars, sampleCount = NULL, sampleCountDf = NULL, ...)
df |
- the input dataframe; may be grouped, in which case each group is treated as a different variable (feature) |
discreteVars |
- the column(s) holding the categorical value (X), e.g. the outcome, quoted by vars(...) |
sampleVars |
- the column(s) of the sample identifier |
sampleCount |
- (optional) an integer giving the count of all samples per outcome (discreteVars) |
sampleCountDf |
- (optional) a dataframe containing the columns for the df grouping (features) and discreteVars (outcomes), plus N and N_x columns with the expected counts; see expectSamplesByOutcome(...) |
This is relevant for sparse data sets with many features, such as NLP terms, where a feature (term) is only recorded for the samples in which it is present.
A dataframe containing the distinct values of the groups of df and, for each group, a mutual information column (I). If df was not grouped, this will be a single entry.
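To illustrate the idea, the following is a minimal base R sketch of a presence/absence mutual information calculation. It is not this package's implementation: the toy data, the helper name mi_present and the n_x vector are all hypothetical, and it assumes that absent counts can be derived by subtracting the present counts from the number of samples per outcome. It uses natural logarithms, so the result is in nats; the package may use a different base.

```r
# Hypothetical toy data: one row per (term, document) in which the term occurs.
present <- data.frame(
  term    = c("a", "a", "a", "b", "b"),
  doc     = c(1, 2, 3, 1, 4),
  outcome = c("pos", "pos", "neg", "pos", "neg")
)

# Assumed number of samples per outcome (documents 1, 2 are "pos"; 3, 4 are "neg").
n_x <- c(pos = 2, neg = 2)

# Illustrative helper (not part of the package): mutual information between
# presence of one feature and the outcome, given the outcomes of the samples
# in which the feature is present.
mi_present <- function(outcomes, n_x) {
  pres <- as.numeric(table(factor(outcomes, levels = names(n_x))))
  # Absent counts are implied: samples per outcome minus samples with the feature.
  counts <- rbind(present = pres, absent = as.numeric(n_x) - pres)
  p_xy <- counts / sum(counts)
  expected <- outer(rowSums(p_xy), colSums(p_xy))
  nz <- p_xy > 0  # cells with zero probability contribute nothing
  sum(p_xy[nz] * log(p_xy[nz] / expected[nz]))
}

# One value per feature, analogous to one row per group of df.
mi <- sapply(split(present$outcome, present$term), mi_present, n_x = n_x)
print(mi)
```

In this toy data, term b occurs equally often in both outcomes and scores zero, while term a, which occurs in every "pos" document, scores higher. A feature present in every sample would also score zero, since its presence carries no information about the outcome.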