View source: R/tidyDiscreteBinaryMI.R
calculateDiscreteAbsentValuesMI | R Documentation |
Calculates the mutual information between the outcome and the absence of a feature, for features that are not present in every sample.
calculateDiscreteAbsentValuesMI( df, discreteVars, sampleVars, sampleCount = NULL, sampleCountDf = NULL, ... )
df |
- a dataframe; may be grouped, in which case each group is interpreted as a different type of variable (feature) |
discreteVars |
- the column(s) of the categorical value (X) quoted by vars(...) (e.g. outcome) |
sampleVars |
- the column(s) of the sample identifier |
sampleCount |
- (optional) an integer containing the count of all samples per outcome (discreteVars) |
sampleCountDf |
- (optional) a dataframe containing columns for the df grouping (features) and discreteVars (outcomes), plus N and N_x columns with expected counts; see expectSamplesByOutcome(...) |
This is relevant for sparse data sets with many features, such as NLP terms, where a term used as a feature may not be present in a given document, and this absence may be asymmetrically distributed between different classes.
A dataframe containing the distinct values of the groups of df and, for each group, a mutual information column (I). If df was not grouped, this will be a single entry.
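To illustrate the idea of absence carrying information, the following is a minimal base R sketch, not the package's implementation. It assumes a sparse term table in which an absent term simply has no row, and it computes the standard mutual information (in nats) between a present/absent indicator and the outcome. The helper presenceMI and the toy data (doc, outcome, the "fever" term) are hypothetical names introduced only for this example.

```r
# Hypothetical toy data: one row per (document, term) occurrence.
# A document in which the term is absent has no row at all.
occurrences <- data.frame(
  doc  = c("d1", "d2", "d3", "d5"),
  term = "fever",
  stringsAsFactors = FALSE
)

# Outcome (class) of every sample, including those where the term is absent.
samples <- data.frame(
  doc     = c("d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"),
  outcome = c("pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): mutual information between
# the presence/absence of one feature and the outcome,
#   I = sum over (a, x) of p(a, x) * log( p(a, x) / (p(a) * p(x)) )
presenceMI <- function(presentDocs, samples) {
  present <- samples$doc %in% presentDocs
  # Joint distribution of (present/absent, outcome) over all samples
  joint <- table(
    factor(present, levels = c(FALSE, TRUE)),
    samples$outcome
  ) / nrow(samples)
  # Product of the marginals, i.e. the joint expected under independence
  expected <- outer(rowSums(joint), colSums(joint))
  # Cells with zero probability contribute nothing to the sum
  nonZero <- joint > 0
  sum(joint[nonZero] * log(joint[nonZero] / expected[nonZero]))
}

# The term is present in 3 of 4 "pos" documents but only 1 of 4 "neg" ones
mi <- presenceMI(occurrences$doc, samples)

# The term is present in 1 of 4 documents of each class: absence is uninformative
miIndependent <- presenceMI(c("d1", "d5"), samples)
```

In this sketch the asymmetric case gives a positive value while the balanced case gives a value of zero up to floating-point error, which is the behaviour the description above relies on: absence only contributes information when it is unevenly distributed between classes.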