View source: R/missingValues.R
adjustMIForAbsentValues | R Documentation |
This corrects normal mutual information calculations for information carried by the absence of a variable. This is relevant for sparse data sets with many features, such as NLP terms. Unequal "missingness" of features can contain information about the outcomes. To determine whether a feature is missing for a given sample we need to make some assumptions. These are generally calculated from the data but can alternatively be specified directly.
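To illustrate why absence can be informative, here is a minimal base R sketch that computes the mutual information between a presence/absence indicator and an outcome. The toy data and the helper `discreteMI` are hypothetical and not part of this package; the adjustment performed by adjustMIForAbsentValues may differ in detail.

```r
# Hypothetical helper (not part of the package): mutual information in bits
# between two discrete vectors of equal length.
discreteMI <- function(x, y) {
  joint <- table(x, y) / length(x)   # joint probabilities p(x, y)
  px <- rowSums(joint)               # marginal p(x)
  py <- colSums(joint)               # marginal p(y)
  expected <- outer(px, py)          # p(x) * p(y) under independence
  nonZero <- joint > 0               # empty cells contribute nothing
  sum(joint[nonZero] * log2(joint[nonZero] / expected[nonZero]))
}

# Toy data: 8 samples; a term is recorded for most cases and few controls.
outcome <- c("case", "case", "case", "case",
             "control", "control", "control", "control")
present <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)

# A non-zero value means that knowing only whether the term is present or
# absent already tells us something about the outcome.
miPresence <- discreteMI(present, outcome)
print(miPresence)
```

If presence were independent of the outcome, the same helper would return zero.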
adjustMIForAbsentValues(df, discreteVars, sampleVars, mutualInformationFn, ...)
df |
- the data frame; may be grouped, in which case each group is interpreted as a different type of continuous variable |
discreteVars |
- the column(s) of the categorical value (X) quoted by vars(...) (e.g. outcome) |
sampleVars |
- the column(s) which uniquely identify the sample (e.g. person identifier) |
mutualInformationFn |
- the function that will calculate the unadjusted MI |
... |
- other parameters, which are passed on to the function specified in mutualInformationFn and to the observedVersusExpected(...) function; in particular sampleCount or sampleCountDf |
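As a rough illustration of what calculating sample counts from the data could look like, the base R sketch below counts distinct samples per outcome in a sparse, long-format table and derives how many samples lack each term. All column names and data are hypothetical, and this is an assumption about the general idea, not the package's actual calculation.

```r
# Hypothetical sparse data: one row per (sample, term) observation.
obs <- data.frame(
  personId = c(1, 1, 2, 3, 4, 4),
  outcome  = c("case", "case", "case", "control", "control", "control"),
  term     = c("fever", "cough", "fever", "cough", "cough", "rash"),
  stringsAsFactors = FALSE
)

# Assumption: every sample appears at least once in the data, so the number
# of distinct sample identifiers per outcome is the sample count.
samples <- unique(obs[, c("personId", "outcome")])
sampleCount <- table(samples$outcome)

# Number of distinct samples in which each term was observed, per outcome.
presentPairs <- unique(obs[, c("personId", "outcome", "term")])
presentCount <- unclass(table(presentPairs$term, presentPairs$outcome))

# Absent count = sample count for the outcome minus samples with the term.
perOutcome <- as.vector(sampleCount[colnames(presentCount)])
absentCount <- -sweep(presentCount, 2, perOutcome, "-")
print(absentCount)
```

A sample with no recorded features at all would not appear in such a table, which is one situation where supplying the counts directly could give a different result from inferring them.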
a dataframe containing the distinct values of the groups of df, and for each group a mutual information column (I). If df was not grouped this will be a single entry.