calculateDiscreteContinuousMI: calculate mutual information between a categorical value (X)...

View source: R/tidyDiscreteContinuousMI.R

calculateDiscreteContinuousMI    R Documentation

Calculate mutual information between a categorical value (X) and a continuous value (Y)

Description

This is specifically designed to support tidy data in which there are many features, with the associated values and outcomes held in different columns of a dataframe or database table.

Usage

calculateDiscreteContinuousMI(
  df,
  discreteVars,
  continuousVar,
  method = "KWindow",
  ...
)

Arguments

df

- the dataframe; may be grouped, in which case the value in each group is interpreted as a different type of continuous variable

discreteVars

- the column(s) of the categorical value (X), quoted with vars(...)

continuousVar

- the column of the continuous value (Y)

method

- the method employed; valid options are "KWindow", "KNN", "Discretise", "Grassberger", "Compression", "Entropy", "Quantile", "PDF", "SGolay", "Kernel"

...

- any other parameters are passed on to the implementation of the chosen method

Details

N.B. this result is the mutual information between feature value and outcome GIVEN that the feature is present. It does not account for missing values.
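
To make the quantity concrete, the following is a minimal base R sketch of a plug-in estimate of I(X;Y), obtained by binning Y into equal-width bins. It is an illustration only and is not the package's implementation of any of the methods listed above; the helper name `sketchDiscretisedMI` and its `bins` parameter are hypothetical, and the data are synthetic. Rows with a missing value are dropped first, which is one way to read the caveat that the result is conditional on the feature being present.

```r
# Hypothetical helper (not part of the package): plug-in mutual information,
# in nats, between a categorical vector x and a continuous vector y.
sketchDiscretisedMI <- function(x, y, bins = 10) {
  keep <- !is.na(x) & !is.na(y)          # condition on the value being present
  x <- factor(x[keep])
  yBinned <- cut(y[keep], breaks = bins) # equal-width bins over the range of y
  joint <- table(x, yBinned) / sum(keep) # empirical joint distribution p(x, y)
  px <- rowSums(joint)                   # marginal p(x)
  py <- colSums(joint)                   # marginal p(y)
  expected <- outer(px, py)              # p(x) * p(y) under independence
  nonZero <- joint > 0                   # 0 * log(0) is taken as 0
  sum(joint[nonZero] * log(joint[nonZero] / expected[nonZero]))
}

# Synthetic data: y depends on x in one case and not in the other.
set.seed(1)
x <- sample(c("a", "b"), 1000, replace = TRUE)
yDependent <- rnorm(1000, mean = ifelse(x == "a", 0, 3))
yIndependent <- rnorm(1000)

miDependent <- sketchDiscretisedMI(x, yDependent)
miIndependent <- sketchDiscretisedMI(x, yIndependent)
```

With this synthetic data, `miDependent` should come out well above `miIndependent`, which stays close to zero apart from the small positive bias typical of plug-in estimators.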

Value

a dataframe containing the distinct values of the groups of df and, for each group, a mutual information column (I). If df was not grouped, this will be a single entry.

Examples

observations %>% group_by(feature) %>% calculateDiscreteContinuousMI(vars(outcome), value)
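
The example above assumes a tidy dataframe called `observations` with `feature`, `value` and `outcome` columns. A minimal sketch of such an input could look like the following; the feature names and distributions are invented purely for illustration, and the call itself is guarded so that the block still runs when dplyr or the function is not available in the session.

```r
# Synthetic tidy data: one row per (sample, feature) pair. The feature names
# and distributions are made up for illustration.
set.seed(42)
n <- 200
outcome <- sample(c("case", "control"), n, replace = TRUE)
observations <- rbind(
  data.frame(feature = "informative", outcome = outcome,
             value = rnorm(n, mean = ifelse(outcome == "case", 2, 0))),
  data.frame(feature = "noise", outcome = outcome,
             value = rnorm(n))
)

# Assumes dplyr is installed and calculateDiscreteContinuousMI has already
# been loaded into the session; otherwise the call is skipped.
if (requireNamespace("dplyr", quietly = TRUE) &&
    exists("calculateDiscreteContinuousMI")) {
  library(dplyr)
  result <- observations %>%
    group_by(feature) %>%
    calculateDiscreteContinuousMI(vars(outcome), value)
  print(result)  # per the Value section: one row per feature, with column I
}
```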

terminological/tidy-info-stats documentation built on Nov. 19, 2022, 11:23 p.m.