Calculates the variable-wise Kullback-Leibler divergence between two groups of samples.
data | A numerical dataframe with no missing values.
group1 | A vector of integers giving the row indices of group 1.
group2 | A vector of integers giving the row indices of group 2.
permute | An integer indicating the number of permutations for the permutation test. If 0 (the default), no permutation test is carried out.
levels | An integer indicating the maximum number of levels of a categorical variable. Used to identify categorical variables. Defaults to NULL because it is supposed that
The function helps users find the variables that diverge most between two groups defined by different states of one specific variable. For instance, in a dataset of health measurements, we may want to find the variables most strongly associated with the occurrence of cardiovascular disease. The function can also carry out a permutation test to calculate a p-value for each variable.
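To make the idea concrete, here is a minimal sketch of a per-variable KL divergence with a permutation p-value on simulated data. It is an illustration only, not VKL's implementation: the helpers `kl_binned` and `perm_pvalue`, the histogram-based estimator, the smoothing constant `eps`, and the `toy` data are all assumptions introduced for this example.

```r
# Hypothetical helper: KL divergence between two numeric samples, estimated
# from histograms on shared bins. This estimator is an assumption; VKL may
# estimate the divergence differently.
kl_binned <- function(x, y, bins = 10, eps = 1e-8) {
  breaks <- seq(min(x, y), max(x, y), length.out = bins + 1)
  p <- hist(x, breaks = breaks, plot = FALSE)$counts
  q <- hist(y, breaks = breaks, plot = FALSE)$counts
  # Smooth empty bins, then normalise counts to probabilities
  p <- (p + eps) / sum(p + eps)
  q <- (q + eps) / sum(q + eps)
  sum(p * log(p / q))
}

# Hypothetical helper: permutation p-value for the observed divergence.
# Group labels are shuffled n_perm times while group sizes stay fixed.
perm_pvalue <- function(x, y, n_perm = 100, bins = 10) {
  observed <- kl_binned(x, y, bins = bins)
  pooled <- c(x, y)
  n1 <- length(x)
  permuted <- replicate(n_perm, {
    idx <- sample(length(pooled), n1)
    kl_binned(pooled[idx], pooled[-idx], bins = bins)
  })
  # Add-one correction so the estimated p-value is never exactly zero
  p <- (sum(permuted >= observed) + 1) / (n_perm + 1)
  c(KL = observed, p.value = p)
}

# Simulated data: "shifted" differs between the groups, "noise" does not
set.seed(1)
toy <- data.frame(
  shifted = c(rnorm(50, mean = 0), rnorm(50, mean = 3)),
  noise   = rnorm(100)
)
g1 <- 1:50
g2 <- 51:100

# One row per variable, sorted by decreasing divergence
res <- as.data.frame(t(sapply(toy, function(v) {
  perm_pvalue(v[g1], v[g2], n_perm = 200)
})))
res <- res[order(res$KL, decreasing = TRUE), ]
print(res)
```

Computing the bin edges from the pooled sample keeps them identical across permutations, so the permuted divergences are comparable with the observed one.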
If permute = 0, returns a dataframe including the sorted Kullback-Leibler (KL) divergence. If permute > 0, returns a dataframe including p.values and the sorted KL divergence.
Elyas Heidari
data("NHANES")
## Using preprocessed data
data <- data_preproc(NHANES, levels = 15)
data$SEQN <- NULL
# Construct two groups of samples
g1 <- which(data$PAD590 == 1)
g2 <- which(data$PAD590 == 6)
# Set permute to calculate p.values
kl <- VKL(data, group1 = g1, group2 = g2, permute = 100, levels = NULL)
## Using raw data
kl <- VKL(NHANES, group1 = g1, group2 = g2, permute = 0, levels = 15)