VKL: Calculate Variable-wise Kullback-Leibler divergence


View source: R/VKL.R

Description

Calculates the variable-wise Kullback-Leibler divergence between two groups of samples.

Usage

VKL(data, group1, group2, permute = 0, levels = NULL)

Arguments

data

A numeric data frame with no missing values.

group1

A vector of integers giving the row indices of group 1.

group2

A vector of integers giving the row indices of group 2.

permute

An integer indicating the number of permutations for the permutation test. If 0 (the default), no permutation test is carried out.

levels

An integer indicating the maximum number of levels a categorical variable may have; it is used to identify the categorical variables. Defaults to NULL, on the assumption that data has already been preprocessed with data_preproc and the categorical variables are already specified.
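One plausible reading of the levels threshold is sketched below: a column counts as categorical when it has at most levels distinct values. This is only an illustration of the idea. The helper is_categorical and the toy data frame are hypothetical and are not part of Questools, and the package may apply a different rule.

```r
# Hypothetical helper, NOT part of Questools: flag a column as categorical
# when its number of distinct values does not exceed `levels`.
is_categorical <- function(data, levels) {
  vapply(data, function(col) length(unique(col)) <= levels, logical(1))
}

# Toy data (made up for illustration): `sex` has 2 distinct values,
# `bmi` has 6 distinct values.
toy <- data.frame(
  sex = c(1, 2, 1, 2, 1, 2),
  bmi = c(21.5, 30.1, 24.8, 27.3, 19.9, 33.0)
)

# With a threshold of 3, only `sex` is treated as categorical.
flags <- is_categorical(toy, levels = 3)
```

Under this reading, a larger threshold such as levels = 15 would treat any variable with up to 15 distinct values as categorical.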

Details

The function helps users identify the variables that diverge most between two groups defined by different states of one specific variable. For instance, in a dataset of health measurements, one may want to find the variables most important for the occurrence of cardiovascular disease. The function can also carry out a permutation test to calculate a p-value for each variable.
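The idea can be sketched in base R for a single categorical variable: estimate the distribution of the variable in each group, compute the KL divergence between the two, then compare it with the divergences obtained after shuffling the group labels. This is a minimal sketch under stated assumptions, not the Questools implementation. The helpers kl_discrete and perm_p_value, the smoothing constant eps, and the p-value rule (the share of permuted divergences at least as large as the observed one) are all assumptions made for illustration.

```r
# Illustrative sketch only; the actual VKL implementation may differ.

# KL divergence D(P || Q) = sum(p * log(p / q)) for one discrete variable,
# where P and Q are the empirical distributions in group 1 and group 2.
# `eps` (an assumption) avoids log(0) and division by zero.
kl_discrete <- function(x, g1, g2, eps = 1e-10) {
  lv <- sort(unique(x))
  p <- as.numeric(table(factor(x[g1], levels = lv)))
  q <- as.numeric(table(factor(x[g2], levels = lv)))
  p <- (p + eps) / sum(p + eps)
  q <- (q + eps) / sum(q + eps)
  sum(p * log(p / q))
}

# Permutation test: reshuffle the pooled row indices `permute` times and
# report how often the permuted divergence reaches the observed one.
perm_p_value <- function(x, g1, g2, permute = 100) {
  observed <- kl_discrete(x, g1, g2)
  pooled <- c(g1, g2)
  null_kl <- replicate(permute, {
    shuffled <- sample(pooled)
    kl_discrete(x, shuffled[seq_along(g1)], shuffled[-seq_along(g1)])
  })
  mean(null_kl >= observed)
}

# Toy data (made up): the two groups draw from clearly different distributions.
set.seed(1)
x <- c(sample(1:3, 50, replace = TRUE, prob = c(0.7, 0.2, 0.1)),
       sample(1:3, 50, replace = TRUE, prob = c(0.1, 0.2, 0.7)))
g1 <- 1:50
g2 <- 51:100

kl_obs <- kl_discrete(x, g1, g2)
p_val <- perm_p_value(x, g1, g2, permute = 200)
```

Applying such a computation to every column and sorting by divergence gives a ranking of variables. A continuous variable would need a density estimate or binning first, which this sketch does not cover.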

Value

If permute = 0, returns a data frame containing the sorted Kullback-Leibler (KL) divergence. If permute > 0, returns a data frame containing the p-values and the sorted KL divergence.

Author(s)

Elyas Heidari

Examples

data("NHANES")
## Using preprocessed data
data <- data_preproc(NHANES, levels = 15)
data$SEQN <- NULL
# Construct two groups of samples
g1 <- which(data$PAD590 == 1)
g2 <- which(data$PAD590 == 6)
# Set permute to calculate p.values
kl <- VKL(data, group1 = g1, group2 = g2, permute = 100, levels = NULL)

## Using raw data
kl <- VKL(NHANES, group1 = g1, group2 = g2, permute = 0, levels = 15)

bAIo-lab/Questools documentation built on Nov. 9, 2019, 3:59 a.m.