The Kendall Interaction Filter for Variable Interaction Screening in High Dimensional Classification Problems
The Kendall Interaction Filter (KIF) approach, which performs interaction screening in high-dimensional data with a multi-class classification outcome
This package implements the Kendall Interaction Filter (KIF), an efficient interaction screening method that selects the couples of variables relevant to the classification task in high-dimensional data. The KIF measure is presented in the paper "The Kendall Interaction Filter for Variable Interaction Screening in High Dimensional Classification Problems". It has several advantages:
The KIF package implements two methods, KIF.couple and KIF.all. KIF.couple takes a single couple as input and returns its Kendall Interaction Filter score. KIF.all takes the complete dataset as input and returns the couples most relevant to the classification task.
To install the KIF package, run:
library(devtools)
devtools::install_github("KarimOualkacha/KIF", build_vignettes = TRUE)
library(KIF)
## Loading required package : mvtnorm
## Loading required package : ccaPP
## Loading required package : parallel
## Loading required package : pcaPP
## Loading required package : robustbase
We generate a toy dataset to illustrate the usage of the functions KIF.couple and KIF.all. The dataset has 200 observations and 500 explanatory variables. It is a two-class example in which each class has 100 observations.
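library(mvtnorm)
set.seed(1)
# Class sizes and number of explanatory variables
n1 <- 100
n2 <- 100
n <- n1 + n2
p <- 500
# Baseline matrix: unit diagonal, 0.2 everywhere else
sigma <- diag(p)
sigma[upper.tri(sigma)] <- 0.2
sigma[lower.tri(sigma)] <- 0.2
sigma1 <- sigma
sigma2 <- sigma
# Class 1: entries (1,2) and (3,4) are set to 0.8
sigma1[1,2] <- 0.8
sigma1[2,1] <- 0.8
sigma1[3,4] <- 0.8
sigma1[4,3] <- 0.8
# Class 2: entry (3,4) is set to -0.8; entry (1,2) stays at 0.2
sigma2[3,4] <- -0.8
sigma2[4,3] <- -0.8
# Zero mean vector in both classes
mean1 <- rep(0, p)
mean2 <- rep(0, p)
# Stack the two class samples; y holds the class labels (1 first, then 0)
Sample <- rbind(rmvnorm(n1, mean1, sigma1), rmvnorm(n2, mean2, sigma2))
y <- c(rep(1,n1), rep(0,n2))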
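The couples relevant to the classification task are "1,2" and "3,4", because they are the only pairs whose dependence differs between the two classes. Couple "3,4" is more relevant than "1,2": its covariance entry flips from 0.8 in class 1 to -0.8 in class 2, whereas that of couple "1,2" only changes from 0.8 to 0.2.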
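The claim above can be checked informally by comparing Kendall's tau within each class. The sketch below is not part of the KIF package and does not compute the KIF score. It assumes a reduced design with only the four relevant variables, drawn with base R through a Cholesky factor instead of `rmvnorm`; the names `draw`, `tau_by_class`, `X` and `labels` are hypothetical. For a bivariate normal pair with correlation rho, the population value of Kendall's tau is (2/pi) * asin(rho), that is about 0.59 for rho = 0.8 and about 0.13 for rho = 0.2.

```r
# Hypothetical sanity check; not part of the KIF package.
set.seed(1)
n1 <- 100; n2 <- 100

# 4 x 4 covariance matrices with the same entries as the toy example
sigma <- matrix(0.2, 4, 4); diag(sigma) <- 1
sigma1 <- sigma; sigma2 <- sigma
sigma1[1, 2] <- sigma1[2, 1] <- 0.8
sigma1[3, 4] <- sigma1[4, 3] <- 0.8
sigma2[3, 4] <- sigma2[4, 3] <- -0.8

# Draw n rows from N(0, s): Z %*% chol(s) has covariance s
draw <- function(n, s) matrix(rnorm(n * ncol(s)), n) %*% chol(s)
X <- rbind(draw(n1, sigma1), draw(n2, sigma2))
labels <- c(rep(1, n1), rep(0, n2))

# Kendall's tau of one pair of columns, computed separately in each class
tau_by_class <- function(X, labels, pair) {
  sapply(split(seq_along(labels), labels), function(idx)
    cor(X[idx, pair[1]], X[idx, pair[2]], method = "kendall"))
}

tau12 <- tau_by_class(X, labels, c(1, 2))
tau34 <- tau_by_class(X, labels, c(3, 4))

# Absolute between-class gap in tau for each pair
gap12 <- abs(tau12[[1]] - tau12[[2]])
gap34 <- abs(tau34[[1]] - tau34[[2]])
print(c(gap12 = gap12, gap34 = gap34))
```

The gap for pair "3,4" should come out clearly larger than the gap for pair "1,2", in line with the ranking stated above.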
The KIF.couple function takes as arguments the dataset, the labels variable and the column indices of a pair of explanatory variables, and returns the corresponding Kendall Interaction Filter score.
out12 <- KIF.couple(Sample, y, c(1,2))
out34 <- KIF.couple(Sample, y, c(3,4))
The result is:
out12
## [1] 0.199798
out34
## [1] 0.5165657
As expected, the Kendall Interaction Filter score of couple "3,4" is higher than that of couple "1,2".
The KIF.all function takes as arguments the dataset, the labels variable, the number of cores to use for parallelization and the number of top-ranked couples to select. It returns the selected couples in decreasing order of their Kendall Interaction Filter scores.
outall <- KIF.all(Sample, y, 1, 10)
outall
##       [,1] [,2]
##  [1,]    3    4
##  [2,]    3  263
##  [3,]    1    2
##  [4,]   59  475
##  [5,]   42  350
##  [6,]  162  303
##  [7,]  393  483
##  [8,]   42  101
##  [9,]  223  437
## [10,]  246  366
Couples "1,2" and "3,4" are both among the 10 selected couples, which indicates that the Kendall Interaction Filter is able to select the relevant couples.
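To give a sense of what exhaustive pair screening involves, here is a minimal sketch of a generic top-k screening loop. It is not the implementation of KIF.all: `screen_pairs` and `tau_gap` are hypothetical names, and the score used is a stand-in (the absolute between-class gap in Kendall's tau for a two-class problem), not the KIF score. With p = 500 variables there are choose(500, 2) = 124750 candidate couples, which is why the real function offers parallelization.

```r
# Hypothetical screening skeleton; `score_fun` is a stand-in, not the KIF score.
screen_pairs <- function(X, y, score_fun, top) {
  pairs <- t(combn(ncol(X), 2))                 # one row per candidate couple
  scores <- apply(pairs, 1, function(pr) score_fun(X, y, pr))
  keep <- order(scores, decreasing = TRUE)[seq_len(min(top, nrow(pairs)))]
  pairs[keep, , drop = FALSE]                   # couples sorted by decreasing score
}

# Stand-in score for two classes: absolute gap in within-class Kendall's tau
tau_gap <- function(X, y, pr) {
  taus <- sapply(split(seq_along(y), y), function(idx)
    cor(X[idx, pr[1]], X[idx, pr[2]], method = "kendall"))
  abs(taus[[1]] - taus[[2]])
}

# Small illustrative data: columns 1 and 2 move together in class 1 only
set.seed(2)
n <- 200
y <- rep(c(1, 0), each = n / 2)
X <- matrix(rnorm(n * 6), n, 6)
X[y == 1, 2] <- X[y == 1, 1] + 0.3 * rnorm(n / 2)

top <- screen_pairs(X, y, tau_gap, top = 3)
print(top)

# Number of candidate couples for p = 500
n_pairs <- choose(500, 2)
```

Scoring every couple independently makes the loop easy to split across cores, at the cost of a number of score evaluations that grows quadratically with the number of variables.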
Youssef Anzarmou, Abdallah Mkhadri, Karim Oualkacha