KFOCI: Kernel Feature Ordering by Conditional Independence


View source: R/KPC.R

Description

Variable selection with KPC using directed K-NN graph or minimum spanning tree (MST)

Usage

KFOCI(
  Y,
  X,
  k = kernlab::rbfdot(1/(2 * stats::median(stats::dist(Y))^2)),
  Knn = 1,
  num_features = NULL,
  stop = TRUE,
  numCores = 1,
  verbose = FALSE
)
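
The default for k shown above sets the Gaussian kernel parameter from the median pairwise Euclidean distance between the rows of Y. The following is a minimal sketch of that computation in base R, assuming kernlab's convention that rbfdot(sigma) evaluates exp(-sigma * ||y - y'||^2); the simulated Y and the helper name gauss are illustrative assumptions, not part of the package.

```r
set.seed(1)
Y <- matrix(rnorm(50), ncol = 1)          # illustrative responses (n = 50, dy = 1)

# Median of all pairwise Euclidean distances between rows of Y
med <- stats::median(stats::dist(Y))

# Same expression as in the default argument of KFOCI
sigma <- 1 / (2 * med^2)

# Hand-written Gaussian kernel under the assumed rbfdot convention
gauss <- function(y1, y2) exp(-sigma * sum((y1 - y2)^2))

# Two points exactly one median distance apart get kernel value exp(-1/2)
gauss(0, med)

# Optional comparison with kernlab, only if it is installed
if (requireNamespace("kernlab", quietly = TRUE)) {
  k <- kernlab::rbfdot(sigma = sigma)
  print(as.numeric(k(Y[1, ], Y[2, ])))
  print(gauss(Y[1, ], Y[2, ]))
}
```

With this choice the kernel bandwidth equals the median distance, so the kernel is neither nearly constant nor nearly diagonal on the observed responses.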

Arguments

Y

a matrix of responses (n by dy)

X

a matrix of predictors (n by dx)

k

a function k(y, y') of class kernel. It can be the kernel implemented in kernlab e.g. Gaussian kernel: rbfdot(sigma = 1), linear kernel: vanilladot().

Knn

the number of nearest neighbors used to build the K-NN graph; or "MST" to use the minimum spanning tree instead

num_features

the number of variables to be selected, cannot be larger than dx. The default value is NULL and in that case it will be set equal to dx. If stop == TRUE (see below), then num_features is the maximal number of variables to be selected.

stop

If stop == TRUE, the automatic stopping criterion is applied: selection stops at the first instance of negative Tn (as described in the paper), or once num_features variables have been selected, whichever comes first. If stop == FALSE, exactly num_features variables are selected.

numCores

the number of cores used to parallelize the computation.

verbose

whether to print each selected variable during the forward stepwise algorithm
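
The interaction between num_features and stop described above can be sketched with a generic greedy loop. This is an illustration of the control flow only, not the package's implementation: toy_score is a hypothetical stand-in for Tn (the gain in adjusted R^2 of a linear model), and forward_select is a hypothetical helper name.

```r
# Hypothetical stand-in for Tn: NOT the KPC estimator.
# Gain in adjusted R^2 from adding column j to the already selected columns.
toy_score <- function(Y, X, selected, j) {
  adj_r2 <- function(cols) {
    if (length(cols) == 0) return(0)
    summary(stats::lm(Y ~ X[, cols, drop = FALSE]))$adj.r.squared
  }
  adj_r2(c(selected, j)) - adj_r2(selected)
}

# Greedy forward selection with the two stopping modes
forward_select <- function(Y, X, num_features = ncol(X), stop = TRUE,
                           score = toy_score) {
  selected <- integer(0)
  while (length(selected) < num_features) {
    candidates <- setdiff(seq_len(ncol(X)), selected)
    scores <- vapply(candidates,
                     function(j) score(Y, X, selected, j), numeric(1))
    best <- which.max(scores)
    # stop = TRUE: quit as soon as the best remaining score is negative
    if (stop && scores[best] < 0) break
    selected <- c(selected, candidates[best])
  }
  selected
}

set.seed(1)
n <- 200
X <- matrix(rnorm(n * 5), ncol = 5)
Y <- 2 * X[, 1] - X[, 3] + rnorm(n, sd = 0.1)

forward_select(Y, X, stop = TRUE)                     # at most 5 indices
forward_select(Y, X, num_features = 4, stop = FALSE)  # exactly 4 indices
```

With stop = TRUE, num_features acts only as an upper bound; with stop = FALSE the loop runs until exactly num_features indices have been collected, even if later scores are negative.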

Details

A stepwise forward selection of variables using KPC. At each step the X_j maximizing \hat{ρ^2}(Y,X_j | selected X_i) is selected. It is suggested to normalize the predictors before applying KFOCI. Euclidean distance is used for computing the K-NN graph and the MST.
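
Because the K-NN graph and the MST are built from Euclidean distances, predictors measured on a large scale can dominate the neighbor structure. A minimal sketch of the suggested normalization, using base R's scale(); the simulated data are illustrative, and the final call runs only if the KPC and kernlab packages are installed.

```r
set.seed(1)
n <- 100
# Three predictors on very different scales (illustrative data)
X <- cbind(rnorm(n, sd = 1),
           rnorm(n, sd = 1000),
           rnorm(n, mean = 5, sd = 0.01))
Y <- X[, 1] * X[, 3]

# Center each column to mean 0 and rescale it to standard deviation 1
X_std <- scale(X)
round(colMeans(X_std), 6)
round(apply(X_std, 2, sd), 6)

# Run the selection on the standardized predictors, if the packages are available
if (requireNamespace("KPC", quietly = TRUE) &&
    requireNamespace("kernlab", quietly = TRUE)) {
  sel <- KPC::KFOCI(Y, X_std, kernlab::rbfdot(1), Knn = 1, numCores = 1)
  print(sel)
}
```

The returned indices refer to columns of X_std, which are in the same order as the columns of X.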

Value

The algorithm returns a vector of the indices from 1,...,dx of the selected variables

See Also

KPCgraph, KPCRKHS

Examples

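# Simulate 10 predictors; Y depends only on the first three
n = 200
p = 10
X = matrix(rnorm(n * p), ncol = p)
Y = X[, 1] * X[, 2] + sin(X[, 1] * X[, 3])
# Gaussian kernel on Y, 1-nearest-neighbor graph, single core
KFOCI(Y, X, kernlab::rbfdot(1), Knn=1, numCores=1)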
## Not run: 
### install the package olsrr first
surgical = olsrr::surgical
for (i in 1:9) surgical[,i] = (surgical[,i] - mean(surgical[,i]))/sd(surgical[,i])
ky = kernlab::rbfdot(1/(2*stats::median(stats::dist(surgical$y))^2))
colnames(surgical)[KFOCI(surgical[,9],surgical[,1:8],ky,Knn=1)]
#### "enzyme_test" "pindex" "liver_test"  "alc_heavy"

n = 200
p = 1000
set.seed(1)
X = matrix(rnorm(n * p), ncol = p)
Y = X[, 1] * X[, 2] + sin(X[, 1] * X[, 3])
KFOCI(Y, X, kernlab::rbfdot(1), Knn=1, numCores = 7, verbose=TRUE)
# 1 2 3

## End(Not run)

KPC documentation built on Dec. 11, 2021, 9:58 a.m.
