signflip: Sign-flipping of Predictor Variables to Obtain Equal Polarity


Description

Computes the score for each predictor variable (gene) in the x-matrix and multiplies its values by (-1) if its score is greater than or equal to half of the maximal score. For gene expression data, this amounts to treating under- and overexpression symmetrically. After the sign-flip procedure, low (expression) values point towards response class 0 and high (expression) values point towards class 1.
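The rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `pair_score` and `sketch_sign_flip` are hypothetical names, the score is assumed to be a Wilcoxon-type count of (class 0, class 1) pairs in which the class 1 value does not exceed the class 0 value, and the maximal score is assumed to be the number of such pairs.

```r
# ASSUMPTION: hypothetical pairwise score, not the package's score() function.
# Counts (class 0, class 1) pairs where the class 1 value is <= the class 0 value.
pair_score <- function(v, y) {
  sum(outer(v[y == 0], v[y == 1], ">="))
}

# ASSUMPTION: hypothetical re-implementation of the flipping rule.
sketch_sign_flip <- function(x, y) {
  max_score <- sum(y == 0) * sum(y == 1)        # assumed maximal score
  scores <- apply(x, 2, pair_score, y = y)      # one score per column
  signs <- ifelse(scores >= max_score / 2, -1, 1)
  list(flipped.matrix = sweep(x, 2, signs, `*`), signs = signs)
}

# Toy data: 6 cases, 2 variables; "down" is high in class 0, so it gets flipped
y <- c(0, 0, 0, 1, 1, 1)
x <- cbind(up = c(1, 2, 3, 4, 5, 6), down = c(6, 5, 4, 3, 2, 1))
res <- sketch_sign_flip(x, y)
res$signs
```

In this toy example the column `up` already has low values in class 0 and is left unchanged, while `down` is multiplied by -1 so that both columns end up with the same polarity.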

Usage

sign.flip(x, y)

Arguments

x

Numeric matrix of explanatory variables (p variables in columns, n cases in rows). For example, these can be microarray gene expression data which should be sign-flipped and then clustered.

y

Numeric vector of length n containing the class labels of the individuals. These labels have to be coded by 0 and 1.

Value

Returns a list containing:

flipped.matrix

The sign-flipped x-matrix.

signs

Numeric vector of length p, which for each predictor variable indicates whether it was sign-flipped (coded by -1) or not (coded by +1).
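A vector in the format of `signs` could, for instance, be reused to give additional cases the same polarity as the flipped matrix. The sketch below uses made-up values; `signs`, `x_new` and `x_new_flipped` are hypothetical objects for illustration, not package output.

```r
# ASSUMPTION: hypothetical signs vector (-1 = flipped, +1 = not flipped)
signs <- c(1, -1, 1)

# ASSUMPTION: hypothetical new cases measured on the same three variables
x_new <- matrix(c(0.5,  1.2, -0.3,
                  2.0, -0.7,  0.9),
                nrow = 2, byrow = TRUE)

# Multiply each column by its sign so the new cases match the flipped polarity
x_new_flipped <- sweep(x_new, 2, signs, `*`)
```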

Author(s)

Marcel Dettling, dettling@stat.math.ethz.ch

See Also

wilma (also for the references) and score; for a newer methodology, see pelora and sign.change.

Examples

library(supclust)  # provides sign.flip() and margin()
data(leukemia, package="supclust")

op <- par(mfrow=c(1,3))  # three plots side by side
## Variable 69 before sign-flipping
plot(leukemia.x[,69], leukemia.y)
title(paste("Margin = ", round(margin(leukemia.x[,69], leukemia.y), 2)))

## Sign-flipping is very important
## Variable 161 before sign-flipping
plot(leukemia.x[,161], leukemia.y)
title(paste("Margin = ", round(margin(leukemia.x[,161], leukemia.y), 2)))
## Variable 161 after sign-flipping
x <- sign.flip(leukemia.x, leukemia.y)$flipped.matrix
plot(x[,161], leukemia.y)
title(paste("Margin = ", round(margin(x[,161], leukemia.y), 2)))
par(op)  # reset graphics parameters

supclust documentation built on Sept. 27, 2021, 5:11 p.m.