VarPermBiclust.chisqdiff: 'SCBiclust' method for identifying variance-based biclusters

View source: R/SC-Var-Biclust.R

VarPermBiclust.chisqdiff R Documentation

Description

'SCBiclust' method for identifying variance-based biclusters

Usage

VarPermBiclust.chisqdiff(
  x,
  min.size = max(5, round(nrow(x)/20)),
  nperms = 1000,
  silent = TRUE
)

Arguments

x

a dataset with n rows and p columns, with observations in rows.

min.size

Minimum number of observations included in a valid bicluster (default=max(5,round(nrow(x)/20)))

nperms

number of χ^2_{n_1} and χ^2_{n_2} variables generated for each feature, where n_1 and n_2 are the number of observations in cluster 1 and cluster 2, respectively (default=1000)

silent

should progress printing be suppressed? (default=TRUE)

Details

Observations in the bicluster are identified such that they maximize the feature-weighted sum of between-cluster differences in feature variances. Features in the bicluster are identified based on their contribution to the clustering of the observations. This algorithm uses the numerical approximation log(abs(χ^2_{n_1}-χ^2_{n_2})+1) as the expected null distribution for feature weights.
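
The null approximation described above can be illustrated with a minimal sketch. The values of n1, n2 and nperms below are hypothetical, and the use of n1 and n2 directly as degrees of freedom is an assumption based on the notation in this page; how the package uses these draws internally is not shown here.

```r
# Sketch only: simulate draws of log(|chi1 - chi2| + 1).
# n1, n2 and nperms are illustrative values, not taken from the package.
set.seed(1)
n1 <- 30        # hypothetical number of observations in cluster 1
n2 <- 70        # hypothetical number of observations in cluster 2
nperms <- 1000  # number of draws per feature

# Independent chi-squared variables with n1 and n2 degrees of freedom (assumed)
chi1 <- rchisq(nperms, df = n1)
chi2 <- rchisq(nperms, df = n2)

# Approximate null draws for a single feature weight
null.weights <- log(abs(chi1 - chi2) + 1)

# Summarize the simulated null distribution
summary(null.weights)
```

Because the argument of the logarithm is at least 1, every simulated null value is non-negative.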

VarPermBiclust.chisqdiff will identify at most one variance bicluster. To identify additional biclusters, first remove the feature signal of the identified bicluster by scaling the variance of the elements in that bicluster; VarPermBiclust.chisqdiff can then be applied to the residual data matrix (see Examples).

Value

The function returns an S3 object with the following attributes:

  • which.x: A list of length num.bicluster with each list entry containing a logical vector denoting if the data observation is in the given bicluster.

  • which.y: A list of length num.bicluster with each list entry containing a logical vector denoting if the data feature is in the given bicluster.

Author(s)

Erika S. Helgeson, Qian Liu, Guanhua Chen, Michael R. Kosorok, and Eric Bair

Examples

test <- matrix(rnorm(100*50, mean=1, sd=2), nrow=100)
test[1:30, 1:20] <- matrix(rnorm(30*20, mean=1, sd=15), nrow=30)
test.VarPermBiclust <- VarPermBiclust.chisqdiff(test)
x <- test.VarPermBiclust$which.x
y <- test.VarPermBiclust$which.y

# Code for identifying additional biclusters after removing bicluster signal:
# rescale each bicluster feature so that its SD within the bicluster
# matches its SD among the remaining observations
temp <- scale(test)
temp[x, y] <- t(t(temp[x, y]) * (apply(temp[!x, y], 2, sd) /
                                   apply(temp[x, y], 2, sd)))
test.VarPermBiclust.2 <- VarPermBiclust.chisqdiff(temp)


SCBiclust documentation built on June 10, 2022, 1:06 a.m.