clusterRange: Run COMMUNAL over a range of data subsets

View source: R/test_range.R

Description

Convenience harness that runs COMMUNAL on a range of data subsets with a fixed set of parameters. In the input matrix, the columns are the items to be clustered and the rows are the measurements. By default, the rows are first sorted by variance in descending order. For each x in varRange, COMMUNAL clusters the columns using the top x rows. The output is used by plotRange3D to generate a 3D plot.
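The row selection described above can be sketched in base R. This is only an illustration of the idea, not the package's source; the toy matrix and the names toy, ord, and subsets are made up for the example.

```r
# Toy matrix: 5 rows (measurements) by 4 columns (items to cluster).
toy <- rbind(
  g1 = c(1, 2, 3, 4),
  g2 = c(0, 10, 0, 10),
  g3 = c(5, 5, 5, 5),
  g4 = c(0, 4, 0, 4),
  g5 = c(2, 2, 3, 3)
)
colnames(toy) <- paste0("s", 1:4)

# Rank rows by variance, highest first.
row.vars <- apply(toy, 1, var)
ord <- order(row.vars, decreasing = TRUE)

# For each x in varRange, keep the x highest-variance rows.
varRange <- c(2, 4)
subsets <- lapply(varRange, function(x) toy[ord[seq_len(x)], , drop = FALSE])

rownames(subsets[[1]])  # the two highest-variance rows
```

Each element of subsets is the kind of matrix that would be handed to one clustering run; every subset keeps all columns, so the same items are clustered each time.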

Usage

clusterRange(dataMtx, ks, varRange, validation = "all", verbose = T, ..., 
             parallel = F, mc.cores = NULL, row.order = NULL)

Arguments

dataMtx

The data for input to COMMUNAL.

ks

The range of K to test with COMMUNAL.

varRange

Numeric vector giving how many rows of the data matrix to use for clustering. For each element x in varRange, clusterRange runs COMMUNAL on the x rows with the greatest variance (unless row.order is passed, in which case row.order[1:x] is used instead).

validation

Validation measures to use in COMMUNAL. Defaults to "all".

verbose

Logical; whether to print progress messages. Helpful for identifying points of failure or delay, but the output is lengthy.

...

Arguments passed down to COMMUNAL. If reorder=FALSE is specified, the rows are not sorted by variance in this function and are taken in the given order. By default, rows are sorted by variance.

parallel

Logical; whether to run the data subsets in parallel. clusterRange performs the same COMMUNAL run for each data subset, so the runs are parallelized using parallel::mclapply(). Parallel mode does not run on Windows machines.
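As a rough sketch of the pattern, the following applies one function per element of varRange with parallel::mclapply(). The function run_subset is a hypothetical stand-in for a single COMMUNAL run, and the fallback to one core on Windows is an assumption added for the example (mclapply relies on forking, which Windows does not support), not something clusterRange is documented to do.

```r
library(parallel)

# Hypothetical stand-in for one COMMUNAL run on the top x rows.
run_subset <- function(x) list(n.rows = x)

varRange <- c(50, 75, 100)

# mclapply forks processes, so use a single core on Windows.
n.cores <- if (.Platform$OS.type == "windows") 1L else 2L

# One independent run per element of varRange.
results <- mclapply(varRange, run_subset, mc.cores = n.cores)
```

Results come back in the same order as varRange, so results[[i]] corresponds to varRange[i].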

mc.cores

Optionally, the number of cores to use. Ignored if parallel is FALSE.

row.order

If left NULL, the data are reordered in descending order of variance. Otherwise, the data are subset by row.order according to varRange. For example, if varRange = c(100,500), the first subset is the rows of data at row.order[1:100], and the second subset is the rows at row.order[1:500].
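One way a custom row.order might be built is to rank rows by a statistic other than variance. The sketch below uses the median absolute deviation (stats::mad) on a made-up matrix; the matrix and the choice of statistic are assumptions for illustration only.

```r
# Toy matrix: row r1 has the largest variance because of a single outlier.
toy <- rbind(
  r1 = c(0, 0, 0, 12),
  r2 = c(1, 5, 1, 5),
  r3 = c(2, 3, 2, 3)
)

# Rank rows by median absolute deviation instead of variance.
row.order <- order(apply(toy, 1, mad), decreasing = TRUE)

# Subsets are taken as row.order[1:x] for each x in varRange.
varRange <- c(1, 2)
subsets <- lapply(varRange, function(x) toy[row.order[1:x], , drop = FALSE])

rownames(subsets[[2]])  # rows chosen for the second subset
```

Here the outlier-driven row r1 ranks last under this ordering, whereas sorting by variance would place it first. A vector like row.order would then be supplied through the row.order argument.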

Value

all.results

List of COMMUNAL objects, one for each element of varRange.

varRange

The varRange input parameter.
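Assuming the return value is a list with elements named all.results and varRange, as the entries above suggest, each result can be paired with the subset size that produced it. The object below is a mock built by hand, with placeholder strings in place of real COMMUNAL objects.

```r
# Mock return value; the strings stand in for COMMUNAL objects.
range.results <- list(
  all.results = list("run on 50 rows", "run on 75 rows", "run on 100 rows"),
  varRange = c(50, 75, 100)
)

# all.results[[i]] corresponds to varRange[i].
for (i in seq_along(range.results$varRange)) {
  cat(range.results$varRange[i], "rows:", range.results$all.results[[i]], "\n")
}

# Look up the result for a particular subset size.
idx <- match(75, range.results$varRange)
result.75 <- range.results$all.results[[idx]]
```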

Author(s)

Albert Chen and Timothy E Sweeney
Maintainer: Albert Chen acc2015@stanford.edu

Examples

## Not run: 
## To identify k, use clusterRange and plotRange3D to visualize validation measures
data(BRCA.100) # 533 tissues to cluster, with measurements of 100 genes each
varRange <- c(50, 75, 100)
clus.methods <- c("hierarchical", "kmeans")
validation <- c('wb.ratio', 'dunn', 'avg.silwidth')
range.results <- clusterRange(BRCA.100, ks=2:5, varRange=varRange,
                              clus.methods=clus.methods, validation=validation)
plot.data <- plotRange3D(range.results, ks=2:5, clus.methods, validation)
## Note: the BRCA.results dataset was generated by running clusterRange on 
## a larger range than the one here (with a larger input dataset)

## End(Not run)

COMMUNAL documentation built on May 29, 2017, 6:36 p.m.