clusterContigs-StrandStateMatrix-method: clusterContigs - agglomeratively clusters contigs into...

Description

clusterContigs – agglomeratively clusters contigs into linkage groups based on strand inheritance

Usage

## S4 method for signature 'StrandStateMatrix'
clusterContigs(object, similarityCutoff = 0.7,
  recluster = NULL, minimumLibraryOverlap = 5, randomise = TRUE,
  randomSeed = NULL, randomWeight = NULL, clusterParam = NULL,
  clusterBy = "hetero", verbose = TRUE)

Arguments

object

data.frame containing strand inheritance information for every contig (rows) in every library (columns). This should be the product of strandSeqFreqTable

similarityCutoff

place contigs in the same cluster when their strand states are at least this similar (default 0.7)

recluster

Number of times to repeat the clustering; the consensus of the repeated runs is taken. If NULL, clustering is run only once.

minimumLibraryOverlap

for two contigs to be clustered together, the strand inheritance must be present for both contigs in at least this many libraries (in addition to their similarity being at least similarityCutoff)

randomise

whether to reorder contigs before clustering

randomSeed

random seed to initialize clustering

randomWeight

vector of weights for contigs for resampling. If NULL, uniform resampling is used. Typically this should be a measure of contig quality, such as library coverage, so that clustering tends to start from the better quality contigs.

clusterParam

optional BiocParallelParam specifying cluster to use for parallel execution. When NULL, execution will be serial.

clusterBy

Method for performing clustering. Default is 'hetero' (for comparing heterozygous calls to homozygous). The alternative is 'homo' (for comparison between the two homozygous calls).

verbose

whether to print function progress

Details

Note that a more stringent similarity cutoff will result in more clusters, and a longer run time, since at every iteration a distance is computed to the existing clusters. However, in lower-quality data, a more stringent cutoff may be necessary to reduce the number of contigs that are erroneously grouped.
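
The effect of the cutoff on the number of clusters can be seen in a toy sketch. This is not contiBAIT's implementation: `greedyCluster`, the made-up `states` matrix, and the rule of comparing each contig only to the first member of each existing cluster are all assumptions chosen for illustration, written in base R.

```r
# Toy illustration only: NOT the algorithm used by clusterContigs.
# Rows are contigs, columns are libraries; values are invented strand states.
states <- rbind(
  a = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
  b = c(1, 2, 1, 2, 1, 2, 1, 2, 2, 1),  # agrees with a in 8 of 10 libraries
  c = c(1, 2, 1, 2, 1, 2, 2, 1, 2, 1),  # agrees with a in 6 of 10 libraries
  d = c(2, 1, 2, 1, 2, 1, 2, 1, 2, 1)   # agrees with a in 0 of 10 libraries
)

# Hypothetical helper: assign each contig to the most similar existing
# cluster if that similarity reaches the cutoff, otherwise open a new cluster.
greedyCluster <- function(states, cutoff, minOverlap = 5) {
  reps <- integer(0)                      # row index of each cluster's first member
  membership <- integer(nrow(states))
  for (i in seq_len(nrow(states))) {
    best <- -Inf
    bestCluster <- NA_integer_
    for (k in seq_along(reps)) {
      # Only libraries with a call for both contigs are compared
      shared <- !is.na(states[i, ]) & !is.na(states[reps[k], ])
      if (sum(shared) < minOverlap) next
      sim <- mean(states[i, shared] == states[reps[k], shared])
      if (sim >= cutoff && sim > best) {
        best <- sim
        bestCluster <- k
      }
    }
    if (is.na(bestCluster)) {             # nothing similar enough: new cluster
      reps <- c(reps, i)
      bestCluster <- length(reps)
    }
    membership[i] <- bestCluster
  }
  split(rownames(states), membership)
}

# Number of clusters found at increasingly stringent cutoffs
clusterCounts <- sapply(c(0.5, 0.7, 0.9),
                        function(cutoff) length(greedyCluster(states, cutoff)))
print(clusterCounts)
```

In this toy setting each new contig is compared against every existing cluster, so a stricter cutoff, which leaves more clusters open, also means more comparisons per contig.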

Note that clusterParam requires BiocParallel to be installed.
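
A minimal sketch of passing a parallel back end, assuming contiBAIT and BiocParallel are both installed. The worker count, the number of reclustering runs, and the seed are arbitrary illustrative values, not recommendations. `MulticoreParam` relies on forking, so on Windows `SnowParam(workers = 2)` would be the usual substitute.

```r
library(contiBAIT)
library(BiocParallel)

data("exampleWCMatrix")

# Two forked workers; the count is an arbitrary choice for illustration
param <- MulticoreParam(workers = 2)

# Repeat the clustering 10 times and take the consensus, using the
# BiocParallelParam supplied through clusterParam
consensusGroups <- clusterContigs(exampleWCMatrix,
                                  recluster = 10,
                                  clusterParam = param,
                                  randomSeed = 1,
                                  verbose = FALSE)
show(consensusGroups)
```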

Value

LinkageGroupList of vectors containing labels of contigs belonging to each linkage group

Examples

data("exampleWCMatrix")
 
clusteredContigs <- clusterContigs(exampleWCMatrix, verbose=FALSE)
show(clusteredContigs)
show(clusteredContigs[[1]])
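
A further sketch, using only the arguments documented above, compares the default cutoff with a stricter one on the same example data. The cutoff of 0.9 and the seed are arbitrary illustrative values, and the use of `length()` assumes the returned LinkageGroupList behaves like a list, as the `[[1]]` indexing above suggests.

```r
library(contiBAIT)

data("exampleWCMatrix")

# Fix the seed so the randomised contig order is the same in both runs
defaultGroups <- clusterContigs(exampleWCMatrix,
                                randomSeed = 1,
                                verbose = FALSE)

# Stricter similarity cutoff; 0.9 is an arbitrary value for comparison
strictGroups <- clusterContigs(exampleWCMatrix,
                               similarityCutoff = 0.9,
                               randomSeed = 1,
                               verbose = FALSE)

# Compare how many linkage groups each setting produced
groupCounts <- c(default = length(defaultGroups),
                 strict  = length(strictGroups))
print(groupCounts)
```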

oneillkza/ContiBAIT documentation built on June 1, 2020, 5:49 a.m.