Calculates a pairwise distance matrix from DNA k-mer counts based on a modified Canberra distance. Before the Canberra distances are calculated, read counts are normalized (to correct systematic effects on the distance) by scaling up the read counts in each DNA k-mer count vector so that the normalized read counts in all samples are nearly equal.
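The normalization step can be pictured with a small base-R sketch. This is only an illustration under stated assumptions: the toy counts, the names counts, n_reads and n_read_norm are hypothetical, and multiplying each count vector by n_read_norm / n_reads followed by rounding is an assumed scaling rule, not the confirmed behaviour of the package.

```r
# Toy k-mer count matrix: rows are k-mers, columns are samples (invented values).
counts <- matrix(c(4, 2, 2, 2,
                   8, 4, 4, 4),
                 ncol = 2,
                 dimnames = list(c("A", "C", "G", "T"), c("s1", "s2")))

# Hypothetical read counts per sample.
n_reads <- c(s1 = 10, s2 = 20)

# Mirrors the default nReadNorm = max(nReads(object)).
n_read_norm <- max(n_reads)

# Assumption: each column is multiplied by n_read_norm / n_reads and rounded.
scale_factor <- n_read_norm / n_reads
normalized <- round(sweep(counts, 2, scale_factor, `*`))
```

In this toy case the smaller sample is scaled up by a factor of 2, so both columns end up on the same scale before any distance is computed.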
cbDistMatrix(object, nReadNorm = max(nReads(object)))
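object | Object returned by fastqq (as in the example), holding the DNA k-mer counts.
nReadNorm | Read count to which the k-mer counts of each sample are scaled up. Defaults to max(nReads(object)).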
The distance between two normalized DNA k-mer count vectors X and Y is calculated by
df_0(X,Y)=\frac{\sum_{i=1}^{n} cbd(x_i,y_i)}{4^k}
where cbd is given by
cbd(x,y)=\frac{|x-y|}{x+y}.
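As a rough illustration of the formula above, the following base-R sketch computes the distance for two already-normalized count vectors. The helper names cbd_sketch and df0_sketch are hypothetical, and treating terms with x + y = 0 as 0 is an assumption; the source does not say how the package handles that case.

```r
# Hypothetical helper: per-element Canberra term |x - y| / (x + y).
# Assumption: terms where x + y == 0 contribute 0.
cbd_sketch <- function(x, y) {
  s <- x + y
  ifelse(s == 0, 0, abs(x - y) / s)
}

# Hypothetical helper: sum of the terms divided by the vector length (4^k).
df0_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(cbd_sketch(x, y)) / length(x)
}

# Toy normalized count vectors for k = 1 (4^1 = 4 entries, invented values).
x <- c(10, 0, 5, 5)
y <- c(10, 0, 15, 0)
d <- df0_sketch(x, y)
```

Because each term lies between 0 and 1 and the sum is divided by the number of k-mers, the resulting distance also lies between 0 and 1.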
Square matrix. The number of rows equals the number of files (= nFiles(object)).
The static size of the returned k-mer array is 4^k.
Wolfgang Kaisers
Cock PJA, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research 2010, Vol. 38, No. 6, 1767-1771.
hclust
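# Locate the example FASTQ files shipped with the package
basedir <- system.file("extdata", package = "seqTools")
basenames <- c("g4_l101_n100.fq.gz", "g5_l101_n100.fq.gz")
filenames <- file.path(basedir, basenames)
# Count 6-mers in both files, labelled "g4" and "g5"
fq <- fastqq(filenames, 6, c("g4", "g5"))
# Pairwise modified Canberra distances between the files
dm <- cbDistMatrix(fq)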
Loading required package: zlibbioc
[fastqq] File ( 1/2) '/usr/lib/R/site-library/seqTools/extdata/g4_l101_n100.fq.gz' done.
[fastqq] File ( 2/2) '/usr/lib/R/site-library/seqTools/extdata/g5_l101_n100.fq.gz' done.
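Since the returned value is a square matrix, it can be passed to hclust (listed under See Also) after conversion with as.dist. The sketch below uses a hypothetical toy matrix dm_toy with invented values in place of a real cbDistMatrix result, so that it runs without the example files; the choice of average linkage is likewise only an assumption for illustration.

```r
# Toy stand-in for a cbDistMatrix result (values are invented for illustration).
dm_toy <- matrix(c(0.00, 0.10, 0.40,
                   0.10, 0.00, 0.35,
                   0.40, 0.35, 0.00),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("s1", "s2", "s3"), c("s1", "s2", "s3")))

# hclust expects a "dist" object, so convert the square matrix first.
hc <- hclust(as.dist(dm_toy), method = "average")

# Cut the tree into two groups; s1 and s2 are closest and end up together.
groups <- cutree(hc, k = 2)
```

With a real result, dm from the example above would take the place of dm_toy.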