cbDistMatrix: cbDistMatrix function: Calculates pairwise distance matrix...
In seqTools: Analysis of nucleotide, sequence and quality content on fastq files.

Description Usage Arguments Details Value Note Author(s) References See Also Examples

Calculates pairwise distance matrix from DNA k-mer counts based on a modified Canberra distance. Before calculating canberra distances, read counts are normalized (in order to correct systematic effects on the distance) by saling up read counts in each DNA k-mer count vector so that normalized read counts in each sample are nearly equal.

1	cbDistMatrix(object,nReadNorm=max(nReads(object)))

object

Fastqq: Object from which DNA k-mer counts are used.

nReadNorm

numeric: Number of reads per file to wich all contained DNA k-mer counts are normalized. Because the normalization is intended to increase counts the value must be greater than all fastq file read counts (as reported by nReads). Therefore the standard value is chosen to the maximal number of reads recorded in this object. This normalization is necessary to compensate for systematic effects in the canberra distance.

The distance between two DNA k-mer normalized count vectors is calculated by

df_0(X,Y)=\frac{∑_{i=1}^n cbd(x_i,y_i)}{4^k}

where cb is given by

cbd(x,y)=\frac{|x-y|}{x+y}

Square matrix. The number of rows equals the number of files (=nFiles(object)).

The static size of the retured k-mer array is 4^k.

Wolfgang Kaisers

Cock PJA, Fields CJ, Goto N, Heuer ML, Rice PM The sanger fastq file format for sequences with quality scores and the Solexa/Illumina fastq variants. Nucleic Acids Research 2010 Vol.38 No.6 1767-1771

hclust

basedir<-system.file("extdata",package="seqTools")
basenames<-c("g4_l101_n100.fq.gz","g5_l101_n100.fq.gz")
filenames<-file.path(basedir,basenames)
fq<-fastqq(filenames,6,c("g4","g5"))
dm<-cbDistMatrix(fq)

Loading required package: zlibbioc
[fastqq] File ( 1/2) '/usr/lib/R/site-library/seqTools/extdata/g4_l101_n100.fq.gz'	done.
[fastqq] File ( 2/2) '/usr/lib/R/site-library/seqTools/extdata/g5_l101_n100.fq.gz'	done.