Writes compressed FASTQ files whose sequence sections consist of concatenated k-mers drawn uniformly from the set of all k-mers for the given k. In a first step, the function writes a batch of FASTQ files containing randomly simulated DNA sequences. In a second step, it repeatedly writes FASTQ files with random DNA sequences in which a fraction of the reads is 'contaminated' with given DNA k-mers. In a third step, a hierarchical clustering (HC) tree based on DNA k-mers is calculated for each set of simulated and contaminated files. For each set of files, the size of the smaller fraction in the first half of the tree is counted (perc). This value can be used as a measure of the separation capability of the HC algorithm.
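The three steps can be illustrated with a small in-memory sketch in base R. It is not the code of sim_fq: no FASTQ files are written, and every helper name (index_to_kmer, sim_read, sim_reads, count_kmers), the k-mer encoding A=0, C=1, G=2, T=3, the way reads are contaminated and the way perc is counted are assumptions made for illustration only.

```r
set.seed(1)

k <- 3                       # k-mer length (illustrative value)
bases <- c("A", "C", "G", "T")

# Convert zero-based k-mer indices to k-mer strings (assumed encoding A=0 ... T=3).
index_to_kmer <- function(idx, k) {
  vapply(idx, function(i) {
    digits <- (i %/% 4^((k - 1):0)) %% 4
    paste(bases[digits + 1], collapse = "")
  }, character(1))
}

# One simulated read: concatenation of uniformly drawn k-mers.
sim_read <- function(n_concat, k) {
  idx <- sample.int(4^k, n_concat, replace = TRUE) - 1
  paste(index_to_kmer(idx, k), collapse = "")
}

# One simulated "file": n_seq reads, of which n_contam start with a fixed k-mer.
sim_reads <- function(n_seq, n_concat, k, n_contam = 0, contam_kmer = "CCC") {
  reads <- vapply(seq_len(n_seq), function(i) sim_read(n_concat, k), character(1))
  if (n_contam > 0) {
    hit <- sample.int(n_seq, n_contam)
    reads[hit] <- paste0(contam_kmer, substring(reads[hit], k + 1))
  }
  reads
}

# Count all overlapping k-mers in a set of equally long reads.
count_kmers <- function(reads, k) {
  starts <- seq_len(nchar(reads[1]) - k + 1)
  kmers <- unlist(lapply(reads, function(r) substring(r, starts, starts + k - 1)))
  all_kmers <- index_to_kmer(seq_len(4^k) - 1, k)
  as.vector(table(factor(kmers, levels = all_kmers)))
}

# Steps 1 and 2: a group of purely random files and a group of contaminated files.
gr_size <- 4
clean  <- replicate(gr_size, count_kmers(sim_reads(100, 10, k), k))
contam <- replicate(gr_size, count_kmers(sim_reads(100, 10, k, n_contam = 50), k))
counts <- t(cbind(clean, contam))            # rows = files, columns = k-mers
rownames(counts) <- c(paste0("sim", seq_len(gr_size)), paste0("con", seq_len(gr_size)))

# Step 3: hierarchical clustering on relative k-mer frequencies.
hc <- hclust(dist(counts / rowSums(counts)))

# One possible reading of perc: among the first half of the leaves (in tree
# order), count the members of the smaller group. 0 means perfect separation.
first_half <- hc$labels[hc$order][seq_len(gr_size)]
n_sim <- sum(startsWith(first_half, "sim"))
perc <- min(n_sim, gr_size - n_sim)
```

Under this reading, perc ranges from 0 (the two groups occupy separate halves of the tree) to grSize/2 (the groups are fully mixed).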
nRep
nContamVec
grSize
nSeq
k
kIndex
pos
The function is intended as an exploratory tool, not for routine quality assessment. It writes several files and produces a lot of output on the terminal. It is therefore recommended to switch to a separate working directory and to run the function in a separate terminal. The function is not exported.
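Switching to a separate working directory can be done with base R alone. The following is a minimal sketch, assuming a hypothetical wrapper run_in_scratch_dir (not part of seqTools) that runs any file-writing function inside a fresh temporary directory and restores the previous working directory afterwards. The function passed in below is a stand-in, not sim_fq itself.

```r
# Hypothetical helper: run a file-writing function inside a scratch directory.
run_in_scratch_dir <- function(fun, ...) {
  scratch <- tempfile("sim_fq_")
  dir.create(scratch)
  old_wd <- setwd(scratch)
  on.exit(setwd(old_wd), add = TRUE)   # restore the working directory on exit
  result <- fun(...)
  list(result = result, dir = scratch, files = list.files(scratch))
}

# Stand-in for a function that writes files into the working directory.
out <- run_in_scratch_dir(function() {
  writeLines(c("@r1", "ACGT", "+", "IIII"), "demo.fq")
  TRUE
})
```

The same wrapper could presumably be applied to the call from the example, e.g. run_in_scratch_dir(seqTools:::sim_fq, nRep=2, nContamVec=c(10, 100), grSize=4, nSeq=1e2), so that all written files end up in out$dir.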
data.frame containing the counted perc values for each repetition of the simulation.
Wolfgang Kaisers
kMerIndex("CCCCCC")
## Not run: 
res <- seqTools:::sim_fq(nRep=2, nContamVec=c(10, 100),
    grSize=4, nSeq=1e2)
## End(Not run)
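The kMerIndex call in the example maps a DNA k-mer to its position in the array of all k-mers. How such an index can be computed is sketched below as a base-4 positional encoding. The function kmer_index_sketch, its offset argument and the encoding A=0, C=1, G=2, T=3 are assumptions for illustration; the values returned by seqTools may follow a different convention (for example a different index base).

```r
# Hypothetical re-implementation of a k-mer index as a base-4 number.
kmer_index_sketch <- function(kmer, offset = 0) {
  code <- c(A = 0, C = 1, G = 2, T = 3)    # assumed nucleotide encoding
  vapply(kmer, function(s) {
    digits <- code[strsplit(toupper(s), "")[[1]]]
    if (anyNA(digits)) stop("only A, C, G and T are supported in this sketch")
    k <- length(digits)
    sum(digits * 4^((k - 1):0)) + offset   # most significant digit first
  }, numeric(1), USE.NAMES = FALSE)
}

idx <- kmer_index_sketch("CCCCCC")   # 1*(4^5 + 4^4 + 4^3 + 4^2 + 4 + 1) = 1365
```

With offset = 1 the same k-mer would map to 1366, which is the form needed for one-based indexing into an R vector of length 4^k.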