
View source: R/qualityControl.R

This function evaluates the sequence complexity using the DUST algorithm.

```
complexity.dust(object, xlab="Complexity score (0=high, 100=low)", ylab="Number of sequences",
    xlim=c(0, 100), col="firebrick1", breaks=100, ...)
```

| Argument | Description |
|---|---|
| `object` | An object of class DNAStringSet, ShortRead or SFFContainer. |
| `xlab` | The X axis label. |
| `ylab` | The Y axis label. |
| `xlim` | The limits of the X axis. |
| `col` | The plotting color. |
| `breaks` | The number of breaks in the histogram (see `hist`). |
| `...` | Arguments to be passed to methods, such as graphical parameters (see `par`). |

The complexity score is based on how often different trinucleotides occur and is scaled between 0 and 100, where higher scores indicate lower complexity. A homopolymer repeat (e.g. TTTTTTTTTT) has a score of 100, a dinucleotide repeat (e.g. TATATATATA) has a score of around 49, and a trinucleotide repeat (e.g. TAGTAGTAG) has a score of around 32. Sequences with scores above 7 can be considered low-complexity.
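To make the scoring idea concrete, here is a minimal base R sketch of a DUST-style score for a single window. It is an illustration under stated assumptions, not the package's implementation: the helper name `dust_window_score` is hypothetical, the whole sequence is treated as one window, and the result is normalized by the score a homopolymer of the same length would get. Any windowing or rounding that `complexity.dust` applies is not reproduced here.

```r
# Hypothetical helper (not part of the package): DUST-style score for one window.
dust_window_score <- function(s) {
  n <- nchar(s)
  stopifnot(n >= 4)                       # need at least two trinucleotides

  # All overlapping trinucleotides in the sequence
  starts   <- seq_len(n - 2)
  triplets <- substring(s, starts, starts + 2)
  counts   <- as.integer(table(triplets))
  l        <- length(triplets)

  # Sum of c * (c - 1) / 2 over trinucleotide counts, divided by (l - 1)
  raw <- sum(counts * (counts - 1) / 2) / (l - 1)

  # A homopolymer of the same length gives raw = l / 2; scale that to 100
  100 * raw / (l / 2)
}

homopolymer   <- "TTTTTTTTTT"
dinucleotide  <- strrep("TA", 32)                  # 64 bases
trinucleotide <- paste0(strrep("TAG", 21), "T")    # 64 bases

dust_window_score(homopolymer)     # 100
dust_window_score(dinucleotide)    # about 49
dust_window_score(trinucleotide)   # about 32
```

With 64-base repeats this sketch gives values close to the ones quoted above. For much shorter repeats the single-window value can differ, because fewer trinucleotides contribute to the counts.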

A numeric vector containing the complexity score for each sequence.
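Because the result is a plain numeric vector, the low-complexity threshold of 7 can be applied with ordinary vector indexing. The sketch below uses made-up scores and made-up read names in place of a real call to `complexity.dust`; whether the returned vector carries names is an assumption made only for readability.

```r
# Hypothetical scores standing in for the vector returned by complexity.dust()
scores <- c(read1 = 2, read2 = 49, read3 = 7, read4 = 100)

# Scores above 7 are treated as low-complexity
threshold      <- 7
low_complexity <- scores > threshold

# Names of the reads flagged as low-complexity
flagged <- names(scores)[low_complexity]

# Scores of the reads that pass the filter
kept <- scores[!low_complexity]
```

The same logical vector can be used to subset the original sequence object, provided the scores are in the same order as the sequences.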

Christian Ruckert

Schmieder R. and Edwards R. (2011) Quality control and preprocessing of metagenomic datasets.
*Bioinformatics*, 27(6):863-864.
