generateDistn: Generate theoretical GC content distributions


Description

This function simulates random reads from either the provided seqs (best for RNA-seq) or a genome (best for DNA-seq). The GC content of these reads is then tabulated to produce a distribution file that MultiQC can read and display on top of the FASTQC GC content module. Exactly one of seqs or genome must be specified. Specifying genome also requires specifying nchrom.
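The idea can be illustrated with a minimal base R sketch. It is not the package's implementation: plain character strings stand in for a DNAStringSet, the helper name simulate_gc and the toy sequences are hypothetical, and no MultiQC file is written. It samples fixed-length reads at random positions, computes the GC percentage of each read, and tabulates the result over 0 to 100 percent.

```r
# Toy stand-in for `seqs`: plain character strings, not a DNAStringSet.
set.seed(1)
seqs <- c(
  tx1 = paste(sample(c("A", "C", "G", "T"), 500, replace = TRUE), collapse = ""),
  tx2 = paste(sample(c("G", "C"), 300, replace = TRUE), collapse = "")
)

# Hypothetical helper: simulate n reads of length bp and tabulate their GC%.
simulate_gc <- function(seqs, n = 1000, bp = 100) {
  stopifnot(all(nchar(seqs) >= bp))
  # Choose a source sequence for each read (equal weights in this sketch).
  idx <- sample(seq_along(seqs), n, replace = TRUE)
  # Choose a start position so that the read fits inside its sequence.
  max_start <- nchar(seqs)[idx] - bp + 1
  start <- floor(runif(n) * max_start) + 1
  reads <- substring(seqs[idx], start, start + bp - 1)
  # GC percentage per read, rounded to a whole percent.
  gc_count <- nchar(gsub("[^GC]", "", reads))
  gc_pct <- round(100 * gc_count / bp)
  # Count reads at each percentage 0..100 and convert to proportions.
  counts <- tabulate(gc_pct + 1, nbins = 101)
  data.frame(gc = 0:100, density = counts / sum(counts))
}

distn <- simulate_gc(seqs, n = 2000, bp = 100)
```

Here distn holds one row per GC percentage. Because tx2 contains only G and C, the toy distribution has a visible spike at 100 percent in addition to the peak near 50 percent from tx1.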

Usage

generateDistn(seqs, genome, nchrom, file = "fastqc_theoretical_gc.txt",
  n = 1e+06, bp = 100, wts = 1, name = "")

Arguments

seqs

a DNAStringSet of the sequences to simulate reads from, e.g. for RNA-seq, the transcripts, which can be generated with extractTranscriptSeqs from the GenomicFeatures package. See the example script located in inst/script/human_mouse.R

genome

a BSgenome object. See the example script located in inst/script/human_mouse.R

nchrom

the number of chromosomes from the genome to simulate reads from

file

the path of the file to write out

n

the number of reads to simulate

bp

the length of the reads, in base pairs

wts

optional weights to go along with the seqs or the chromosomes in genome, e.g. to represent more realistic expression of transcripts

name

the name to be printed at the top of the file
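To illustrate what weights such as wts can express, the following base R sketch draws the source sequence of each simulated read with probability proportional to a weight vector. It is an assumption about the general technique, not a description of how generateDistn uses wts internally; the sequence names and weight values are hypothetical.

```r
set.seed(42)
seq_names <- c("tx1", "tx2", "tx3")
# Hypothetical relative expression levels; tx3 is not expressed.
wts <- c(10, 1, 0)
n <- 10000

# Draw the source sequence of each read in proportion to its weight.
idx <- sample(seq_along(seq_names), n, replace = TRUE, prob = wts)

# Share of simulated reads assigned to each sequence.
read_share <- table(factor(seq_names[idx], levels = seq_names)) / n
```

In this sketch most reads come from tx1 and none from tx3, so the resulting GC distribution would be dominated by the GC content of the highly weighted sequences.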

Value

the name of the file which was written

References

MultiQC: http://multiqc.info/

FASTQC: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/


mikelove/fastqcTheoreticalGC documentation built on May 22, 2019, 10:52 p.m.