estGcDistn: Estimate a GC Content Distribution From Sequences
In UofABioinformaticsHub/fastqcReports: Load FastqQC reports and other NGS related files

estGcDistn

R Documentation

Estimate a GC Content Distribution From Sequences

Description

Generate a theoretical GC content distribution from a set of sequences for a given read length and fragment length.

Usage

estGcDistn(x, n = 1e+06, rl = 100, fl = 200, fragSd = 30, bins = 101, ...)

## S4 method for signature 'ANY'
estGcDistn(x, n = 1e+06, rl = 100, fl = 200, fragSd = 30, bins = 101, ...)

## S4 method for signature 'character'
estGcDistn(x, n = 1e+06, rl = 100, fl = 200, fragSd = 30, bins = 101, ...)

## S4 method for signature 'DNAStringSet'
estGcDistn(x, n = 1e+06, rl = 100, fl = 200, fragSd = 30, bins = 101, ...)

Arguments

`x`	`DNAStringSet` or path to a fasta file
`n`	The number of reads to sample
`rl`	Read Lengths to sample
`fl`	The mean of the fragment lengths sequenced
`fragSd`	The standard deviation of the fragment lengths being sequenced
`bins`	The number of bins to estimate
`...`	Not used

Details

The function takes the supplied object and returns the theoretical GC content distribution. A fixed read length gives an essentially discrete distribution, so the `bins` argument defines the number of bins returned. The default of 101 covers 0 to 100% inclusive.
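To see why a fixed read length gives a discrete distribution, the following base R sketch samples fixed-length reads from a toy random sequence and tabulates their GC proportions. It is only an illustration and not the package's implementation: the names `genome`, `starts`, `gc`, `gcCounts`, `isDiscrete` and `freq` are hypothetical, and fragment length sampling (`fl`, `fragSd`) is left out.

```r
# Toy illustration only: not the code used inside estGcDistn()
set.seed(1)
rl <- 100                      # read length, mirroring the `rl` argument
n <- 500                       # number of toy reads (kept small on purpose)
genome <- sample(c("A", "C", "G", "T"), 5000, replace = TRUE)

# Random read start positions, then the GC proportion of each read
starts <- sample.int(length(genome) - rl + 1, n, replace = TRUE)
gc <- vapply(
  starts,
  function(s) mean(genome[s:(s + rl - 1)] %in% c("G", "C")),
  numeric(1)
)

# Every proportion equals k / rl for an integer k,
# so at most rl + 1 distinct values can occur
gcCounts <- round(gc * rl)
isDiscrete <- all(abs(gc * rl - gcCounts) < 1e-8)

# Tabulate onto the 0%..100% grid: 101 bins when rl = 100
freq <- tabulate(gcCounts + 1, nbins = rl + 1) / n
```

With `rl = 100` the possible values line up exactly with the 101 default bins. Other read lengths give a support of `k / rl` that does not.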

The returned values are obtained by interpolating the values obtained during sampling. This avoids the gaps and jumps that would otherwise appear in the returned distribution when the read length (`rl`) is not a multiple of 100.
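The interpolation step can be sketched in base R with `approx()`. This is an assumption about the general approach and the package's actual routine may differ. The binomial frequencies are stand-ins for sampled values, `prob = 0.45` is arbitrary, and a plain `data.frame` is used here in place of a tibble.

```r
# Hypothetical sketch of interpolating onto a fixed grid of bins
rl <- 75                       # a read length that is not a multiple of 100
bins <- 101                    # mirroring the `bins` argument
k <- 0:rl

# Stand-in for frequencies obtained by sampling (arbitrary binomial shape)
sampled <- dbinom(k, size = rl, prob = 0.45)

# Linear interpolation from the k / rl support onto `bins` evenly spaced points
grid <- seq(0, 1, length.out = bins)
interp <- approx(x = k / rl, y = sampled, xout = grid, rule = 2)$y

# Rescale so the interpolated frequencies sum to 1
out <- data.frame(GC_Content = grid, Freq = interp / sum(interp))
```

Without this step, binning the 76 possible values for `rl = 75` into 101 bins would leave some bins empty, which is the source of the gaps and jumps described above.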

Based heavily on https://github.com/mikelove/fastqcTheoreticalGC

Value

A tibble with two columns: GC_Content and Freq, denoting the proportion of GC and the frequency of occurrence, respectively.

Examples

faDir <- system.file("extdata", package = "ngsReports")
faFile <- list.files(faDir, pattern = "fasta", full.names = TRUE)
df <- estGcDistn(faFile, n = 200)

UofABioinformaticsHub/fastqcReports documentation built on June 10, 2025, 11:01 a.m.
