GCbias: GCbias


GCbias    R Documentation

GCbias

Description

Plot GC content versus read counts.

Usage

GCbias(
  bamFiles,
  bamNames = bamFiles,
  minMQS = 255,
  maxFrag = 500,
  pe = "none",
  restrict = "chr11",
  winWidth = 5000,
  col = inferno,
  genome,
  GCprob = TRUE,
  span = 0.1,
  plot = TRUE,
  logCPM = TRUE,
  priorCount = 1
)

Arguments

bamFiles

Character vector containing the filenames (including the full path) of read alignment files in BAM format.

bamNames

Character vector containing the names to describe the bamFiles you are using (for example: "H3K9me3_reads"). If no names are supplied, the full bamFiles names are used.

minMQS

Integer scalar specifying the minimum mapping quality that a read must have to be included. Default is 255, which eliminates multimapping reads if the STAR aligner was used to generate the bamFiles.

maxFrag

Integer scalar, specifying the maximum fragment length corresponding to a read pair. Defaults to 500 base pairs.

pe

Character scalar indicating whether paired-end data is present; set to "none" (the default), "both", "first" or "second".

restrict

Character vector containing the names of allowable chromosomes from which reads will be extracted. Default is "chr11".

winWidth

Integer scalar specifying the width of the windows in which reads are counted and GC content is calculated. Default is 5000 base pairs.

col

Color scheme for the smooth scatter plots. If not provided, viridis::inferno is used.

genome

BSgenome object. Required parameter. For example, use BSgenome.Mmusculus.UCSC.mm10 for mouse.

GCprob

Logical scalar indicating whether the GC content should be displayed as absolute counts (GCprob = FALSE) or as the fraction of GCs (GCprob = TRUE, the default).

span

Numeric scalar specifying the span used for the loess trendline. Default is 0.1.

plot

Logical scalar. If TRUE (the default), the output is plotted; otherwise the matrix used to generate the plots is returned.

logCPM

Logical scalar indicating whether the cpm should be reported on a log scale. Default is TRUE.

priorCount

Numeric scalar giving the prior count used when calculating cpm. Default is 1.

Details

This function generates a scatter plot with the GC content (the number of Gs and Cs, or their fraction if GCprob = TRUE) on the x-axis and the read count (cpm) on the y-axis, computed in windows of winWidth bp across the chromosomes given in restrict. A separate plot is generated for each read alignment file in bamFiles. Both single-end and paired-end experiments are supported. These plots allow the user to check for a potential GC bias in the (ChIP-seq) data.
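The idea behind the plot can be sketched on simulated data using base R only. This is an illustrative sketch, not the MiniChip implementation: the toy windows, the simulated read counts, the simplified log2 cpm formula, and every variable name apart from winWidth and priorCount are assumptions made for the example. A wider loess span than the function's default is used because the toy data set is small.

```r
# Illustrative sketch only; this is NOT how MiniChip computes its values.
set.seed(42)

nWin       <- 200   # number of toy windows (assumption)
winWidth   <- 500   # window width in bp (smaller than the default, for speed)
priorCount <- 1     # prior count added before computing cpm

# Simulate one random sequence per window with a window-specific GC
# probability, then measure the observed GC fraction of each window.
gcTarget <- runif(nWin, min = 0.3, max = 0.7)
gcFrac <- vapply(gcTarget, function(p) {
  bases <- sample(c("G", "C", "A", "T"), size = winWidth, replace = TRUE,
                  prob = c(p / 2, p / 2, (1 - p) / 2, (1 - p) / 2))
  mean(bases %in% c("G", "C"))
}, numeric(1))

# Simulate read counts whose expectation increases with GC content,
# mimicking a GC-biased library.
counts <- rpois(nWin, lambda = 20 + 100 * gcFrac)

# Simplified log2 counts per million (assumed formula for illustration).
logCpm <- log2((counts + priorCount) / sum(counts + priorCount) * 1e6)

# Fit a loess trend and draw it on top of the scatter plot.
fit <- loess(logCpm ~ gcFrac, span = 0.5)
ord <- order(gcFrac)
plot(gcFrac, logCpm, pch = 16, col = "grey40",
     xlab = "GC fraction per window", ylab = "log2 cpm")
lines(gcFrac[ord], fitted(fit)[ord], col = "red", lwd = 2)
```

A trendline that rises or falls with GC fraction, as in this simulated case, is the kind of pattern that would suggest a GC bias; a roughly flat trendline would not.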

Value

If plot = TRUE, one scatter plot per read alignment file, showing the GC content on the x-axis and the read count (cpm) on the y-axis in windows of winWidth bp. A loess trendline is added so that a potential GC bias trend in the data can be seen. If plot = FALSE, the matrix used to generate the plots is returned instead.

Examples

library(BSgenome.Mmusculus.UCSC.mm10)
# Use the first two example BAM files shipped with MiniChip
bamFiles <- list.files(system.file("extdata", package = "MiniChip"),
                       full.names = TRUE, pattern = "bam$")[1:2]
# Derive short sample names by dropping the directory and the file suffix
bamNames <- sub("_chr11.bam", "", basename(bamFiles), fixed = TRUE)
GCbias(bamFiles = bamFiles, bamNames = bamNames,
       genome = BSgenome.Mmusculus.UCSC.mm10)


fmi-basel/gbuehler-MiniChip documentation built on June 13, 2025, 6:15 a.m.