genCov: Construct a region of interest coverage plot
In griffithlab/GenVisR: Genomic Visualizations in R

genCov

R Documentation

Construct a region of interest coverage plot

Description

Given a list of data frames construct a sequencing coverage view over a region of interest.

Usage

genCov(
  x,
  txdb,
  gr,
  genome,
  reduce = FALSE,
  gene_colour = NULL,
  gene_name = "Gene",
  gene_plotLayer = NULL,
  label_bgFill = "black",
  label_txtFill = "white",
  label_borderFill = "black",
  label_txtSize = 10,
  lab2plot_ratio = c(1, 10),
  cov_colour = "blue",
  cov_plotType = "point",
  cov_plotLayer = NULL,
  base = c(10, 2, 2),
  transform = c("Intron", "CDS", "UTR"),
  gene_labelTranscript = TRUE,
  gene_labelTranscriptSize = 4,
  gene_isoformSel = NULL,
  out = "plot",
  subsample = FALSE
)

Arguments

`x`	Named list with list elements containing data frames representing samples. Data frame rows should represent read pileups observed in sequencing data. Data frame column names must include "end" and "cov" corresponding to the base end position and coverage of a pileup respectively. Data within data frames must be on the same chromosome as the region of interest, see details!
`txdb`	Object of class TxDb giving transcription meta data for a genome assembly. See Bioconductor annotation packages.
`gr`	Object of class GRanges specifying the region of interest and corresponding to a single gene. See Bioconductor package GRanges.
`genome`	Object of class BSgenome specifying the genome sequence of interest. See Bioconductor annotation packages.
`reduce`	Boolean specifying whether to collapse gene isoforms within the region of interest into one representative transcript. Experimental use with caution!
`gene_colour`	Character string specifying the colour of the gene to be plotted in the gene track.
`gene_name`	Character string specifying the name of the gene or region of interest.
`gene_plotLayer`	Valid ggplot2 layer to be added to the gene sub-plot.
`label_bgFill`	Character string specifying the desired background colour of the track labels.
`label_txtFill`	Character string specifying the desired text colour of the track labels.
`label_borderFill`	Character string specifying the desired border colour of the track labels.
`label_txtSize`	Integer specifying the size of the text within the track labels.
`lab2plot_ratio`	Numeric vector of length 2 specifying the ratio of track labels to plot space.
`cov_colour`	Character string specifying the colour of the data in the coverage plots.
`cov_plotType`	Character string specifying one of "line", "bar" or "point". Changes the ggplot2 geom which constructs the data display.
`cov_plotLayer`	Valid ggplot2 layer to be added to the coverage sub-plots.
`base`	Numeric vector of log bases to transform the data corresponding to the elements supplied to the variable transform See details.
`transform`	Character vector specifying what objects to log transform, accepts "Intron", "CDS", and "UTR" See details.
`gene_labelTranscript`	Boolean specifying whether to plot the transcript names in the gene plot.
`gene_labelTranscriptSize`	Integer specifying the size of the transcript name text in the gene plot.
`gene_isoformSel`	Character vector specifying the names (from the txdb object) of isoforms within the region of interest to display.
`out`	Character vector specifying the object to output, one of "data", "grob", or "plot", defaults to "plot" (see returns).
`subsample`	Boolean value specifying whether to reduce the provided coverage data to a subset of approximately 1000 points. Used to generate sparse plots that use less disk space and are faster to render.

Details

genCov is a function designed construct a series of tracks based on a TxDb object giving transcript features, and coverage data supplied to parameter 'x'. The function will look at a region of interest specified by the argument supplied to gr and plot transcript features and the corresponding coverage information. The argument supplied to 'genome' enables gc content within genomic features to be calculated and displayed. The argument supplied to x must contain data on the same chromosome as the region of interest specified in the parameter 'gr'!

Typically, introns of a transcript are much larger than exons, while exons are sometimes of greater interest. To address this, genCov will by default scale the x-axis to expand track information according to region type: coding sequence (CDS), untranslated region (UTR), or intron / intergenic (Intron). The amount by which each region is scaled is controlled by the 'base' and 'transform' arguments. 'transform' specifies which regions to scale, and 'base' corresponds to the log base transform to apply to those regions. To keep one or more region types from being scaled, omit the corresponding entries from the 'base' and 'transform' vectors.

Value

One of the following, a list of dataframes containing data to be plotted, a grob object, or a plot.

Examples

# Load transcript meta data
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# Load BSgenome
library(BSgenome.Hsapiens.UCSC.hg19)
genome <- BSgenome.Hsapiens.UCSC.hg19

# Define a region of interest
gr <- GRanges(seqnames=c("chr10"),
ranges=IRanges(start=c(89622195), end=c(89729532)), strand=strand(c("+")))

# Create Data for input
start <- c(89622194:89729524)
end <- c(89622195:89729525)
chr <- 10
cov <- c(rnorm(100000, mean=40), rnorm(7331, mean=10))
cov_input_A <- as.data.frame(cbind(chr, start, end, cov))

start <- c(89622194:89729524)
end <- c(89622195:89729525)
chr <- 10
cov <- c(rnorm(50000, mean=40), rnorm(7331, mean=10), rnorm(50000, mean=40))
cov_input_A <- as.data.frame(cbind(chr, start, end, cov))

# Define the data as a list
data <- list("Sample A"=cov_input_A)

# Call genCov
genCov(data, txdb, gr, genome, gene_labelTranscriptSize=3)

griffithlab/GenVisR documentation built on May 14, 2024, 12:40 a.m.