lohSpec: Plot LOH data

View source: R/lohSpec.R

Description

Construct a graphic visualizing Loss of Heterozygosity in a cohort

Usage

lohSpec(
  x = NULL,
  path = NULL,
  fileExt = NULL,
  y = NULL,
  genome = "hg19",
  gender = NULL,
  step = 1e+06,
  window_size = 2500000,
  normal = 0.5,
  colourScheme = "inferno",
  plotLayer = NULL,
  method = "slide",
  out = "plot"
)

Arguments

x

Object of class data frame with rows representing germline calls. The data frame must contain columns with the following names: "chromosome", "position", "n_vaf", "t_vaf", "sample". Required if path is set to NULL (see details). Values in "n_vaf" and "t_vaf" should range from 0 to 1.
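As a minimal sketch of the layout described for x, the base R snippet below builds a small data frame with the five required columns and checks that both VAF columns fall within 0 to 1. The sample names, positions, and VAF values are invented for illustration and are not taken from any real dataset.

```r
# Hypothetical germline calls for two samples; all values are made up.
set.seed(1)
calls <- data.frame(
  chromosome = rep(c("1", "2"), each = 5, times = 2),
  position   = rep(seq(1e6, by = 5e5, length.out = 5), times = 4),
  n_vaf      = round(runif(20, 0.45, 0.55), 3),
  t_vaf      = round(runif(20, 0, 1), 3),
  sample     = rep(c("sampleA", "sampleB"), each = 10),
  stringsAsFactors = FALSE
)

# Confirm the column names and the 0-1 VAF range described above.
required <- c("chromosome", "position", "n_vaf", "t_vaf", "sample")
stopifnot(all(required %in% names(calls)))
stopifnot(all(calls$n_vaf >= 0 & calls$n_vaf <= 1),
          all(calls$t_vaf >= 0 & calls$t_vaf <= 1))
```

A data frame shaped like this is what would be supplied to the x argument.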

path

Character string specifying the path to a directory containing germline calls for each sample. Germline calls are expected to be stored as tab-separated files which contain the following column names: "chromosome", "position", "n_vaf", "t_vaf", and "sample". Required if x is set to NULL (see details).

fileExt

Character string specifying the file extension of the files within the specified path. Required if an argument is supplied to path (see details).
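The directory layout that path and fileExt refer to can be sketched in base R as follows. The directory name, sample names, and values are hypothetical. The exact form fileExt should take (for example, with or without a leading dot) is not stated on this page, so the snippet only prepares and inspects the files rather than calling lohSpec.

```r
# Write one hypothetical tab-separated file per sample into a temporary directory.
loh_dir <- file.path(tempdir(), "loh_calls")
dir.create(loh_dir, showWarnings = FALSE)

for (s in c("sampleA", "sampleB")) {
  calls <- data.frame(
    chromosome = c("1", "1", "2"),
    position   = c(1000000, 2000000, 1500000),
    n_vaf      = c(0.50, 0.48, 0.52),
    t_vaf      = c(0.95, 0.10, 0.51),
    sample     = s,
    stringsAsFactors = FALSE
  )
  write.table(calls, file.path(loh_dir, paste0(s, ".tsv")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

# List the files sharing the extension and read one back to check its columns.
tsv_files <- list.files(loh_dir, pattern = "\\.tsv$", full.names = TRUE)
check <- read.delim(tsv_files[1], stringsAsFactors = FALSE)
```

Here loh_dir plays the role of the value given to path, with one file per sample.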

y

Object of class data frame with rows representing chromosome boundaries for a genome assembly. The data frame must contain columns with the following names "chromosome", "start", "end" (optional: see details).
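For an assembly that is not built in, a boundary table for y might look like the sketch below. The chromosome names and lengths are invented, and both the naming style ("chr1" versus "1") and the coordinate convention for start are assumptions rather than documented requirements.

```r
# Hypothetical chromosome boundaries for a custom assembly; lengths are illustrative.
boundaries <- data.frame(
  chromosome = c("chr1", "chr2", "chrX"),
  start      = c(1, 1, 1),
  end        = c(5000000, 4000000, 3000000),
  stringsAsFactors = FALSE
)

# Check the three required column names and that each range is well formed.
stopifnot(all(c("chromosome", "start", "end") %in% names(boundaries)))
stopifnot(all(boundaries$start < boundaries$end))
```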

genome

Character string specifying a valid UCSC genome (see details).

gender

Character vector of length equal to the number of samples, consisting of elements from the set "M", "F". Used to suppress the plotting of allosomes where appropriate.

step

Integer value specifying the step size (i.e. the number of base pairs to move the window). Required when method is set to "slide" (see details).

window_size

Integer value specifying the size of the window in base pairs in which to calculate the mean Loss of Heterozygosity (see details).

normal

Numeric value within the range 0-1 specifying the expected normal variant allele frequency to be used in Loss of Heterozygosity calculations. Defaults to 0.5.

colourScheme

Character vector specifying the colour scale to use from the viridis package. One of "viridis", "magma", "plasma", or "inferno".

plotLayer

Valid ggplot2 layer to be added to the plot.

method

Character string specifying the approach to be used for displaying Loss of Heterozygosity, one of "tile" or "slide" (see details).

out

Character vector specifying the object to output, one of "data", "grob", or "plot". Defaults to "plot" (see Value).

Details

lohSpec is intended to plot the loss of heterozygosity (LOH) within a sample. As such, lohSpec expects input data to contain only LOH calls. Input can be supplied as a single data frame given to the argument x, with rows containing germline calls and variables giving the chromosome, position, normal variant allele frequency, tumor variant allele frequency, and the sample. In lieu of this format, a series of .tsv files can be supplied via the path and fileExt arguments. If this method is chosen, samples will be inferred from the file names. In both cases, columns containing the variant allele frequency for normal and tumor samples should range from 0-1.

Two methods exist to calculate and display LOH events. If the method is set to "tile", mean LOH is calculated based on the window_size argument, with windows being placed next to each other. If the method is set to "slide", the window will slide and calculate the LOH based on the step parameter.

In order to ensure the entire chromosome is plotted, lohSpec requires the location of chromosome boundaries for a given genome assembly. As a convenience, this information is available for the following genomes: "hg19", "hg38", "mm9", "mm10", "rn5". It can be retrieved by supplying one of the aforementioned assemblies via the 'genome' parameter. If an argument is supplied to the 'genome' parameter and is unrecognized, a query to the UCSC MySQL database will be attempted to obtain the required information. If chromosome boundary locations are unavailable for a given assembly, this information can be supplied to the 'y' parameter, which has priority over the 'genome' parameter.
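The difference between the "tile" and "slide" methods can be illustrated with a small base R sketch. The helpers make_windows and window_loh are hypothetical and are not part of GenVisR. The sketch also assumes that the LOH of a call is measured as the absolute difference between t_vaf and the normal argument, which may differ from the package's internal calculation.

```r
# Illustrative helper (not part of GenVisR): window start and end coordinates.
# With step equal to window_size the windows sit side by side ("tile");
# with a smaller step they overlap ("slide").
make_windows <- function(chr_end, window_size, step = window_size) {
  starts <- seq(1, chr_end, by = step)
  data.frame(start = starts, end = pmin(starts + window_size - 1, chr_end))
}

# Illustrative helper: mean |t_vaf - normal| over the calls inside each window.
# The |t_vaf - normal| measure is an assumption made for this sketch.
window_loh <- function(calls, windows, normal = 0.5) {
  windows$mean_loh <- vapply(seq_len(nrow(windows)), function(i) {
    inside <- calls$position >= windows$start[i] &
      calls$position <= windows$end[i]
    if (any(inside)) mean(abs(calls$t_vaf[inside] - normal)) else NA_real_
  }, numeric(1))
  windows
}

# Made-up calls on a single 7.5 Mb chromosome.
calls <- data.frame(position = c(5e5, 1.5e6, 3e6, 4.5e6, 6e6),
                    t_vaf    = c(0.50, 0.90, 0.10, 0.55, 1.00))

tile  <- window_loh(calls, make_windows(7.5e6, window_size = 2.5e6))
slide <- window_loh(calls, make_windows(7.5e6, window_size = 2.5e6, step = 1e6))
```

With these inputs the tiled version yields three adjacent windows, while the sliding version yields eight overlapping ones, so a smaller step gives a smoother but more redundant summary.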

Value

One of the following: a list of data frames containing the data to be plotted, a grob object, or a plot.

Examples

# plot loh within the example dataset
lohSpec(x=HCC1395_Germline)
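
A further sketch, using only arguments listed in the Usage section, requests the underlying data instead of a plot and uses adjacent windows. It assumes GenVisR is installed and that the HCC1395_Germline dataset is available once the package is attached. The call is skipped otherwise.

```r
# Run only when GenVisR is available; otherwise do nothing.
if (requireNamespace("GenVisR", quietly = TRUE)) {
  library(GenVisR)
  # Adjacent (non-overlapping) windows, returning data rather than a plot.
  loh_data <- lohSpec(x = HCC1395_Germline, method = "tile", out = "data")
  # Inspect the top-level structure of the returned object.
  str(loh_data, max.level = 1)
}
```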

GenVisR documentation built on Dec. 28, 2020, 2 a.m.