gc: Collect, display and correct GC-content related coverage bias


Description

Collect information and compute statistics on depth of coverage in relation to GC-content.

Usage

gc.sample.stats(file, col_types = "c--dd----d----", buffer = 33554432,
                parallel = 2L, verbose = TRUE)
gc.summary.plot(gc_list, mean.col = 1, median.col = 2,
                scale.subset = 1.5, ...)
mean_gc(gc_list)
median_gc(gc_list)

Arguments

file

name of a file in the seqz format.

col_types

a string describing the class of each column of the input file (see read_tsv). The default value corresponds to the columns of a seqz file used for calculating GC statistics.

buffer

maximal size of each chunk in bytes (see chunk.apply).

parallel

integer, number of threads used to process a seqz file (see chunk.apply).

verbose

logical. If TRUE (the default), the function prints information to the console.

gc_list

a normal or tumor list resulting from the gc.sample.stats function.

mean.col

color for the mean in the summary plot.

median.col

color for the median in the summary plot.

scale.subset

scale of the depth values to show in the plot. A value of 1 will show the average depth at the center of the plot.

...

additional parameters passed to colorgram.

Details

gc.sample.stats extracts depth and GC-content information for the tumor and the control samples from a seqz file. It returns a list with 3 elements: file.metrics, normal and tumor.

file.metrics is a data.frame serving as an index of the seqz file; the normal and tumor objects each contain 3 objects: gc, depth and n.

gc and depth are vectors containing the recorded values of, respectively, GC-content and coverage depth. The n object is a gc x depth matrix recording the number of times each gc/depth pair is observed in the data.
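The structure described above can be illustrated with a small base R sketch. The toy gc_list below is made up for illustration and only mimics the gc, depth and n layout. The helpers mean_depth_by_gc and median_depth_by_gc are hypothetical names, not functions from the package, and they show one plausible way to summarise depth per GC value from the count matrix. This is not necessarily how mean_gc and median_gc are implemented.

```r
# Toy stand-in for one element (normal or tumor) of the gc.sample.stats
# output. All values are invented for illustration.
gc_list <- list(
  gc    = c(30, 40, 50),        # GC-content values, one per row of n
  depth = c(10, 20, 30, 40),    # depth values, one per column of n
  n     = matrix(c(5, 3, 1, 0,
                   1, 4, 4, 1,
                   0, 1, 3, 5),
                 nrow = 3, byrow = TRUE)  # counts of each gc/depth pair
)

# Hypothetical helper: mean depth per GC value, weighting each depth
# by the number of times it was observed.
mean_depth_by_gc <- function(x) {
  apply(x$n, 1, function(counts) weighted.mean(x$depth, w = counts))
}

# Hypothetical helper: median depth per GC value, obtained by expanding
# the counts back into individual observations.
median_depth_by_gc <- function(x) {
  apply(x$n, 1, function(counts) median(rep(x$depth, times = counts)))
}

toy_means   <- mean_depth_by_gc(gc_list)
toy_medians <- median_depth_by_gc(gc_list)
print(data.frame(gc = gc_list$gc, mean = toy_means, median = toy_medians))
```

Expanding counts with rep is simple but memory-hungry; for real data a cumulative-sum approach to the weighted median would scale better.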

Value

A list with the following elements:

file.metrics

index of the seqz file.

tumor

GC and coverage depth observations in the tumor sample.

normal

GC and coverage depth observations in the control sample.
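As a hedged illustration of how such per-sample observations might feed a GC bias correction, the sketch below divides each observed depth by the mean depth recorded for the same GC value. The toy data, the positions data frame and the normalise_depth helper are all assumptions made for this example; the package's actual correction procedure may differ.

```r
# Toy stand-in for the normal or tumor element; values are invented.
gc_list <- list(
  gc    = c(30, 40, 50),
  depth = c(10, 20, 30, 40),
  n     = matrix(c(5, 3, 1, 0,
                   1, 4, 4, 1,
                   0, 1, 3, 5),
                 nrow = 3, byrow = TRUE)
)

# Mean depth per GC value (depths weighted by their observation counts).
gc_means <- apply(gc_list$n, 1,
                  function(counts) weighted.mean(gc_list$depth, w = counts))

# Hypothetical helper: express each depth relative to the mean depth
# observed at the same GC value. GC values absent from gc_values give NA.
normalise_depth <- function(depth, gc, gc_values, gc_means) {
  depth / gc_means[match(gc, gc_values)]
}

# Hypothetical per-position observations to be corrected.
positions <- data.frame(gc = c(30, 50, 40), depth = c(14, 31, 25))
positions$adjusted <- normalise_depth(positions$depth, positions$gc,
                                      gc_list$gc, gc_means)
print(positions)
```

An adjusted value close to 1 means the position is covered about as deeply as is typical for its GC-content.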

Examples

## Not run: 

data.file <- system.file("extdata", "example.seqz.txt.gz", package = "sequenza")
# read all the chromosomes:
gc_info <- gc.sample.stats(data.file)

# mean values of depth coverage vs GC content

mean_gc(gc_info$normal)

# plot the information for the tumor and normal samples
par(mfrow=c(1, 2))
gc.summary.plot(gc_info$normal, main = "Normal GC stats")
gc.summary.plot(gc_info$tumor, main = "Tumor GC stats")

## End(Not run)

sequenza documentation built on May 9, 2019, 5:04 p.m.