makeDF: Merge the coverage information for a group of samples


View source: R/makeDF.R

Description

For a group of samples, this function reads the coverage information for a specific chromosome directly from the BAM files. It then merges the per-sample coverage into a DataFrame and removes the bases that do not pass the cutoff.

Usage

makeDF(chr, datadir = NULL, sampledirs = NULL, samplepatt = NULL,
  cutoff = 5, chrlen = NULL, org = "BSgenome.Hsapiens.UCSC.hg19",
  bamterm = "accepted_hits.bam", output = NULL, verbose = TRUE)

Arguments

chr

Chromosome to read. Use the simple format: X rather than chrX.

datadir

The main directory. Each of the sampledirs is a sub-directory of datadir.

sampledirs

A character vector with the names of the sample directories. If datadir is NULL, sampledirs is assumed to specify the full path to each sample.

samplepatt

If specified and sampledirs is NULL, the directories in datadir that match this pattern are used as the sample directories. If datadir is NULL, the current directory (.) is searched instead.
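
The interplay between datadir, sampledirs and samplepatt can be sketched in base R. This only illustrates the rules documented above and is not the package's actual code: resolve_sample_dirs is a hypothetical helper name, and raising an error when both sampledirs and samplepatt are NULL is an assumption.

```r
# Hypothetical helper illustrating the documented rules; not part of derfinder.
resolve_sample_dirs <- function(datadir = NULL, sampledirs = NULL,
                                samplepatt = NULL) {
  if (is.null(sampledirs)) {
    if (is.null(samplepatt)) {
      # Assumption: one of the two arguments has to be supplied
      stop("Specify either 'sampledirs' or 'samplepatt'.")
    }
    # datadir falls back to the current directory when it is NULL
    if (is.null(datadir)) datadir <- "."
    # Keep the entries of datadir whose names match the pattern
    # (note that dir() lists files as well as directories)
    sampledirs <- dir(path = datadir, pattern = samplepatt)
  }
  # Without a datadir, sampledirs are taken to be full paths already
  if (is.null(datadir)) {
    return(sampledirs)
  }
  file.path(datadir, sampledirs)
}

# Sample directories given relative to a main directory
resolve_sample_dirs(datadir = "data", sampledirs = c("s1", "s2"))

# Sample directories given as full paths
resolve_sample_dirs(sampledirs = c("/path/to/s1", "/path/to/s2"))
```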

cutoff

A base pair is included in the result only if at least one sample has coverage greater than cutoff at that position.
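
As a rough illustration of the cutoff rule, the base R sketch below filters a toy coverage matrix. The values are made up, and a plain matrix stands in for the per-sample coverage that makeDF reads from the BAM files.

```r
# Toy coverage: rows are base pairs, columns are samples (made-up values)
coverage <- cbind(
  sampleA = c(0, 3, 6, 10, 5, 0),
  sampleB = c(1, 7, 2, 0, 5, 0)
)
cutoff <- 5

# A base passes when at least one sample is strictly above the cutoff
pass <- rowSums(coverage > cutoff) > 0

# Keep only the bases that passed; drop = FALSE preserves the matrix shape
filtered <- coverage[pass, , drop = FALSE]
```

In this toy example the fifth base has coverage exactly equal to the cutoff in both samples, so it is dropped: the comparison is "greater than", not "greater than or equal to".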

chrlen

The chromosome length in base pairs.

org

If chrlen is NULL, the chromosome length is deduced using the BSgenome annotation packages for either Human or Mouse. This is only meant as an easy-to-use option for a handful of genomes.

bamterm

Name of the BAM file used in each sample. By default it is set to accepted_hits.bam since that is the automatic name generated when aligning with TopHat.

output

If NULL, no output is saved to disk. If "auto", a file name is constructed automatically (chrXDF.Rdata, for example). If any other character string is specified, that name is used for the output file.
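
The three cases for output can be sketched as follows. resolve_output is a hypothetical helper, and the exact naming pattern (chr, then the chromosome name, then DF.Rdata) is inferred from the chrXDF.Rdata example above rather than taken from the package source.

```r
# Hypothetical helper mirroring the documented naming rules
resolve_output <- function(chr, output = NULL) {
  if (is.null(output)) {
    return(NULL)                             # nothing is written to disk
  }
  if (identical(output, "auto")) {
    # Assumed pattern, based on the documented example
    return(paste0("chr", chr, "DF.Rdata"))
  }
  output                                     # user-supplied file name
}

resolve_output("X", output = "auto")
resolve_output("X", output = "mycoverage.Rdata")
```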

verbose

If TRUE, basic status updates are printed along the way.

Value

A list with two objects. The first one, DF, is a DataFrame object where each column represents a sample. The number of rows depends on the number of base pairs that passed the cutoff. The second one, pos, is a logical Rle with the positions of the chromosome that passed the cutoff.
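
To make the relationship between DF and pos more concrete, here is a small base R sketch. It uses a plain logical vector and base rle() as stand-ins for the logical Rle, and it assumes that the rows of DF follow the order of the TRUE positions in pos.

```r
# Logical vector over a toy chromosome of length 8: TRUE = base passed
pos <- c(FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)

# Base R run-length encoding, standing in for the Rle class
pos_rle <- rle(pos)
# Run lengths: 1 3 2 1 1, with values FALSE TRUE FALSE TRUE FALSE

# Chromosome coordinates of the bases that passed the cutoff
chr_positions <- which(pos)   # 2 3 4 7

# Number of rows expected in DF: one per base that passed
n_rows <- sum(pos)            # 4
```

Under that assumption, row i of DF holds the coverage at chromosome position chr_positions[i], which is how the filtered rows can be mapped back to genomic coordinates.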

Author(s)

Leonardo Collado-Torres


leekgroup/derfinder documentation built on May 20, 2019, 11:30 p.m.