coverage_matrix_bwtool: Given a set of regions for a chromosome, compute the coverage...



Description

Given a set of genomic regions as created by expressed_regions, this function computes the coverage matrix using bwtool. You can then scale the counts to a library of 40 million 100 bp reads using scale_counts.

Usage

coverage_matrix_bwtool(project, regions,
  bwtool = "/dcl01/leek/data/bwtool/bwtool-1.0/bwtool", bpparam = NULL,
  outdir = NULL, verbose = TRUE, sumsdir = tempdir(), bed = NULL,
  url_table = NULL, commands_only = FALSE, pheno = NULL,
  overwrite = FALSE, ...)

Arguments

project

A character vector of length one with the SRA study ID.

regions

A GRanges-class object with regions for which to calculate the coverage matrix.

bwtool

The path to the bwtool executable. Defaults to its location at JHPCE.

bpparam

A BiocParallelParam-class instance which will be used to calculate the coverage matrix in parallel. By default, SerialParam-class will be used.

outdir

The directory containing the file(s) previously downloaded with download_study. If outdir is specified but the files are missing, they will be downloaded first. By default, outdir is NULL, which uses the data from the web. We only recommend downloading the full data if you will use it several times. Note that if you are working at JHPCE or SciServer, the files will be located automatically.

verbose

If TRUE, basic status updates will be printed along the way.

sumsdir

The path to an existing directory where the bwtool sum tsv files will be saved.

bed

The path to the BED file for the regions. You are responsible for making sure that the BED file and the regions are in the same order. Could be useful for a scenario where you have a BED file and import it to define regions.

url_table

A custom data.frame with the same column names as recount::recount_url. If NULL, recount::recount_url is used. One such custom table is local_url, saved in /dcl01/leek/data/recount-website/fileinfo/local_url.RData.

commands_only

If TRUE, the bwtool commands are saved to a file called recount-bwtool-commands_PROJECT.txt and the function exits without running bwtool. This is useful if you have a very large set of regions and want to run the commands in an array job. Afterwards, run coverage_matrix_bwtool(commands_only = FALSE) to create the RSE object(s).

pheno

NULL by default. Specify only if you are using a custom metadata table.

overwrite

Logical, whether to overwrite output files.

...

Additional arguments passed to download_study when outdir is specified but the required files are missing.
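Because the default bwtool path points to JHPCE and sumsdir must be an existing directory, it can help to check both before a long run. The following is a minimal sketch in base R; check_bwtool_setup is a hypothetical helper for illustration and is not part of recount.bwtool.

```r
## Hypothetical helper (not part of recount.bwtool): check the local inputs
## described above before calling coverage_matrix_bwtool().
check_bwtool_setup <- function(bwtool = Sys.which("bwtool"),
    sumsdir = tempdir()) {
    ## Sys.which() returns a named vector; drop the name for clarity
    bwtool <- unname(bwtool)
    list(
        ## TRUE if the path is non-empty and points to an existing file
        bwtool_found = nzchar(bwtool) && file.exists(bwtool),
        ## TRUE if sumsdir is an existing directory
        sumsdir_exists = dir.exists(sumsdir)
    )
}

## Check whatever bwtool is on the PATH, plus the default sumsdir
setup <- check_bwtool_setup()

## A path that does not exist fails both checks
bad_setup <- check_bwtool_setup(
    bwtool = "/no/such/path/bwtool",
    sumsdir = file.path(tempdir(), "missing-sumsdir-example")
)
```

If bwtool_found is FALSE, pass the full path to your own bwtool installation through the bwtool argument.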

Details

When using outdir = NULL the information will be accessed from the web on the fly. If you encounter internet access problems, it might be best to first download the BigWig files using download_study. This might be the best option if you are accessing all chromosomes for a given project and/or are thinking of using different sets of regions (for example, from different cutoffs applied to expressed_regions). If you are working at JHPCE (and part of leekgroup) or at SciServer, the files will be located automatically.
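The download-first workflow described above could look roughly like the sketch below. It only defines a wrapper and does not run anything on its own. The wrapper name run_with_local_bigwigs is hypothetical, and the choice of type = "samples" for download_study is an assumption about which files are needed; check the download_study documentation before relying on it.

```r
## Hypothetical wrapper (not part of recount.bwtool): download the BigWig
## files once, then reuse them through outdir.
run_with_local_bigwigs <- function(project, regions, outdir = project) {
    ## Assumption: type = "samples" fetches the per-sample BigWig files
    recount::download_study(project, type = "samples", outdir = outdir)

    ## Reuse the local files instead of reading from the web
    recount.bwtool::coverage_matrix_bwtool(project, regions,
        outdir = outdir)
}
```

With the files stored in outdir, later calls with a different set of regions can reuse the same directory rather than fetching the data again.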

See also system.file('extdata', 'jhpce', package = 'recount.bwtool') for scripts that run this function for all the projects we have available at JHPCE.
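For the array job use case mentioned under commands_only, each task could pick its own command from the commands file. This is a sketch under two assumptions that the documentation does not confirm: the file holds one command per line, and the scheduler exposes a 1-based task index (SGE uses the SGE_TASK_ID environment variable). The file contents below are placeholders, and pick_command is a hypothetical helper.

```r
## Placeholder stand-in for recount-bwtool-commands_PROJECT.txt; the real
## file is written by coverage_matrix_bwtool(commands_only = TRUE).
cmd_file <- file.path(tempdir(), "recount-bwtool-commands_PROJECT.txt")
writeLines(c("echo task-1", "echo task-2", "echo task-3"), cmd_file)

## Hypothetical helper: select the command for one array task.
## Assumption: one command per line, indexed from 1.
pick_command <- function(cmd_file,
    task_id = Sys.getenv("SGE_TASK_ID", "1")) {
    cmds <- readLines(cmd_file)
    i <- as.integer(task_id)
    ## Fail early on a missing or out-of-range task index
    stopifnot(!is.na(i), i >= 1L, i <= length(cmds))
    cmds[i]
}

cmd <- pick_command(cmd_file, task_id = "2")
## system(cmd) would then run the selected command
```

Once all tasks have finished, the call with commands_only = FALSE builds the RSE object(s) as described in the Arguments section.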

Value

A RangedSummarizedExperiment-class object with the counts stored in the assays slot.

Author(s)

Leonardo Collado-Torres

See Also

coverage_matrix, download_study, findRegions, railMatrix

Examples

if (.Platform$OS.type != 'windows') {
    ## Disable the example for now. I'd have to figure out how to install
    ## bwtool on travis
    if (FALSE) {
        ## Reading BigWig files is not supported by rtracklayer on Windows
        ## (only needed for defining the regions in this example)
        ## Define expressed regions for study DRP002835, chrY
        regions <- expressed_regions('DRP002835', 'chrY', cutoff = 5L,
            maxClusterGap = 3000L)

        ## Now calculate the coverage matrix for this study
        rse <- coverage_matrix_bwtool('DRP002835', regions)

        ## Scale counts
        rse_scaled <- scale_counts(rse, round = FALSE)
    }
}
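When supplying your own file through the bed argument, the rows must match the order of regions. The base R sketch below shows one way to check this. The regions table is made-up toy data standing in for as.data.frame(regions), and the three-column layout is an assumption for illustration; the only convention relied on is that BED coordinates are 0-based and half-open.

```r
## Toy regions table standing in for as.data.frame(regions);
## the coordinates are made up for illustration.
regions_df <- data.frame(
    seqnames = c("chrY", "chrY", "chrY"),
    start = c(101L, 501L, 901L),
    end = c(200L, 650L, 1000L)
)

## BED is 0-based and half-open: start shifts by -1, end is unchanged
bed_df <- data.frame(
    chrom = regions_df$seqnames,
    chromStart = regions_df$start - 1L,
    chromEnd = regions_df$end
)
bed_file <- file.path(tempdir(), "regions.bed")
write.table(bed_df, bed_file, sep = "\t", quote = FALSE,
    row.names = FALSE, col.names = FALSE)

## Read the file back and confirm it matches the regions row by row
bed_in <- read.table(bed_file, sep = "\t",
    col.names = c("chrom", "chromStart", "chromEnd"),
    stringsAsFactors = FALSE)
same_order <- identical(bed_in$chrom, as.character(regions_df$seqnames)) &&
    all(bed_in$chromStart + 1L == regions_df$start) &&
    all(bed_in$chromEnd == regions_df$end)
```

If same_order is FALSE, the BED file should be regenerated or the regions reordered before passing both to coverage_matrix_bwtool.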

LieberInstitute/recount.bwtool documentation built on Feb. 7, 2020, 3:53 p.m.