read_rnaseq_legacy: Reads RNA sequencing data from the GDC legacy archive.


Description

Reads RNASeq Level 3 data downloaded from the legacy GDC archive. The function performs some conservative checks to validate that all samples and features are annotated correctly. Note that duplicate data (files with the same extract name) are treated as independent duplicates of the same sample.

Usage

read_rnaseq_legacy(manifest, folder, features = "genes",
  normalization = "raw", progress = TRUE)

Arguments

manifest

Path to the GDC file manifest.

folder

Folder where the data files reside.

features

The feature type. Must be one of "genes", "isoforms", "junctions" or "exons".

normalization

The normalization method. Must be one of

"raw"

The raw counts.

"Q75"

Normalization by dividing by the 75th percentile (upper quartile) of the raw counts (TCGA default). This is only available for transcripts and isoforms.

"XPM"

X per million. This will use transcripts per million (TPM) for genes and isoforms and reads per kilobase of transcript per million (RPKM) for junctions and exons.

progress

Logical. Show progress info?
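
The normalization options above can be illustrated on a toy matrix. This is only a sketch in base R under stated assumptions, not the package's actual implementation: the counts and feature lengths are made up, the names `counts`, `lengths_bp`, `counts_q75`, `tpm` and `rpkm` are hypothetical, and details such as whether zero counts are excluded from the quantile or an extra scaling factor is applied are not specified here.

```r
# Toy raw counts: features in rows, samples in columns (made-up numbers).
counts <- matrix(
  c(10, 0, 30, 60,
    20, 5, 15, 40),
  nrow = 4,
  dimnames = list(c("f1", "f2", "f3", "f4"), c("s1", "s2"))
)

# Hypothetical feature lengths in base pairs (needed for TPM and RPKM).
lengths_bp <- c(f1 = 1000, f2 = 500, f3 = 2000, f4 = 1500)

# "Q75"-style: divide each sample by the 75th percentile of its raw counts.
q75 <- apply(counts, 2, quantile, probs = 0.75)
counts_q75 <- sweep(counts, 2, q75, "/")

# Reads per kilobase: the vector recycles down the rows of each column.
rate <- counts / (lengths_bp / 1000)

# TPM: rescale the per-kilobase rates so every sample sums to one million.
tpm <- sweep(rate, 2, colSums(rate), "/") * 1e6

# RPKM: per-kilobase rates divided by the sample's total reads in millions.
rpkm <- sweep(rate, 2, colSums(counts), "/") * 1e6
```

With TPM every sample sums to one million by construction, which makes proportions comparable across samples; RPKM columns generally do not share a common sum.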

Value

A data table containing the features as rows and the samples in the columns.
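
As a rough illustration of the features-by-samples layout, the sketch below builds a small stand-in table and derives a few per-sample and per-feature summaries. The column names (`feature`, `sample_1`, `sample_2`) and the use of a plain `data.frame` are assumptions made for the example; the real return value may name and type its columns differently.

```r
# Hypothetical stand-in for the returned table: one row per feature,
# one column per sample, plus an assumed feature identifier column.
rna <- data.frame(
  feature  = c("geneA", "geneB", "geneC"),
  sample_1 = c(12, 0, 7),
  sample_2 = c(30, 4, 0)
)

# Separate the sample columns from the identifier column.
sample_cols <- setdiff(names(rna), "feature")

# Numeric matrix with features as row names, convenient for downstream math.
expr <- as.matrix(rna[, sample_cols])
rownames(expr) <- rna$feature

# Total counts per sample and features detected in every sample.
library_size <- colSums(expr)
detected <- rowSums(expr > 0) == length(sample_cols)
```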

See Also

read_huex to read exon expression data.

Examples

# Not run due to large download...
# gbm <- system.file("extdata", "manifest.tsv", package = "tcgar")
# d <- tempdir()
# rna <- read_rnaseq_legacy(gbm, d)

cdiener/tcgar documentation built on May 13, 2019, 2:41 p.m.