prepare.tcga.survival.data: Prepare FPKM data from TCGA project (specific tissue)


View source: R/build_data.R

Description

This function will load data and pre-process it to be used in survival models.

Usage

prepare.tcga.survival.data(project = "brca",
  tissue.type = "primary.solid.tumor", input.type = "rna",
  normalization = "none", log2.pre.normalize = FALSE,
  include.negative.survival = FALSE, handle.duplicates = "keep_first",
  only.coding.genes = FALSE)

Arguments

project

TCGA project that has a data package available, see https://github.com/averissimo/tcga.data

tissue.type

type of tissue, e.g. 'primary.solid.tumor' or 'metastatic'; the available values depend on the project.

input.type

either 'rna' for RNASeq or 'dna' for mutation data

include.negative.survival

shift the survival times to include negative survival

handle.duplicates

strategy to handle multiple samples for the same individual, either 'keep_first' or 'keep_all'

center

center and scale xdata by subtracting the mean and dividing by the standard deviation

only.coding.genes

filter the genes to only include coding genes, see loose.rock::coding.genes

Details

It will:

* load data
* handle duplicate samples for the same individual (default is to keep only the first)
* remove individuals with missing vital_status or with both follow-up and death time spans missing
* remove individuals with follow-up/death time span == 0
* remove genes from RNASeqData (xdata) with standard deviation == 0
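
The cleaning steps listed above can be illustrated on toy data. This is only a minimal sketch in base R, not the package's implementation: the sample names, the column names `patient`, `time` and `status`, and all values are hypothetical and chosen so that each step removes something.

```r
# Toy expression matrix: rows are samples, columns are genes (hypothetical values).
xdata.raw <- matrix(
  c(1.0, 2.0, 5.0,
    1.5, 2.0, 4.0,
    3.0, 2.0, 6.0,
    2.5, 2.0, 7.0,
    0.5, 2.0, 8.0),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("P1-01", "P1-02", "P2-01", "P3-01", "P4-01"),
                  c("geneA", "geneB", "geneC"))
)

# Toy clinical table, one row per sample (column names are assumptions).
ydata.raw <- data.frame(
  patient = c("P1", "P1", "P2", "P3", "P4"),
  time    = c(120, 120, 0, NA, 340),
  status  = c(1, 1, 0, 1, 0),
  row.names = rownames(xdata.raw)
)

# Keep only the first sample of each individual (the 'keep_first' strategy).
ydata <- ydata.raw[!duplicated(ydata.raw$patient), , drop = FALSE]

# Remove individuals with missing status or missing time span.
ydata <- ydata[!is.na(ydata$status) & !is.na(ydata$time), , drop = FALSE]

# Remove individuals whose time span is zero.
ydata <- ydata[ydata$time != 0, , drop = FALSE]

# Align the expression matrix with the remaining individuals.
xdata <- xdata.raw[rownames(ydata), , drop = FALSE]

# Remove genes with zero standard deviation across the remaining samples.
gene.sd <- apply(xdata, 2, sd)
xdata <- xdata[, gene.sd > 0, drop = FALSE]

print(dim(xdata))
```

In this toy case the second P1 sample, the individual with a missing time, the individual with a zero time, and the constant gene `geneB` are all dropped.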

Value

a list with data ready to be used in survival analysis. The 'xdata.raw' and 'ydata.raw' elements hold the full dataset for the specific tissue. The 'xdata' and 'ydata' elements have been cleaned by handling patients with multiple samples, removing individuals with event time <= 0 or with missing values, and removing genes with standard deviation == 0. It also returns a sha256 checksum for each of the datasets.
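
As a rough illustration of how the raw and cleaned elements relate, the sketch below builds a hypothetical stand-in for the returned list and checks which samples and genes were dropped. Only the element names 'xdata', 'ydata', 'xdata.raw' and 'ydata.raw' come from the description above; the contents, the sample and gene names, the `time` and `status` columns, and the assumption that rows of 'xdata' and 'ydata' share names are all made up for the example.

```r
# Hypothetical stand-in for the list returned by prepare.tcga.survival.data().
xdata.raw <- matrix(
  c(1, 4, 2,
    2, 4, 9,
    3, 4, 5,
    5, 4, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), c("g1", "g2", "g3"))
)
ydata.raw <- data.frame(
  time   = c(10, 0, 25, 40),
  status = c(1, 0, 0, 1),
  row.names = rownames(xdata.raw)
)

dat <- list(
  xdata.raw = xdata.raw,
  ydata.raw = ydata.raw,
  # Cleaned versions: sample S2 (time == 0) and constant gene g2 removed.
  xdata = xdata.raw[c("S1", "S3", "S4"), c("g1", "g3")],
  ydata = ydata.raw[c("S1", "S3", "S4"), ]
)

# Which samples and genes did the cleaning remove?
dropped.samples <- setdiff(rownames(dat$xdata.raw), rownames(dat$xdata))
dropped.genes   <- setdiff(colnames(dat$xdata.raw), colnames(dat$xdata))

# Sanity check before modelling: covariates and outcomes refer to the same samples.
stopifnot(identical(rownames(dat$xdata), rownames(dat$ydata)))

print(dropped.samples)
print(dropped.genes)
```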

Examples

## Not run: 
    # Install a data package to load cancer data
    source("https://bioconductor.org/biocLite.R")
    biocLite(paste0('https://github.com/averissimo/tcga.data/releases/download/',
                    '2016.12.15-brca/brca.data_1.0.tar.gz'))
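    # handle.duplicates is passed by name: the third positional argument is input.type
    prepare.tcga.survival.data('brca', 'primary.solid.tumor',
                               handle.duplicates = 'keep_first')
    prepare.tcga.survival.data('brca', 'primary.solid.tumor', input.type = 'dna',
                               handle.duplicates = 'keep_first')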

## End(Not run)

averissimo/glmSparseNetPaper documentation built on Jan. 25, 2021, 12:11 p.m.