R/tcgaTableSplit.R

Defines functions tcgaTableSplit

Documented in tcgaTableSplit

#' @title Split a data matrix into two according to Normal-Cancer status
#'
#' @description \code{tcgaTableSplit} splits the input dataset into two subsets, Normal and Cancer, and returns them as a list.
#'
#' @param data A data matrix, can be the output from \code{\link[mirNet]{tcgaTableGenerator}} or \code{\link[mirNet]{tcgaConvRownames}}.
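#' @param sampleSheet Path to the sample sheet file: the table of clinical information on the samples, downloaded from GDC together with the sequencing data files. It must contain the columns \code{File ID}, \code{Sample ID} and \code{Sample Type}.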
#'
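#' @return A list with two entries, \code{cancer} and \code{normal}, holding the Primary Tumor and Solid Tissue Normal samples of the original dataset. Columns are renamed from File IDs to Sample IDs.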
#'
#' @seealso \code{\link[mirNet]{tcgaTableGenerator}} for generating a gene expression data matrix from individual FPKM files downloaded from the GDC Data Portal, \code{\link[mirNet]{tcgaConvRownames}} for converting the row names of a data matrix.
#'
#' @importFrom data.table fread
#'
#' @export tcgaTableSplit
#'
#' @examples
#' tcgaTableSplit(exp.luad.m, sampleSheet = 'clinical_LUAD/TCGA_samples_LUAD.tsv')
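#'
#' # A minimal, self-contained sketch of the same call on made-up toy data.
#' # The file IDs, sample IDs and expression values below are illustrative
#' # assumptions, not real GDC output; only the three column names and the two
#' # sample-type labels are taken from the function body.
#' toy <- matrix(1:8, nrow = 2,
#'               dimnames = list(c('geneA', 'geneB'), c('f1', 'f2', 'f3', 'f4')))
#' sheet <- data.frame('File ID' = c('f1', 'f2', 'f3', 'f4'),
#'                     'Sample ID' = c('S1', 'S2', 'S3', 'S4'),
#'                     'Sample Type' = c('Primary Tumor', 'Solid Tissue Normal',
#'                                       'Primary Tumor', 'Solid Tissue Normal'),
#'                     check.names = FALSE)
#' # Write the toy sample sheet to a temporary tab-separated file
#' sheetFile <- tempfile(fileext = '.tsv')
#' write.table(sheet, sheetFile, sep = '\t', quote = FALSE, row.names = FALSE)
#' res <- tcgaTableSplit(toy, sampleSheet = sheetFile)
#' colnames(res$cancer)  # Sample IDs of the columns labelled 'Primary Tumor'
#' colnames(res$normal)  # Sample IDs of the columns labelled 'Solid Tissue Normal'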



tcgaTableSplit <- function(data, sampleSheet){

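    # Read the GDC sample sheet and align its rows to the columns of `data` by File ID
    sheet <- as.data.frame(fread(sampleSheet))
    sheet <- sheet[match(colnames(data), sheet[, 'File ID']), c('File ID', 'Sample ID', 'Sample Type')]

    # Relabel columns with Sample IDs, then split by sample type;
    # drop = FALSE keeps a matrix even when only one sample matches
    colnames(data) <- sheet[, 'Sample ID']
    exp.c <- data[, which(sheet[, 'Sample Type'] == 'Primary Tumor'), drop = FALSE]
    exp.n <- data[, which(sheet[, 'Sample Type'] == 'Solid Tissue Normal'), drop = FALSE]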

    list(cancer = exp.c, normal = exp.n)
}
YC3/mirNet documentation built on Sept. 3, 2020, 3:25 a.m.