tcgaNormalizer: Normalize RNA-seq or miRNA-seq dataset

View source: R/tcgaNormalizer.R

Description

tcgaNormalizer normalizes an RNA-seq expression dataset in four steps: 1) removing genes with more than 70% NA values across samples or across genes/microRNAs, 2) removing genes with low variation (sd < 0.2) across samples, 3) log2 transformation, and 4) quantile normalization. For miRNA-seq data, microRNAs with more than 95% NA values across samples are removed in step 1).
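
The four steps above can be sketched in base R as follows. This is an illustrative sketch only, not the package's implementation: the function names normalize_sketch and quantile_normalize, the pseudocount of 1 in the log2 step, and the requirement that no NA values remain before quantile normalization are all assumptions made for the example.

```r
# Hypothetical helper: quantile-normalize the columns of a complete matrix.
quantile_normalize <- function(x) {
  stopifnot(!anyNA(x))  # assumption: this sketch does not handle leftover NAs
  ranks  <- matrix(apply(x, 2, rank, ties.method = "first"), nrow = nrow(x))
  sorted <- matrix(apply(x, 2, sort), nrow = nrow(x))
  target <- rowMeans(sorted)  # mean of the k-th smallest value over samples
  matrix(target[as.vector(ranks)], nrow = nrow(x), dimnames = dimnames(x))
}

# Hypothetical sketch of the described pipeline (rows = genes, cols = samples).
normalize_sketch <- function(x, na_thre = 0.7, sd_thre = 0.2) {
  # 1) Drop rows whose proportion of NA values exceeds na_thre.
  x <- x[rowMeans(is.na(x)) <= na_thre, , drop = FALSE]
  # 2) Drop rows whose standard deviation across samples is below sd_thre.
  row_sd <- apply(x, 1, sd, na.rm = TRUE)
  x <- x[!is.na(row_sd) & row_sd >= sd_thre, , drop = FALSE]
  # 3) log2 transform; the +1 pseudocount is an assumption to avoid log2(0).
  x <- log2(x + 1)
  # 4) Quantile normalization across samples.
  quantile_normalize(x)
}

# Toy data: g2 is mostly NA, g3 has zero variation.
toy <- matrix(
  c(1,  2,  3,  4,
    NA, NA, NA, 5,
    2,  2,  2,  2,
    7,  3,  15, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("g", 1:4), paste0("s", 1:4))
)
res <- normalize_sketch(toy)
```

On the toy matrix, g2 is dropped by the NA filter and g3 by the variation filter, so res keeps only g1 and g4, and every column of res contains the same set of values after quantile normalization.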

Usage

tcgaNormalizer(
  data,
  dataType,
  mir_na_thre = 0.95,
  mir_sd_thre = 0.2,
  plotFig = TRUE,
  filename = NULL
)

Arguments

data

A data matrix with rows referring to genes/microRNAs and columns to samples. It can be the output from tcgaTableGenerator or tcgaConvRownames.

dataType

A string, 'microRNA' for microRNA-seq data or 'mRNA' for RNA-seq data.

mir_na_thre

The threshold used to remove microRNAs with too many NAs across samples; defaults to 0.95 (95%).

mir_sd_thre

The threshold used to remove microRNAs with low variation across samples; defaults to 0.2.

plotFig

Logical. TRUE to draw a scatter plot of NA proportions, FALSE otherwise.

filename

The name of the output scatter plot of NA proportions across samples and genes/microRNAs.
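
To make the NA-related arguments concrete, the sketch below shows one way the NA proportions could be computed and compared against a threshold such as mir_na_thre. The helper names na_proportions and plot_na_proportions, the toy matrix, and the plot layout are hypothetical; the package's own plotting code may differ.

```r
# Hypothetical helper: proportion of NA values per feature and per sample.
na_proportions <- function(x) {
  list(
    by_feature = rowMeans(is.na(x)),  # one value per gene/microRNA (row)
    by_sample  = colMeans(is.na(x))   # one value per sample (column)
  )
}

# Hypothetical helper: scatter plots of the two sets of proportions.
plot_na_proportions <- function(p) {
  op <- par(mfrow = c(1, 2))
  on.exit(par(op))
  plot(p$by_feature, xlab = "gene/microRNA index", ylab = "proportion of NA")
  plot(p$by_sample, xlab = "sample index", ylab = "proportion of NA")
}

# Toy microRNA matrix: m2 is entirely NA.
x <- matrix(
  c(1,  NA, 3,
    NA, NA, NA,
    7,  8,  9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("m1", "m2", "m3"), c("s1", "s2", "s3"))
)
p <- na_proportions(x)

# Rows at or below the threshold are kept (0.95 mirrors mir_na_thre).
keep <- p$by_feature <= 0.95
```

Here only m2 exceeds the 0.95 threshold, so keep is TRUE for m1 and m3. Calling plot_na_proportions(p) would draw the two scatter plots on the active graphics device.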

Value

A data matrix with normalized gene/microRNA expression data, and a scatter plot of NA proportions across samples and genes/microRNAs.

See Also

tcgaTableGenerator for generating a gene expression data matrix from individual FPKM files downloaded from the GDC Data Portal; tcgaConvRownames for converting the row names of a data matrix.

Examples

tcgaNormalizer(gen.luad.m, dataType = 'mRNA', filename = 'scatter plot of genes')
tcgaNormalizer(mir.luad.m, dataType = 'microRNA', filename = 'scatter plot of miRNAs')

YC3/mirNet documentation built on Sept. 3, 2020, 3:25 a.m.