#' @title Convert the rownames of a data matrix from 'ENSEMBL IDs' to 'HUGO Gene Symbols'.
#'
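#' @description \code{tcgaConvRownames} converts the rownames of a data matrix (gene identifiers) from 'ENSEMBL IDs' to 'HUGO Gene Symbols'. Any version suffix (e.g. '.15') is stripped before the lookup, and rows whose ID has no matching symbol are discarded. The expression of genes that share a symbol is then summarized by taking the average.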
#'
#' @param data A data matrix, with rows referring to genes and columns to samples. Can be the output from \code{\link[mirNet]{tcgaTableGenerator}}.
#'
#' @return A data frame, with gene identifiers (rownames) converted from 'ENSEMBL IDs' to 'HUGO Gene Symbols' and one row per symbol.
#'
#' @seealso \code{\link[mirNet]{tcgaTableGenerator}} for generating a gene expression data matrix from individual FPKM files downloaded from the GDC Data Portal.
#'
#' @import org.Hs.eg.db
#' @importFrom magrittr "%>%"
#' @importFrom dplyr group_by
#' @importFrom dplyr summarise_all
#' @importFrom AnnotationDbi mapIds
#'
#' @export tcgaConvRownames
#'
#' @examples
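#' data <- matrix(1:4, nrow = 2, dimnames = list(c("ENSG00000141510.15", "ENSG00000012048.19"), c("s1", "s2")))  # toy input, illustrative values only
#' tcgaConvRownames(data)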
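tcgaConvRownames <- function(data){
  # Strip the version suffix (e.g. ".15") from each ENSEMBL ID, then look up its gene symbol.
  keys <- sapply(strsplit(rownames(data), "\\."), '[', 1)
  symbols <- mapIds(org.Hs.eg.db, keys = keys, keytype = 'ENSEMBL', column = 'SYMBOL')
  # Keep only the rows whose ID maps to a symbol; drop = FALSE preserves the matrix shape if one row remains.
  id <- which(!is.na(symbols))
  data2 <- data[id, , drop = FALSE]
  rownames(data2) <- symbols[id]
  # Average the rows that share a symbol.
  data3 <- as.data.frame(data2) %>% group_by(rownames(data2)) %>% summarise_all(mean) %>% as.data.frame(stringsAsFactors = FALSE)
  rownames(data3) <- data3[, 1]
  # drop = FALSE keeps a data frame (and its rownames) even with a single sample column.
  data3[, -1, drop = FALSE]
}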