cellmap: Estimate the cell type proportions of mixture bulk RNA

View source: R/cellmap.R

cellmap    R Documentation

Estimate the cell type proportions of mixture bulk RNA

Description

This function estimates cell type proportions of mixture bulk RNA samples based on pre-trained cell type profiles.

Usage

cellmap(
  strBulk,
  strProfile,
  strPrefix = substr(strBulk, 1, nchar(strBulk) - 4),
  delCT = NULL,
  cellCol = NULL,
  geneNameReady = FALSE,
  ensemblPath = "Data/",
  ensemblV = 97,
  bReturn = FALSE,
  pCutoff = 0.05,
  core = 2
)

Arguments

strBulk

The full path to the query mixture bulk expression file: a tab-separated expression matrix with genes as rows and samples as columns. The first row contains the sample names or IDs, and the first column contains gene symbols or Ensembl gene IDs.
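
As a minimal sketch of this layout, the base R snippet below writes a small tab-separated matrix to a temporary file and reads it back. The gene symbols, sample names, expression values and the "gene" header label are made up for illustration; whether cellmap expects a label above the first column is an assumption that this page does not confirm.

```r
# Toy expression matrix: rows are genes, columns are samples (values are made up).
expr <- data.frame(
  gene    = c("GFAP", "SNAP25", "MBP"),
  sample1 = c(10.5, 200.1, 33.0),
  sample2 = c(12.0, 180.7, 41.2)
)

# Write a tab-separated file: header row with sample names,
# first column with gene symbols.
strBulk <- tempfile(fileext = ".txt")
write.table(expr, strBulk, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to check the layout (genes become row names).
bulk <- read.table(strBulk, header = TRUE, sep = "\t", row.names = 1)
dim(bulk)
```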

strProfile

The full path to a pre-trained CellMap cell type profile: a file with the ‘rds’ extension generated by the cellMapTraining function.

strPrefix

The path prefix for the result files. Two files are produced: a PDF file containing all cell type decomposition figures, and a tab-separated table of compositions and p-values. Default is strBulk with its last four characters removed.
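
The default shown in the Usage section, substr(strBulk, 1, nchar(strBulk) - 4), simply drops the last four characters of strBulk. The sketch below shows what that expression produces for two hypothetical paths; it suggests that an explicit strPrefix is safer when the file name does not end in a three-letter extension.

```r
# Hypothetical input paths, used only to illustrate the default expression.
strBulk <- "/path/to/bulk.txt"
defaultPrefix <- substr(strBulk, 1, nchar(strBulk) - 4)
defaultPrefix    # "/path/to/bulk"

# Exactly four characters are removed, whatever the extension is.
strBulkGz <- "bulk.tsv.gz"
gzPrefix <- substr(strBulkGz, 1, nchar(strBulkGz) - 4)
gzPrefix         # "bulk.ts"
```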

delCT

Cell types to exclude from the decomposition estimation, given as a single string of exact cell type names as defined in the CellMap profile. To remove more than one cell type, separate the names with commas (,). Default is NULL.
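
One way to build such a string from a character vector is sketched below. The cell type names are hypothetical and must match the names in the profile being used; joining without spaces after the commas is an assumption based on the requirement for exact names.

```r
# Hypothetical cell type names; replace with names defined in your profile.
dropTypes <- c("Microglia", "Endothelial")

# Collapse into a single comma-separated string, with no spaces around commas.
delCT <- paste(dropTypes, collapse = ",")
delCT
```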

cellCol

R colors for the cell types: a named vector of R colors whose names are the cell type names. Default is NULL, in which case $para$cellCol from the provided CellMap profile is used.
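
A named color vector of this kind might look like the sketch below. The cell type names and colors are hypothetical; the col2rgb() call is only a convenient check, since it raises an error for strings that are not valid R colors.

```r
# Hypothetical cell types mapped to R color names or hex codes.
cellCol <- c(Neuron = "steelblue", Astrocyte = "#E69F00", Microglia = "firebrick")

# col2rgb() fails on invalid color strings, so this doubles as validation.
rgbCheck <- grDevices::col2rgb(cellCol)
rgbCheck
```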

geneNameReady

A boolean indicating whether the gene names in the query mixture bulk expression matrix are already official symbols. FALSE also works when the expression matrix already uses official symbols. Default is FALSE, in which case official symbols are looked up with the R package biomaRt.

ensemblPath

The path to a folder where the Ensembl gene definition is or will be saved. The definition file is saved there if the function has not been run with this folder before. Default is Data/ in the current working directory.

ensemblV

The Ensembl version to use for the input query bulk expression. Default is 97.

bReturn

A boolean indicating whether a return object is needed. If FALSE, no object is returned; the plots are written to a PDF file and the tables to a TSV file. If TRUE, an R list object with the details of the raw decomposition results is returned and no files are generated. Default is FALSE.

core

The number of computation nodes that can be used. Default is 2.

pCutoff

A numeric value giving the significance level. Default is 0.05.

Value

If bReturn is set to TRUE, a named list object with detailed decomposition results is returned. The list contains the following objects, which can be accessed with $ on the returned list:

  • compoP A matrix of the raw fitting coefficients for each sample (column) and each cell type (row). Each column must be normalized to sum to 1 in order to match the output composition table.

  • compoP A matrix of the fitting p-values for each sample (column) and each cell type (row).

  • overallP A vector of the overall fitting p-value for each sample.

  • rmse A vector of the fitting RMSE for each sample.

  • coverR A numeric value giving the ratio of cell type signature genes covered in the mixture bulk expression data.

  • rawComp A named list of all raw composition matrices, p-values, and RMSE values for each sample.

  • rawSets A matrix of all sets of pure cell type combinations.

  • missingF A vector of cell type signature genes that are not in the query bulk expression data.

  • missingByCellType A named list of cell type signature genes that are not in the query bulk expression data, for each cell type.
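
The column normalization mentioned for the raw coefficient matrix can be sketched in base R as follows. The matrix here is a made-up stand-in with cell types as rows and samples as columns, not actual cellmap output, and dividing each column by its sum is one straightforward reading of "normalize each column to sum to 1".

```r
# Made-up raw coefficients: cell types in rows, samples in columns.
rawCoef <- matrix(
  c(2, 1, 1,
    0.5, 0.5, 1),
  nrow = 3,
  dimnames = list(c("Neuron", "Astrocyte", "Microglia"),
                  c("sample1", "sample2"))
)

# Divide every column by its own sum so each sample's proportions add up to 1.
prop <- sweep(rawCoef, 2, colSums(rawCoef), "/")
colSums(prop)
```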

Examples

strMix <- system.file("extdata", "bulk.txt", package = "cellmap")
strProfile <- system.file("extdata", "CNS6.rds", package = "cellmap")
cellmap(strMix, strProfile, strPrefix = "~/cellmap_CNS6_test")
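
To work with the results in R instead of writing the PDF and table files, the same call can be made with bReturn = TRUE. The sketch below is guarded so that it only runs when the package is installed; it uses only element names listed in the Value section, and it has not been checked against a particular package version. With the default geneNameReady = FALSE, the call may also look up gene symbols through biomaRt and save the Ensembl definition under ensemblPath.

```r
res <- NULL
if (requireNamespace("cellmap", quietly = TRUE)) {
  library(cellmap)
  strMix <- system.file("extdata", "bulk.txt", package = "cellmap")
  strProfile <- system.file("extdata", "CNS6.rds", package = "cellmap")

  # Return the decomposition details as a list instead of writing result files.
  res <- cellmap(strMix, strProfile, bReturn = TRUE)

  names(res)      # elements described in the Value section
  res$overallP    # overall fitting p-value per sample
  res$rmse        # fitting RMSE per sample
}
```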


interactivereport/CellMap documentation built on March 17, 2024, 2:01 a.m.