scMappR_and_pathway_analysis: Generate cellWeighted_Foldchanges, visualize, and enrich.

View source: R/scMappR_and_pathway_analysis.R

scMappR_and_pathway_analysis    R Documentation

Description

This function generates cell-weighted fold-changes (cellWeighted_Foldchanges), visualizes them in a heatmap, and performs pathway enrichment of the cellWeighted_Foldchanges and the bulk gene list using g:Profiler.

Usage

scMappR_and_pathway_analysis(
  count_file,
  signature_matrix,
  DEG_list,
  case_grep,
  control_grep,
  rda_path = "",
  max_proportion_change = -9,
  print_plots = T,
  plot_names = "scMappR",
  theSpecies = "human",
  output_directory = "scMappR_analysis",
  sig_matrix_size = 3000,
  drop_unknown_celltype = TRUE,
  internet = TRUE,
  up_and_downregulated = FALSE,
  gene_label_size = 0.4,
  number_genes = -9,
  toSave = FALSE,
  newGprofiler = TRUE,
  path = NULL,
  deconMethod = "DeconRNASeq",
  rareCT_filter = TRUE
)

Arguments

count_file

Normalized (e.g. TPM, RPKM, CPM) RNA-seq count matrix where rows are gene symbols and columns are individuals. Input should be a data.frame or matrix. A character string giving the path to a .tsv file from which this data can be loaded is also acceptable. Gene identifiers in the count file, signature matrix, and DEG list must all match (case-sensitive, and of the same type, e.g. all gene symbols or all Ensembl IDs).

signature_matrix

Signature matrix: a gene by cell-type matrix populated with the fold-change of gene expression in cell-type marker "i" vs all other cell-types. Object should be a data.frame or matrix.

DEG_list

An object whose first column contains gene symbols from the bulk dataset (they do not have to be in the signature matrix), whose second column contains the adjusted p-value, and whose third column contains the log2FC. A path to a .tsv file containing this information is also acceptable.
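
Since identifiers must match exactly across the count file, signature matrix, and DEG list, a quick overlap check before running the analysis can catch case or ID-type mismatches. The following is a minimal base-R sketch. The objects `counts`, `sig`, and `degs`, their column names, and all values are hypothetical toy data, not part of scMappR.

```r
# Hypothetical toy inputs. Note the deliberate case mismatch: "Gapdh" vs "GAPDH".
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(c("CD3E", "CD19", "LYZ", "Gapdh"),
                                 c("s1_female", "s2_female", "s3_male")))
sig <- matrix(c(2.1, 0.3, 0.5, 0.2, 3.4, 0.4), nrow = 3,
              dimnames = list(c("CD3E", "CD19", "LYZ"),
                              c("T_cell", "B_cell")))
# DEG list layout: gene symbol, adjusted p-value, log2 fold-change.
degs <- data.frame(gene_name = c("CD3E", "LYZ", "GAPDH"),
                   padj = c(0.01, 0.03, 0.04),
                   log2fc = c(1.2, -0.8, 0.5),
                   stringsAsFactors = FALSE)

# Identifiers present in all three inputs.
shared <- Reduce(intersect,
                 list(rownames(counts), rownames(sig), degs$gene_name))

# DEGs absent from the count matrix often point to a case or ID-type mismatch.
missing_in_counts <- setdiff(degs$gene_name, rownames(counts))
```

Here `missing_in_counts` flags "GAPDH", which differs from the count matrix row "Gapdh" only by case.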

case_grep

A character string identifying the "case" samples in the column names of the count file (upregulated genes are 'case'-biased). A numeric vector of the column indices of the cases is also acceptable.

control_grep

A character string identifying the "control" samples in the column names of the count file (downregulated genes are 'control'-biased). A numeric vector of the column indices of the controls is also acceptable.
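
As an illustration of how a column-name tag relates to a numeric index, the sketch below uses base R's `grep` on hypothetical sample names. It assumes, as the argument names suggest, that the tag is matched as a pattern against the column names; it does not reproduce scMappR's internal code.

```r
# Hypothetical column names of a count matrix.
sample_names <- c("s1_female", "s2_female", "s3_male", "s4_male")

# A character tag selects every column whose name contains it.
case_idx <- grep("_female", sample_names)
control_idx <- grep("_male", sample_names)

# A tag that is too short can match both groups:
# "male" is also a substring of "female".
ambiguous_idx <- grep("male", sample_names)
```

Passing `case_idx` and `control_idx` directly as numeric vectors is the alternative the argument descriptions allow, and it avoids ambiguous tags altogether.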

rda_path

If downloaded, path to where data from scMappR_data is stored.

max_proportion_change

Maximum cell-type proportion change. This may be useful if there are many rare cell-types. It also prevents infinite or zero cwFold-changes when a cell-type is present in only one condition.
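
To see why a cap helps, consider the ratio of a cell-type's estimated proportion between cases and controls. The sketch below is an illustration only: the helper `cap_ratio` and the proportions are hypothetical, and the way scMappR applies the limit internally may differ.

```r
# Hypothetical helper: clamp a proportion ratio to [1 / max_change, max_change].
cap_ratio <- function(ratio, max_change) {
  min(max(ratio, 1 / max_change), max_change)
}

# A rare cell-type detected in cases but absent in controls.
prop_case <- 0.02
prop_control <- 0
raw_ratio <- prop_case / prop_control      # Inf in R

capped_up <- cap_ratio(raw_ratio, 10)      # limited to the upper bound
capped_down <- cap_ratio(prop_control / prop_case, 10)  # 0 raised to the lower bound
```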

print_plots

Whether boxplots of the estimated cell-type (CT) proportions from the leave-one-out method of CT deconvolution should be printed. The same plot names are also used for the top-pathway plots.

plot_names

The prefix of plot pdf files.

theSpecies

human, mouse, or a species directly compatible with gProfileR (i.e. g:Profiler).

output_directory

The name of the directory that will contain output of the analysis.

sig_matrix_size

Maximum number of genes in signature matrix for cell-type deconvolution.

drop_unknown_celltype

Whether or not to remove "unknown" cell-types from the signature matrix.

internet

Whether a stable internet connection is available (T/F).

up_and_downregulated

Whether to additionally analyze upregulated and downregulated genes separately (T/F).

gene_label_size

The size of the gene label on the plot.

number_genes

The maximum number of genes to include in pathway analysis (useful when there are many DEGs).

toSave

Allow scMappR to write files in the current directory (T/F).

newGprofiler

Whether to use gprofiler2 (TRUE) or gProfileR (FALSE).

path

If toSave == TRUE, path to the directory where files will be saved.

deconMethod

Which RNA-seq deconvolution method to use to estimate cell-type proportions. Options are "WGCNA", "DCQ", or "DeconRNASeq".

rareCT_filter

Whether to filter out cell-types rarer than 0.1 percent of the population (T/F). Setting to FALSE keeps these rare cell-types and may lead to false positives.

Details

This function generates cellWeighted_Foldchanges for every cell-type (see deconvolute_and_contextualize), as well as accompanying data such as cell-type proportions estimated with the DeconRNASeq, WGCNA, or DCQ methods. It then generates heatmaps of all cellWeighted_Foldchanges, the cellWeighted_Foldchanges overlapping with the signature matrix, the entire signature matrix, and the cell-type preference values from the signature matrix that overlap with the inputted differentially expressed genes. Finally, if an internet connection is available, it runs gProfileR on the reordered cellWeighted_Foldchanges as well as on the ordered list of genes. This function is a wrapper for deconvolute_and_contextualize and pathway_enrich_internal, and is the primary function within the package.

Value

List with the following elements:

cellWeighted_Foldchanges

Cell-weighted fold-changes for all differentially expressed genes.

paths

Enriched biological pathways for each cell-type.

TFs

Enriched transcription factors (TFs) for each cell-type.
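
The returned list can be inspected by element name. The sketch below uses a stand-in list: the element names follow the Value section, but the contents, shapes, and gene and cell-type names are hypothetical, and which elements are populated in a real run (for example when internet = FALSE) is not shown here.

```r
# Stand-in for the returned list; contents and shapes are hypothetical.
toOut <- list(
  cellWeighted_Foldchanges = matrix(
    c(1.5, -0.2, 0.1, 2.3), nrow = 2,
    dimnames = list(c("CD3E", "LYZ"), c("T_cell", "Monocyte"))
  )
)

# Check which of the documented elements are present before using them.
expected <- c("cellWeighted_Foldchanges", "paths", "TFs")
present <- expected[expected %in% names(toOut)]
absent <- setdiff(expected, names(toOut))

# Access an element by name.
cwfc <- toOut$cellWeighted_Foldchanges
```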

Examples


library(scMappR)

# Load the example PBMC data shipped with scMappR.
data(PBMC_example)
bulk_DE_cors <- PBMC_example$bulk_DE_cors        # DEG list
bulk_normalized <- PBMC_example$bulk_normalized  # normalized count matrix
odds_ratio_in <- PBMC_example$odds_ratio_in      # signature matrix
# Column-name tags identifying case and control samples.
case_grep <- "_female"
control_grep <- "_male"
max_proportion_change <- 10
print_plots <- FALSE
theSpecies <- "human"
# internet = FALSE: the pathway enrichment step requires an internet connection.
toOut <- scMappR_and_pathway_analysis(count_file = bulk_normalized,
                                      signature_matrix = odds_ratio_in,
                                      DEG_list = bulk_DE_cors,
                                      case_grep = case_grep,
                                      control_grep = control_grep,
                                      rda_path = "",
                                      max_proportion_change = max_proportion_change,
                                      print_plots = print_plots,
                                      plot_names = "tst1",
                                      theSpecies = theSpecies,
                                      output_directory = "tester",
                                      sig_matrix_size = 3000,
                                      up_and_downregulated = FALSE,
                                      internet = FALSE)




scMappR documentation built on July 9, 2023, 6:26 p.m.