deconvolute_and_contextualize: Generate cell weighted Fold-Changes (cwFold-changes)

View source: R/deconvolute_and_contextualize.R

deconvolute_and_contextualize R Documentation

Generate cell weighted Fold-Changes (cwFold-changes)

Description

This function takes a count matrix, signature matrix, and differentially expressed genes (DEGs) before generating cwFold-changes for each cell-type.

Usage

deconvolute_and_contextualize(
  count_file,
  signature_matrix,
  DEG_list,
  case_grep,
  control_grep,
  max_proportion_change = -9,
  print_plots = TRUE,
  plot_names = "scMappR",
  theSpecies = "human",
  FC_coef = TRUE,
  sig_matrix_size = 3000,
  drop_unknown_celltype = TRUE,
  toSave = FALSE,
  path = NULL,
  deconMethod = "DeconRNASeq",
  rareCT_filter = TRUE
)

Arguments

count_file

Normalized (e.g. CPM, TPM, RPKM) RNA-seq count matrix where rows are gene symbols and columns are individuals. Either the matrix itself, of class "matrix" or "data.frame", or a path to a tsv file containing these counts. The gene symbols in the count file, signature matrix, and DEG list must match.
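Because gene symbols must match across the three inputs, it can help to check the overlap before calling the function. The following is a minimal base R sketch, not part of scMappR; the toy matrices, gene symbols, and variable names are made up for illustration.

```r
# Toy inputs; every name and value here is hypothetical.
counts <- matrix(c(10, 12, 0, 5, 7, 3), nrow = 3,
                 dimnames = list(c("CD3E", "MS4A1", "LYZ"),
                                 c("s1_case", "s2_control")))
signature <- matrix(c(4.2, 0.1, 0.3, 5.0), nrow = 2,
                    dimnames = list(c("CD3E", "MS4A1"),
                                    c("T_cell", "B_cell")))
deg_genes <- c("CD3E", "LYZ", "XIST")

# Genes shared by the count matrix and the signature matrix
shared_with_signature <- intersect(rownames(counts), rownames(signature))

# Genes shared by the count matrix and the DEG list
shared_with_degs <- intersect(rownames(counts), deg_genes)

# DEGs that are absent from the count matrix
missing_from_counts <- setdiff(deg_genes, rownames(counts))
```

A small or empty overlap usually points to mismatched identifiers (for example Ensembl IDs in one input and gene symbols in another).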

signature_matrix

Signature matrix (fold-change ratios) of cell-type specificity of genes. Either the object itself or a path to an .RData file containing an object named "wilcoxon_rank_mat_or". We strongly recommend inputting the signature matrix directly.

DEG_list

An object whose first column contains gene symbols from the bulk dataset (they do not have to be in the signature matrix), whose second column contains the adjusted P-value, and whose third column contains the log2FC. A path to a tsv file containing this information is also acceptable.
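A DEG list with this three-column layout could be assembled as in the sketch below. The column names (gene_name, padj, log2fc) and the values are illustrative assumptions; the description above only fixes the column order. Whether a tsv input needs a header row is not stated here, so the round trip through a file is shown only to illustrate the tab-separated layout.

```r
# Hypothetical DEG table: gene symbol, adjusted P-value, log2 fold-change
deg_list <- data.frame(
  gene_name = c("CD3E", "LYZ", "XIST"),
  padj      = c(0.01, 0.03, 0.001),
  log2fc    = c(1.5, -2.0, 3.1),
  stringsAsFactors = FALSE
)

# Write the same table as a tab-separated file and read it back
tsv_path <- tempfile(fileext = ".tsv")
write.table(deg_list, tsv_path, sep = "\t", quote = FALSE, row.names = FALSE)
deg_from_file <- read.table(tsv_path, sep = "\t", header = TRUE,
                            stringsAsFactors = FALSE)
```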

case_grep

Tag in the column names identifying cases (i.e. samples representing upregulated genes), or an index of the case columns.

control_grep

Tag in the column names identifying controls (i.e. samples representing downregulated genes), or an index of the control columns.
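The sketch below shows how a tag can select sample columns with base R's grep(). It assumes, based on the argument names, that tags are matched as patterns against column names; the sample names are made up.

```r
# Hypothetical column names of a count matrix
sample_names <- c("p1_female", "p2_female", "p3_male", "p4_male")

case_grep    <- "_female"
control_grep <- "_male"

# Column indices whose names contain each tag
case_idx    <- grep(case_grep, sample_names)
control_idx <- grep(control_grep, sample_names)

# A tag that is too short can match both groups:
# "male" is also a substring of "female"
ambiguous_idx <- grep("male", sample_names)
```

Including the leading underscore keeps the two groups disjoint here. Passing integer indices instead of tags avoids the ambiguity altogether.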

max_proportion_change

Maximum cell-type proportion change. May be useful if a cell-type does not exist in one condition, thus preventing infinite values.
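To illustrate why a cap is useful, the sketch below computes case/control proportion ratios for toy data in which one cell-type is absent from controls. The helper cap_ratio() and its symmetric bounds (between 1/max and max) are assumptions made for illustration, not the package's internal code.

```r
# Hypothetical helper: clamp ratios to the range [1 / max_change, max_change]
cap_ratio <- function(ratio, max_change) {
  pmin(pmax(ratio, 1 / max_change), max_change)
}

# Toy average cell-type proportions per condition
case_prop    <- c(T_cell = 0.30, B_cell = 0.10, NK = 0.05)
control_prop <- c(T_cell = 0.15, B_cell = 0.00, NK = 0.50)

# B_cell is absent in controls, so its raw ratio is infinite
raw_ratio <- case_prop / control_prop

# With a maximum change of 10, the infinite value becomes 10
capped_ratio <- cap_ratio(raw_ratio, max_change = 10)
```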

print_plots

Whether boxplots of the estimated CT proportion for the leave-one-out method of CT deconvolution should be printed (T/F).

plot_names

If plots are being printed, the prefix of their .pdf files.

theSpecies

Internal species designation to be passed from 'scMappR_and_pathway_analysis'. It only impacts this function if data are taken directly from the PanglaoDB database (i.e. not reprocessed by scMappR or the user).

FC_coef

Whether cwFold-changes are based on fold-change (TRUE) or on rank, defined as -log10(Pval) (FALSE). After testing, we strongly recommend keeping this set to TRUE (T/F).
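The two per-gene quantities can be compared on toy data as sketched below. The fold-change form follows the sign*2^abs(log2fc) term given in Details; applying the same sign to the rank is an assumption made by analogy, and the column names are illustrative.

```r
# Hypothetical DEG table
deg <- data.frame(
  gene_name = c("geneA", "geneB"),
  padj      = c(0.001, 0.04),
  log2fc    = c(2, -1)
)

direction <- sign(deg$log2fc)

# FC_coef = TRUE: signed linear fold-change
fc_weight <- direction * 2^abs(deg$log2fc)

# FC_coef = FALSE: rank defined as -log10(Pval), signed here by assumption
rank_weight <- direction * (-log10(deg$padj))
```

The fold-change form reflects effect size, whereas the rank form reflects statistical significance, which also depends on sample size and variance.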

sig_matrix_size

Number of genes in signature matrix for cell-type deconvolution.

drop_unknown_celltype

Whether or not to remove "unknown" cell-types from the signature matrix (T/F).

toSave

Allow scMappR to write files in the current directory (T/F).

path

If toSave == TRUE, path to the directory where files will be saved.

deconMethod

Which RNA-seq deconvolution method to use to estimate cell-type proportions. Options are "WGCNA", "DCQ", or "DeconRNASeq".

rareCT_filter

Whether to filter out cell-types rarer than 0.1 percent of the population (T/F). Setting to FALSE keeps these rare cell-types and may lead to false positives.

Details

This function completes the pre-processing, normalization, and scaling steps in the scMappR algorithm before calculating cwFold-changes. cwFold-changes scale bulk fold-changes by the cell-type specificity of the gene, the gene-normalized cell-type proportions, and the reciprocal ratio of cell-type proportions between the two conditions. cwFold-changes are generated for genes that are in both the count matrix and the list of DEGs. These genes do not also have to be in the signature matrix.

First, this function estimates cell-type proportions with all genes included, then tests for changes in cell-type proportion between cases and controls using a t-test. Next, it takes a leave-one-out approach to cell-type deconvolution, such that estimated cell-type proportions are computed for every inputted DEG. Optionally, the differences between cell-type proportions before and after a gene is removed are plotted as boxplots.

Then, for every gene, cwFold-changes are computed with the following formula (shown here for upregulated genes):

val <- cell-preferences * cell-type_proportion * cell-type_proportion_fold-change * sign*2^abs(gene_DE$log2fc)

A matrix of cwFold-changes for all DEGs is returned.
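The formula for a single upregulated gene can be worked through with toy numbers as in the sketch below. The function cwfc_for_gene() and all input values are hypothetical; how scMappR derives each term internally (including the reciprocal ratio used for the other direction) is not reproduced here.

```r
# Hypothetical helper mirroring the formula above for one gene
cwfc_for_gene <- function(cell_preferences, proportion,
                          proportion_fold_change, log2fc) {
  cell_preferences * proportion * proportion_fold_change *
    sign(log2fc) * 2^abs(log2fc)
}

# Toy per-cell-type inputs for a gene with log2FC = 1 (2-fold up)
cell_preferences       <- c(T_cell = 4,   B_cell = 1)    # cell-type specificity
proportion             <- c(T_cell = 0.6, B_cell = 0.4)  # estimated proportions
proportion_fold_change <- c(T_cell = 0.5, B_cell = 2)    # between conditions

cwfc <- cwfc_for_gene(cell_preferences, proportion,
                      proportion_fold_change, log2fc = 1)
```

In this toy case the gene is four times more specific to T cells than to B cells, yet the two cwFold-changes (2.4 and 1.6) are much closer, because the proportion fold-change term offsets part of the specificity difference.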

Value

List with the following elements:

cellWeighted_Foldchange

data frame of cwFold-changes for each gene.

cellType_Proportions

data frame of cell-type proportions from DeconRNASeq.

leave_one_out_proportions

data frame of average cell-type proportions for case and control when gene is removed.

processed_signature_matrix

signature matrix used in final analysis.

Examples


library(scMappR)

# Load the example PBMC data shipped with scMappR
data(PBMC_example)
bulk_DE_cors <- PBMC_example$bulk_DE_cors        # DEG list
bulk_normalized <- PBMC_example$bulk_normalized  # normalized count matrix
odds_ratio_in <- PBMC_example$odds_ratio_in      # signature matrix

# Tags in the column names that identify cases and controls
case_grep <- "_female"
control_grep <- "_male"

max_proportion_change <- 10
print_plots <- FALSE
theSpecies <- "human"

# Generate cwFold-changes without writing any files
cwFC <- deconvolute_and_contextualize(
  count_file = bulk_normalized,
  signature_matrix = odds_ratio_in,
  DEG_list = bulk_DE_cors,
  case_grep = case_grep,
  control_grep = control_grep,
  max_proportion_change = max_proportion_change,
  print_plots = print_plots,
  theSpecies = theSpecies,
  toSave = FALSE
)

scMappR documentation built on July 9, 2023, 6:26 p.m.