sig_genes_extract: Extract significant genes

View source: R/sig_genes_extract.R

sig_genes_extract    R Documentation

Extract significant genes

Description

Extracts the top n significant genes from the layer-level modeling results. This is the workhorse function behind sig_genes_extract_all(), whose output can then be used by functions such as layer_boxplot() to construct informative titles.

Usage

sig_genes_extract(
  n = 10,
  modeling_results = fetch_data(type = "modeling_results"),
  model_type = names(modeling_results)[1],
  reverse = FALSE,
  sce_layer = fetch_data(type = "sce_layer")
)

Arguments

n

The number of the top ranked genes to extract.

modeling_results

Defaults to the output of fetch_data(type = 'modeling_results'). This is a list of tables with the columns f_stat_* or t_stat_* as well as p_value_* and fdr_* plus ensembl. The column name is used to extract the statistic results, the p-values, and the FDR adjusted p-values. Then the ensembl column is used for matching in some cases. See fetch_data() for more details. Typically this is the set of reference statistics used in layer_stat_cor().

model_type

A named element of the modeling_results list. By default the available models are: enrichment, which tests one human brain layer against the rest (one group vs the rest); pairwise, which compares two layers (groups) denoted by layerA-layerB such that layerA is greater than layerB; and anova, which determines if any layer (group) is different from the rest, adjusting for the mean expression level. The statistics for enrichment and pairwise are t-statistics, while those for the anova model are F-statistics.

reverse

A logical(1) indicating whether to multiply the input statistics by -1 and reverse the layerA-layerB column names (splitting on the -) into layerB-layerA.

sce_layer

Defaults to the output of fetch_data(type = 'sce_layer'). This is a SingleCellExperiment object with the spot-level Visium data compressed via pseudo-bulking to the layer-level (group-level) resolution. See fetch_data() for more details.
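
To make the effect of reverse more concrete, the following base R sketch flips the sign of the statistics and swaps the two layer names in a toy pairwise table. It is only an illustration under stated assumptions, not the code used by spatialLIBD: the helper reverse_pairwise(), the gene IDs, and all numbers are hypothetical, and the real function may handle its columns differently.

```r
# Toy pairwise table: hypothetical gene IDs and made-up statistics.
toy_pairwise <- data.frame(
    "t_stat_Layer1-Layer2" = c(2.5, -1.0, 0.3),
    "p_value_Layer1-Layer2" = c(0.01, 0.30, 0.80),
    ensembl = c("ENSG_A", "ENSG_B", "ENSG_C"),
    check.names = FALSE
)

# Hypothetical helper (not part of spatialLIBD): negate the statistics and
# rename layerA-layerB columns to layerB-layerA.
reverse_pairwise <- function(tab) {
    # Flip the sign of every t-statistic column
    stat_cols <- grep("^t_stat_", colnames(tab))
    tab[stat_cols] <- -1 * tab[stat_cols]

    # Swap the two layer names around the "-" in every paired column name
    has_pair <- grepl("-", colnames(tab), fixed = TRUE)
    colnames(tab)[has_pair] <- sub(
        "^(.*_)([^_-]+)-([^_-]+)$", "\\1\\3-\\2",
        colnames(tab)[has_pair]
    )
    tab
}

reversed <- reverse_pairwise(toy_pairwise)
colnames(reversed)
```

The p-values are left untouched in this sketch, since reversing the direction of a two-sided comparison changes only the sign of the statistic.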

Value

A data.frame() with the top n significant genes (as ordered by their statistics in decreasing order) in long format. The specific columns are described further in the vignette.
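
As a rough illustration of what "top n genes in long format, ordered by decreasing statistic" means, the base R sketch below ranks each statistic column of a toy table and stacks the results. It makes several assumptions: the helper top_n_long(), the output column names (test, ensembl, stat, top), the gene IDs, and the numbers are all hypothetical, and the columns actually returned by sig_genes_extract() are the ones described in the vignette.

```r
# Toy enrichment-style table: hypothetical gene IDs and made-up statistics.
toy_enrichment <- data.frame(
    t_stat_Layer1 = c(3.1, -0.4, 1.2, 5.0),
    t_stat_Layer2 = c(-2.0, 4.4, 0.1, 0.9),
    ensembl = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D")
)

# Hypothetical helper (not part of spatialLIBD): for each statistic column,
# keep the n genes with the largest statistic and stack them in long format.
top_n_long <- function(tab, n = 2) {
    stat_cols <- grep("^[ft]_stat_", colnames(tab), value = TRUE)
    per_test <- lapply(stat_cols, function(col) {
        # Row indices of the n largest statistics for this test
        ord <- order(tab[[col]], decreasing = TRUE)
        ord <- ord[seq_len(min(n, nrow(tab)))]
        data.frame(
            test = rep(sub("^[ft]_stat_", "", col), length(ord)),
            ensembl = tab$ensembl[ord],
            stat = tab[[col]][ord],
            top = seq_along(ord)
        )
    })
    # One row per (test, gene) pair
    do.call(rbind, per_test)
}

top_n_long(toy_enrichment, n = 2)
```

Each row of the result corresponds to one gene for one test, so a gene that ranks highly in several tests appears several times.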

References

Adapted from https://github.com/LieberInstitute/HumanPilot/blob/master/Analysis/Layer_Guesses/layer_specificity_functions.R

See Also

Other Layer modeling functions: layer_boxplot(), sig_genes_extract_all()

Examples


## Obtain the necessary data
if (!exists("modeling_results")) {
    modeling_results <- fetch_data(type = "modeling_results")
}
if (!exists("sce_layer")) sce_layer <- fetch_data(type = "sce_layer")

## anova top 10 genes
sig_genes_extract(
    modeling_results = modeling_results,
    sce_layer = sce_layer
)

## Extract all genes
sig_genes_extract(
    modeling_results = modeling_results,
    sce_layer = sce_layer,
    n = nrow(sce_layer)
)

LieberInstitute/spatialLIBD documentation built on Dec. 19, 2024, 7:12 p.m.