distinct_test: Test for differential state between two groups of samples,...
In distinct: distinct: a method for differential analyses via hierarchical permutation tests

Description Usage Arguments Value Author(s) See Also Examples

distinct_test tests for differential state between two groups of samples.

distinct_test(
  x,
  name_assays_expression = "logcounts",
  name_cluster = "cluster_id",
  name_sample = "sample_id",
  design,
  column_to_test = 2,
  P_1 = 100,
  P_2 = 500,
  P_3 = 2000,
  P_4 = 10000,
  N_breaks = 25,
  min_non_zero_cells = 20,
  n_cores = 1
)

`x`	a `SummarizedExperiment` or a `SingleCellExperiment` object.
`name_assays_expression`	a character ("logcounts" by default), indicating the name of the assays(x) element which stores the expression data (i.e., assays(x)$name_assays_expression). We strongly encourage using normalized data, such as counts per million (CPM) or log2-CPM (e.g., 'logcounts' as created via `scater::logNormCounts`). When additional covariates are provided (e.g., batch effects), we highly recommend using log-normalized data, such as log2-CPM.
`name_cluster`	a character ("cluster_id" by default), indicating the name of the colData(x) element which stores the cluster id of each cell (i.e., colData(x)$name_cluster).
`name_sample`	a character ("sample_id" by default), indicating the name of the colData(x) element which stores the sample id of each cell (i.e., colData(x)$name_sample).
`design`	a `matrix` or `data.frame` with the design matrix of the study (e.g., built via model.matrix(~batches)). design must contain one row per sample, while its columns include the intercept, the group, and any covariates such as batches. Row names of design must indicate the sample ids, and correspond to the names in colData(x)$name_sample.
`column_to_test`	indicates the column(s) of the design one wants to test (do not include the intercept).
`P_1`	the number of permutations to use on all gene-cluster combinations (100 by default).
`P_2`	the number of permutations to use, when a (raw) p-value is < 0.1 (500 by default).
`P_3`	the number of permutations to use, when a (raw) p-value is < 0.01 (2,000 by default).
`P_4`	the number of permutations to use, when a (raw) p-value is < 0.001 (10,000 by default). In order to obtain a finer ranking for the most significant genes, if computational resources are available, we encourage users to set P_4 = 20,000.
`N_breaks`	the number of breaks at which to evaluate the cumulative distribution function (25 by default).
`min_non_zero_cells`	the minimum number of non-zero cells (across all samples) in each cluster for a gene to be evaluated.
`n_cores`	the number of cores to parallelize the tasks on (parallelization is at the cluster level: each cluster is parallelized on a thread).
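
The escalating permutation budget described for P_1 to P_4 can be pictured with a small sketch. The helper `permutations_for_p_value` below is hypothetical (it is not part of distinct); it only maps a raw p-value to the permutation count stated in the argument descriptions, and does not show how the package chains these stages internally.

```r
# Hypothetical helper, not part of the distinct package: maps a raw p-value
# to the number of permutations documented for P_1, ..., P_4.
permutations_for_p_value = function(p_val, P_1 = 100, P_2 = 500,
                                    P_3 = 2000, P_4 = 10000) {
  if (p_val < 0.001) {
    P_4
  } else if (p_val < 0.01) {
    P_3
  } else if (p_val < 0.1) {
    P_2
  } else {
    P_1
  }
}

# Toy raw p-values (made up for illustration):
p_vals = c(0.5, 0.05, 0.005, 0.0005)
sapply(p_vals, permutations_for_p_value)
```

Under these thresholds, only gene-cluster combinations with small raw p-values receive the larger permutation counts, which is why increasing P_4 mainly refines the ranking of the most significant results.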

A data.frame object. Columns 'gene' and 'cluster_id' contain the gene and cell-cluster name, while 'p_val', 'p_adj.loc' and 'p_adj.glb' report the raw p-values and the locally and globally adjusted p-values, obtained via Benjamini and Hochberg (BH) correction. For locally adjusted p-values ('p_adj.loc'), BH correction is applied in each cluster separately, while for globally adjusted p-values ('p_adj.glb'), BH correction is applied jointly to the results from all clusters. Column 'filtered' indicates whether a gene-cluster result was filtered (if TRUE) or analyzed (if FALSE). A gene-cluster combination is filtered when fewer than 'min_non_zero_cells' non-zero cells are available. Filtered results have raw and adjusted p-values equal to 1.
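
The difference between the local and global adjustment can be illustrated with base R's `p.adjust`. This is a minimal sketch on made-up p-values: the data frame `res_toy` is hypothetical and only mimics the column names described above; it is not output from distinct_test, and it ignores the handling of filtered results.

```r
# Toy results (made-up p-values), mimicking the 'gene', 'cluster_id' and
# 'p_val' columns of the output:
res_toy = data.frame(
  gene = rep(c("g1", "g2", "g3"), times = 2),
  cluster_id = rep(c("A", "B"), each = 3),
  p_val = c(0.001, 0.02, 0.5, 0.03, 0.04, 0.9)
)

# Local adjustment: BH applied within each cluster separately.
res_toy$p_adj.loc = ave(res_toy$p_val, res_toy$cluster_id,
                        FUN = function(p) p.adjust(p, method = "BH"))

# Global adjustment: BH applied once to the p-values from all clusters.
res_toy$p_adj.glb = p.adjust(res_toy$p_val, method = "BH")

res_toy
```

Because the global adjustment corrects for all gene-cluster tests at once, it is typically the more conservative of the two when many clusters are analyzed.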

Simone Tiberi simone.tiberi@uzh.ch

plot_cdfs, plot_densities, log2_FC, top_results

# load the input data:
data("Kang_subset", package = "distinct")
Kang_subset

# create the design of the study:
samples = Kang_subset@metadata$experiment_info$sample_id
group = Kang_subset@metadata$experiment_info$stim
design = model.matrix(~group)
# rownames of the design must indicate sample ids:
rownames(design) = samples
design

# Note that the sample names in `colData(x)$name_sample` have to be the same ones as those in `rownames(design)`.
rownames(design)
unique(SingleCellExperiment::colData(Kang_subset)$sample_id)

# In order to obtain a finer ranking for the most significant genes, if computational resources are available, we encourage users to increase P_4 (i.e., the number of permutations when a raw p-value is < 0.001) and set P_4 = 20,000 (by default P_4 = 10,000).

# The group we would like to test for is in the second column of the design, therefore we will specify: column_to_test = 2

set.seed(61217)
res = distinct_test(
  x = Kang_subset, 
  name_assays_expression = "logcounts",
  name_cluster = "cell",
  design = design,
  column_to_test = 2,
  min_non_zero_cells = 20,
  n_cores = 2)

# We can optionally add the fold change (FC) and log2-FC between groups:
res = log2_FC(res = res,
  x = Kang_subset, 
  name_assays_expression = "cpm",
  name_group = "stim",
  name_cluster = "cell")

# Visualize significant results:
head(top_results(res))

# Visualize significant results from a specified cluster of cells:
top_results(res, cluster = "Dendritic cells")

# By default, results from 'top_results' are sorted by (globally) adjusted p-value;
# they can also be sorted by log2-FC:
top_results(res, cluster = "Dendritic cells", sort_by = "log2FC")

# Visualize significant UP-regulated genes only:
top_results(res, up_down = "UP",
  cluster = "Dendritic cells")

# Plot density and cdf for gene 'ISG15' in cluster 'Dendritic cells'.
plot_densities(x = Kang_subset,
  gene = "ISG15",
  cluster = "Dendritic cells",
  name_assays_expression = "logcounts",
  name_cluster = "cell",
  name_sample = "sample_id",
  name_group = "stim")
 
plot_cdfs(x = Kang_subset,
  gene = "ISG15",
  cluster = "Dendritic cells",
  name_assays_expression = "logcounts",
  name_cluster = "cell",
  name_sample = "sample_id",
  name_group = "stim")
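
The example above uses a design with an intercept and a group column only. As a rough sketch of a design that also includes a covariate, the snippet below builds a matrix with a batch term and checks that its row names match the per-cell sample ids. All objects ending in `_toy`, as well as `cell_sample_ids`, are made up for illustration and are not taken from Kang_subset.

```r
# Toy sample annotation (made up; not taken from Kang_subset):
samples_toy = paste0("sample", 1:6)
group_toy = factor(rep(c("ctrl", "stim"), each = 3))
batch_toy = factor(rep(c("b1", "b2", "b3"), times = 2))

# Intercept, group and batch columns; one row per sample:
design_toy = model.matrix(~ group_toy + batch_toy)
rownames(design_toy) = samples_toy
design_toy

# Sample ids as they would appear, one per cell, in colData(x)$sample_id:
cell_sample_ids = rep(samples_toy, times = c(5, 3, 4, 6, 2, 7))

# Check that the design rows and the per-cell sample ids agree:
stopifnot(setequal(rownames(design_toy), unique(cell_sample_ids)))

# The group effect is in column 2, so one would pass column_to_test = 2:
colnames(design_toy)
```

With covariates in the design, the argument description for name_assays_expression recommends log-normalized data such as 'logcounts'.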

distinct documentation built on Nov. 8, 2020, 8:20 p.m.
