getCrossAssociation: Calculate correlations between features of two experiments.

getCrossAssociationR Documentation

Calculate correlations between features of two experiments.

Description

Calculate correlations between features of two experiments.

Usage

getCrossAssociation(x, ...)

## S4 method for signature 'MultiAssayExperiment'
getCrossAssociation(
  x,
  experiment1 = 1,
  experiment2 = 2,
  assay.type1 = assay_name1,
  assay_name1 = "counts",
  assay.type2 = assay_name2,
  assay_name2 = "counts",
  altexp1 = NULL,
  altexp2 = NULL,
  col.var1 = colData_variable1,
  colData_variable1 = NULL,
  col.var2 = colData_variable2,
  colData_variable2 = NULL,
  by = MARGIN,
  MARGIN = 1,
  method = c("kendall", "spearman", "categorical", "pearson"),
  mode = "table",
  p.adj.method = p_adj_method,
  p_adj_method = c("fdr", "BH", "bonferroni", "BY", "hochberg", "holm", "hommel", "none"),
  p.adj.threshold = p_adj_threshold,
  p_adj_threshold = NULL,
  cor.threshold = cor_threshold,
  cor_threshold = NULL,
  sort = FALSE,
  filter.self.cor = filter_self_correlations,
  filter_self_correlations = FALSE,
  verbose = TRUE,
  test.signif = test_significance,
  test_significance = FALSE,
  show.warnings = show_warnings,
  show_warnings = TRUE,
  paired = FALSE,
  ...
)

## S4 method for signature 'SummarizedExperiment'
getCrossAssociation(x, experiment2 = x, ...)

Arguments

x

A MultiAssayExperiment or SummarizedExperiment object.

...

Additional arguments:

  • symmetric: Logical scalar. Specifies if measure is symmetric or not. When symmetric = TRUE, associations are calculated only for unique variable-pairs, and they are assigned to corresponding variable-pair. This decreases the number of calculations in 2-fold meaning faster execution. (By default: FALSE)

  • association.fun: A function that is used to calculate (dis-)similarity between features. Function must take matrix as an input and give numeric values as an output. Adjust method and other parameters correspondingly. Supported functions are, for example, stats::dist and vegan::vegdist.

experiment1

Character scalar or numeric scalar. Selects the experiment 1 from experiments(x) of MultiassayExperiment object. (Default: 1)

experiment2

Character scalar or numeric scalar. Selects the experiment 2 fromexperiments(x) of MultiAssayExperiment object or altExp(x) of TreeSummarizedExperiment object. Alternatively, experiment2 can also be TreeSE object when x is TreeSE object. (Default: 2 when x is MAE and x when x is TreeSE)

assay.type1

Character scalar. Specifies the name of the assay in experiment 1 to be transformed.. (Default: "counts")

assay_name1

Deprecated. Use assay.type1 instead.

assay.type2

Character scalar. Specifies the name of the assay in experiment 2 to be transformed.. (Default: "counts")

assay_name2

Deprecated. Use assay.type2 instead.

altexp1

Character scalar or numeric scalar. Specifies alternative experiment from the altExp of experiment 1. If NULL, then the experiment is itself and altExp option is disabled. (Default: NULL)

altexp2

Character scalar or numeric scalar. Specifies alternative experiment from the altExp of experiment 2. If NULL, then the experiment is itself and altExp option is disabled. (Default: NULL)

col.var1

Character scalar. Specifies column(s) from colData of experiment 1. If col.var1 is used, assay.type1 is disabled. (Default: NULL)

colData_variable1

Deprecated. Use col.var1 instead.

col.var2

Character scalar. Specifies column(s) from colData of experiment 2. If col.var2 is used, assay.type2 is disabled. (Default: NULL)

colData_variable2

Deprecated. Use col.var2 instead.

by

ACharacter scalar. Determines if association are calculated row-wise / for features ('rows') or column-wise / for samples ('cols'). Must be 'rows' or 'cols'.

MARGIN

Deprecated. Use by instead.

method

Character scalar. Defines the association method ('kendall', pearson', or 'spearman' for continuous/numeric; 'categorical' for discrete) (Default: "kendall")

mode

Character scalar. Specifies the output format Available formats are 'table' and 'matrix'. (Default: "table")

p.adj.method

Character scalar. Specifies adjustment method of p-values. Passed to p.adjust function. (Default: "fdr")

p_adj_method

Deprecated. Use p.adj.method instead.

p.adj.threshold

Numeric scalar. From 0 to 1, specifies adjusted p-value threshold for filtering. (Default: NULL)

p_adj_threshold

Deprecated. Use p.dj.threshold instead.

cor.threshold

Numeric scalar. From 0 to 1, specifies correlation threshold for filtering. (Default: NULL)

cor_threshold

Deprecated. Use cor.threshold instead.

sort

Logical scalar. Specifies whether to sort features or not in result matrices. Used method is hierarchical clustering. (Default: FALSE)

filter.self.cor

Logical scalar. Specifies whether to filter out correlations between identical items. Applies only when correlation between experiment itself is tested, i.e., when assays are identical. (Default: FALSE)

filter_self_correlations

Deprecated. Use filter.self.cor instead.

verbose

Logical scalar. Specifies whether to get messages about progress of calculation. (Default: FALSE)

test.signif

Logical scalar. Specifies whether to test statistical significance of associations. (Default: FALSE)

test_significance

Deprecated. Use test.signif instead.

show.warnings

Logical scalar. specifies whether to show warnings that might occur when correlations and p-values are calculated. (Default: FALSE)

show_warnings

Deprecated. use show.warnings instead.

paired

Logical scalar. Specifies if samples are paired or not. colnames must match between twp experiments. paired is disabled when by = 1. (Default: FALSE)

Details

The function getCrossAssociation calculates associations between features of two experiments. By default, it not only computes associations but also tests their significance. If desired, setting test.signif to FALSE disables significance calculation.

We recommend the non-parametric Kendall's tau as the default method for association analysis. Kendall's tau has desirable statistical properties and robustness at lower sample sizes. Spearman rank correlation can provide faster solutions when running times are critical.

Value

This function returns associations in table or matrix format. In table format, returned value is a data frame that includes features and associations (and p-values) in columns. In matrix format, returned value is a one matrix when only associations are calculated. If also significances are tested, then returned value is a list of matrices.

Examples

data(HintikkaXOData)
mae <- HintikkaXOData

# Subset so that less observations / quicker to run, just for example
mae[[1]] <- mae[[1]][1:20, 1:10]
mae[[2]] <- mae[[2]][1:20, 1:10]
# Several rows in the counts assay have a standard deviation of zero
# Remove them, since they do not add useful information about
# cross-association
mae[[1]] <- mae[[1]][rowSds(assay(mae[[1]])) > 0, ]
# Transform data
mae[[1]] <- transformAssay(mae[[1]], method = "rclr")

# Calculate cross-correlations
result <- getCrossAssociation(mae, method = "pearson", assay.type2 = "nmr")
# Show first 5 entries
head(result, 5)

# Use altExp option to specify alternative experiment from the experiment
altExp(mae[[1]], "Phylum") <- agglomerateByRank(mae[[1]], rank = "Phylum")
# Transform data
altExp(mae[[1]], "Phylum") <- transformAssay(
    altExp(mae[[1]], "Phylum"), method = "relabundance")
# When mode = "matrix", the return value is a matrix
result <- getCrossAssociation(
    mae, experiment2 = 2, assay.type1 = "relabundance", assay.type2 = "nmr",
    altexp1 = "Phylum", method = "pearson", mode = "matrix")
# Show first 5 entries
head(result, 5)

# If test.signif = TRUE, then getCrossAssociation additionally returns 
# significances
# filter.self.cor = TRUE filters self correlations
# p.adj.threshold can be used to filter those features that do not
# have any correlations whose p-value is lower than the threshold
result <- getCrossAssociation(
    mae[[1]], experiment2 = mae[[1]], method = "pearson",
    filter.self.cor = TRUE, p.adj.threshold = 0.05, test.signif = TRUE)
# Show first 5 entries
head(result, 5)
                                        
# Returned value is a list of matrices
names(result)

# Calculate Bray-Curtis dissimilarity between samples. If dataset includes
# paired samples, you can use paired = TRUE.
result <- getCrossAssociation(
    mae[[1]], mae[[1]], by = 2, paired = FALSE,
    association.fun = getDissimilarity, method = "bray")
                                        

# If experiments are equal and measure is symmetric
# (e.g., taxa1 vs taxa2 == taxa2 vs taxa1),
# it is possible to speed-up calculations by calculating association only
# for unique variable-pairs. Use "symmetric" to choose whether to measure
# association for only other half of of variable-pairs.
result <- getCrossAssociation(
    mae, experiment1 = "microbiota", experiment2 = "microbiota", 
    assay.type1 = "counts", assay.type2 = "counts", symmetric = TRUE)

# For big data sets, the calculations might take a long time.
# To speed them up, you can take a random sample from the data. 
# When dealing with complex biological problems, random samples can be 
# enough to describe the data. Here, our random sample is 30 % of whole data.
sample_size <- 0.3
tse <- mae[[1]]
tse_sub <- tse[ sample( seq_len( nrow(tse) ), sample_size * nrow(tse) ), ]
result <- getCrossAssociation(tse_sub)

# It is also possible to choose variables from colData and calculate
# association between assay and sample metadata or between variables of
# sample metadata
mae[[1]] <- addAlpha(mae[[1]])
# col.var works similarly to assay.type. Instead of fetching
# an assay named assay.type from assay slot, it fetches a column named
# col.var from colData.
result <- getCrossAssociation(
    mae[[1]], assay.type1 = "counts", 
    col.var2 = c("shannon_diversity", "coverage_diversity"),
    test.signif = TRUE)
                                        

microbiome/mia documentation built on Nov. 20, 2024, 1:12 a.m.