secom_linear  R Documentation 
Obtain the sparse correlation matrix for linear correlations
between taxa. The current version of the secom_linear function
supports any one of three correlation coefficients: Pearson, Spearman, and
Kendall's τ.
secom_linear(
  data,
  assay_name = "counts",
  tax_level = NULL,
  pseudo = 0,
  prv_cut = 0.5,
  lib_cut = 1000,
  corr_cut = 0.5,
  wins_quant = c(0.05, 0.95),
  method = c("pearson", "kendall", "spearman"),
  soft = FALSE,
  thresh_len = 100,
  n_cv = 10,
  thresh_hard = 0,
  max_p = 0.005,
  n_cl = 1
)
data 
a list of the input data. Each element of the list can be a

assay_name 
character. Name of the count table in the data object
(only applicable if data object is a 
tax_level 
character. The taxonomic level of interest. The input data
can be agglomerated at different taxonomic levels based on your research
interest. Default is NULL, i.e., do not perform agglomeration, and the
SECOM analysis will be performed at the lowest taxonomic level of the
input 
pseudo 
numeric. Add pseudocounts to the data. Default is 0 (no pseudocounts). 
prv_cut 
a numerical fraction between 0 and 1. Taxa with prevalences
less than 
lib_cut 
a numerical threshold for filtering samples based on library
sizes. Samples with library sizes less than 
corr_cut 
numeric. To prevent false positives due to taxa with
small variances, taxa with Pearson correlation coefficients greater than

wins_quant 
a numeric vector of probabilities with values between
0 and 1. Replace extreme values in the abundance data with less
extreme values. Default is c(0.05, 0.95).
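The winsorization that wins_quant describes can be pictured with a small base R sketch. This is only an illustration under assumptions: the helper name winsorize and the toy vector are hypothetical, and the package may apply the quantiles differently (for example per taxon on transformed abundances).

```r
# Toy abundance values for a single taxon (hypothetical data).
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1000)

# Hypothetical helper, not part of ANCOMBC: clamp values to the
# lower and upper quantiles given in `probs`.
winsorize <- function(x, probs = c(0.05, 0.95)) {
  q <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

x_wins <- winsorize(x)

# The extreme value 1000 is pulled down to the 95% quantile;
# values inside the two quantiles are left unchanged.
range(x_wins)
```

Values are replaced rather than dropped, so the number of observations stays the same.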
method 
character. It indicates which correlation coefficient is to be computed. One of "pearson", "kendall", or "spearman": can be abbreviated. 
soft 
logical. 
thresh_len 
numeric. Grid search is implemented to find the optimal
values over 
n_cv 
numeric. The number of folds in cross-validation. Default is 10 (10-fold cross-validation). 
thresh_hard 
numeric. Set a hard threshold for the correlation matrix.
Pairwise correlation coefficient (in its absolute value) less than or equal
to 
max_p 
numeric. Obtain the sparse correlation matrix by
p-value filtering. Pairwise correlation coefficient with p-value greater than

n_cl 
numeric. The number of nodes to be forked. For details, see

a list with components:
s_diff_hat, a numeric vector of estimated sample-specific biases.
y_hat, a matrix of bias-corrected abundances.
cv_error, a numeric vector of cross-validation error estimates, which are the Frobenius norm differences between correlation matrices using training set and validation set, respectively.
thresh_grid, a numeric vector of thresholds in the cross-validation.
thresh_opt, numeric. The optimal threshold through cross-validation.
mat_cooccur, a matrix of taxon-taxon co-occurrence pattern. The number in each cell represents the number of complete (nonzero) samples for the corresponding pair of taxa.
corr, the sample correlation matrix (using the measure specified in method) computed using the bias-corrected abundances y_hat.
corr_p, the p-value matrix corresponding to the sample correlation matrix corr.
corr_th, the sparse correlation matrix obtained by thresholding based on the method specified in soft.
corr_fl, the sparse correlation matrix obtained by p-value filtering based on the cutoff specified in max_p.
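To make the difference between a thresholded and a p-value-filtered correlation matrix concrete, here is a minimal base R sketch on simulated data. It is not the package's internal code: the toy matrix y, the use of cor.test(), and the choice to set removed entries to 0 are assumptions made for illustration, and the cross-validated choice of threshold is not reproduced.

```r
set.seed(1)

# Toy data: 30 samples x 4 hypothetical taxa; A and B are correlated by construction.
n <- 30
a <- rnorm(n)
y <- cbind(A = a, B = a + rnorm(n, sd = 0.3), C = rnorm(n), D = rnorm(n))

# Sample correlation matrix.
corr <- cor(y, method = "pearson")

# Pairwise p-values from cor.test(); the diagonal is left at 0.
p <- ncol(y)
corr_p <- matrix(0, p, p, dimnames = dimnames(corr))
for (i in seq_len(p - 1)) {
  for (j in (i + 1):p) {
    pv <- cor.test(y[, i], y[, j], method = "pearson")$p.value
    corr_p[i, j] <- pv
    corr_p[j, i] <- pv
  }
}

thresh_hard <- 0.3   # illustrative cutoff on |correlation|
max_p <- 0.005       # illustrative cutoff on the p-value

# Hard thresholding: zero out entries whose absolute value is <= thresh_hard (assumed behavior).
corr_th <- corr
corr_th[abs(corr_th) <= thresh_hard] <- 0

# P-value filtering: zero out entries whose p-value is > max_p (assumed behavior).
corr_fl <- corr
corr_fl[corr_p > max_p] <- 0
```

Both results keep the shape and symmetry of corr; they differ only in which off-diagonal entries survive.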
Huang Lin
secom_dist
library(ANCOMBC)
data(dietswap)

# subset to baseline
tse = dietswap[, dietswap$timepoint == 1]

set.seed(123)
res_linear = secom_linear(data = list(tse), assay_name = "counts",
                          tax_level = "Phylum", pseudo = 0,
                          prv_cut = 0.5, lib_cut = 1000, corr_cut = 0.5,
                          wins_quant = c(0.05, 0.95), method = "pearson",
                          soft = FALSE, thresh_len = 20, n_cv = 10,
                          thresh_hard = 0.3, max_p = 0.005, n_cl = 2)

corr_th = res_linear$corr_th
corr_fl = res_linear$corr_fl
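A sparse correlation matrix such as corr_fl is often easier to inspect as a list of taxon pairs. The sketch below shows one way to do that in base R. It uses a small hand-made matrix as a stand-in for res_linear$corr_fl so that it runs without the package; the taxon names and values are hypothetical, not results from the dietswap data.

```r
# Toy stand-in for res_linear$corr_fl (hypothetical names and values).
taxa <- c("Bacteroidetes", "Firmicutes", "Proteobacteria")
corr_fl <- matrix(c(1,    0.62,  0,
                    0.62, 1,    -0.41,
                    0,   -0.41,  1),
                  nrow = 3, byrow = TRUE, dimnames = list(taxa, taxa))

# Keep each unordered pair once (upper triangle) and drop zeroed entries.
idx <- which(upper.tri(corr_fl) & corr_fl != 0, arr.ind = TRUE)

edges <- data.frame(
  taxon1 = rownames(corr_fl)[idx[, "row"]],
  taxon2 = colnames(corr_fl)[idx[, "col"]],
  corr   = corr_fl[idx],
  stringsAsFactors = FALSE
)
edges
```

Using only the upper triangle avoids listing each pair twice and skips the diagonal.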