slsd | R Documentation |
In the low-microbiome biomass setting, real microbes also exhibit
proportional numbers of total k-mers, unique k-mers, and total assigned
sequencing reads across samples; i.e., the following three Spearman
correlations are significant when tested on the sample-level data provided
in Kraken reports: cor(minimizer_len, minimizer_n_unique),
cor(minimizer_len, total_reads), and cor(total_reads, minimizer_n_unique)
(r1>0 & r2>0 & r3>0 & p1<0.05 & p2<0.05 & p3<0.05).
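The criterion above can be illustrated for a single taxon with base R. This is only a minimal sketch under stated assumptions, not the package's implementation: the data frame `taxon` holds made-up per-sample values, and `spearman_pair()` is a hypothetical helper introduced here for illustration.

```r
# Toy per-sample values for ONE taxon (made-up numbers, not real data).
taxon <- data.frame(
  minimizer_len      = c(10, 25, 40, 80, 120, 200),
  minimizer_n_unique = c(8, 20, 31, 60, 95, 150),
  total_reads        = c(3, 7, 12, 22, 35, 61)
)

# Hypothetical helper: Spearman rho and p-value for one pair of vectors.
spearman_pair <- function(x, y) {
  ct <- stats::cor.test(x, y, method = "spearman")
  c(r = unname(ct$estimate), p = ct$p.value)
}

s1 <- spearman_pair(taxon$minimizer_len, taxon$minimizer_n_unique) # r1, p1
s2 <- spearman_pair(taxon$minimizer_len, taxon$total_reads)        # r2, p2
s3 <- spearman_pair(taxon$total_reads, taxon$minimizer_n_unique)   # r3, p3

# All three correlations must be positive and significant at 0.05.
is_real <- all(c(s1[["r"]], s2[["r"]], s3[["r"]]) > 0) &&
  all(c(s1[["p"]], s2[["p"]], s3[["p"]]) < 0.05)
```

Because the toy columns rise together without ties, all three tests agree and `is_real` is `TRUE`; with real Kraken report data the three tests can disagree, in which case the taxon is not retained.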
slsd(
kreports,
method = "spearman",
...,
min_reads = 3L,
min_minimizer_n_unique = 3L,
min_number = 3L
)
kreports |
kreports data returned by |
method |
A character string indicating which correlation coefficient is to be used for the test. One of "pearson", "kendall", or "spearman", can be abbreviated. |
... |
Other arguments passed to cor.test. |
min_reads |
An integer, the minimum number of total reads used to filter
taxa. SAHMI use |
min_minimizer_n_unique |
An integer, the minimum number of unique
minimizers used to filter taxa. SAHMI use |
min_number |
An integer, the minimum number of samples per taxid. SAHMI
use |
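One plausible reading of the three thresholds is sketched below in base R. It is an assumption, not the package's code, that the cut-offs are inclusive (`>=`) and that the read and minimizer filters are applied before samples are counted per taxid. The table `reports` and its values are made up for illustration.

```r
# Toy long-format table: one row per (sample, taxid); values are made up.
reports <- data.frame(
  sample             = c("s1", "s2", "s3", "s4", "s1", "s2", "s1", "s2", "s3"),
  taxid              = c(562, 562, 562, 562, 1280, 1280, 287, 287, 287),
  total_reads        = c(5, 9, 4, 12, 10, 20, 1, 8, 6),
  minimizer_n_unique = c(4, 7, 3, 9, 8, 15, 1, 5, 2)
)

min_reads <- 3L
min_minimizer_n_unique <- 3L
min_number <- 3L

# Assumption: inclusive thresholds on reads and unique minimizers.
keep_rows <- reports$total_reads >= min_reads &
  reports$minimizer_n_unique >= min_minimizer_n_unique
filtered <- reports[keep_rows, ]

# Assumption: a taxid is tested only if enough samples survive the row filter.
n_samples <- table(filtered$taxid)
kept_taxids <- as.integer(names(n_samples)[n_samples >= min_number])
```

In this toy table only taxid 562 keeps at least three samples after the row filter, so it is the only taxid that would go on to the correlation tests.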
A polars DataFrame of correlation coefficients and p-values for
cor(minimizer_len, minimizer_n_unique) (r1 and p1),
cor(minimizer_len, total_reads) (r2 and p2), and
cor(total_reads, minimizer_n_unique) (r3 and p3).
## Not run:
# `sahmi_datasets` should be the output of all samples from `prep_dataset()`
# Store the result under a new name so the `slsd()` function is not shadowed
slsd_res <- slsd(lapply(sahmi_datasets, `[[`, "kreport"))
real_taxids_slsd <- slsd_res$filter(
pl$col("r1")$gt(0),
pl$col("r2")$gt(0),
pl$col("r3")$gt(0),
pl$col("p1")$lt(0.05),
pl$col("p2")$lt(0.05),
pl$col("p3")$lt(0.05)
)$get_column("taxid")
## End(Not run)