runAnalysis: Run an integrated co-occurrence analysis for a microbial...
In AlmaasLab/micInt: Find microbial interactions

runAnalysis

R Documentation

Run an integrated co-occurrence analysis for a microbial dataset

Description

Runs automated processing of the OTU table or phyloseq object, passes the jobs to ccrepe, and saves the results.

Usage

runAnalysis(
  OTU_table,
  abundance_cutoff = 1e-04,
  q_crit = 0.05,
  parallel = FALSE,
  ncpus = getOption("micInt.ncpus", 1L),
  cl = NULL,
  returnVariables = NULL,
  subset = NULL,
  sim.scores = NULL,
  file = FALSE,
  magnitude_factor = 10,
  prefix = NULL,
  metadataCols = c("OTU Id", "taxonomy"),
  postfix = "",
  renormalize = TRUE,
  iterations = 1000,
  ccrepe_args = list()
)

Arguments

`OTU_table`	The raw OTU table to be processed (a `data.frame` or a `phyloseq` `otu_table`), or an experiment-level `phyloseq` object containing the data (the latter is recommended). Note that taxonomy cannot be handled when a `phyloseq` `otu_table` is supplied.
`abundance_cutoff`	The mean abundance cutoff for the OTUs. If it is `NULL`, no filtering is performed.
`q_crit`	Numeric, the q-value cutoff used when constructing interaction tables.
`parallel`	Should the analysis be run in parallel?
`ncpus`	If `parallel = TRUE`, how many cores should be used? Defaults to the `micInt.ncpus` option, or one if that option is not set.
`cl`	Custom cluster to use if `parallel = TRUE`.
`returnVariables`	Which variables should the function return (character vector)? Available options are:
	`similarity_measures_significance`: The `interaction_table` of significant interactions
	`refined_table`: The processed OTU table
	`min_dataset`: The smallest non-zero entity in the refined table
	`taxonomy`: A named numeric containing the taxonomy of each OTU (collapsed into a single string)
	`outputargs`: A list (with an element for each similarity measure) containing the arguments to be passed to output_ccrepe_data for each similarity measure
	`common_outputargs`: Like `outputargs`, but these arguments stay the same for all similarity measures in order to avoid duplicates.
	In addition, all parameters of this function are available. Other internal variables found upon inspection of the source code may also be returned, but they are for advanced users only. If `NULL`, the variables listed in this section are returned in addition to an echo of the parameters. Note: If `OTU_table` is a `phyloseq` object, the returned variable is a data frame corresponding to the `phyloseq` object, because the object is internally converted into a data frame.
`subset`	Character, the subset of similarity measures to use, each denoted by its name in the list (not necessarily its string) returned from similarity_measures or from a similarity-measure-modifying function such as noisify. If `NULL`, all available measures will be used.
`sim.scores`	The similarity measures of class sim.measure to use. If it is `NULL`, all measures available in the package will be used (recommended for most purposes).
`file`	Should the tables of significant interactions be written to file? If so, they are written to `csv` files whose names contain the name of the similarity measure.
`magnitude_factor`	When making noisified functions, the magnitude of the noise will be this number multiplied with `min_dataset`
`prefix`	The prefix of the file names being written. Ignored if `file=FALSE`.
`metadataCols`	The names (character vector) or positions (integer vector) of the metadata columns to remove from the table before analyzing it. Ignored if a `phyloseq` object is supplied.
`postfix`	The postfix of the file names being written. Ignored if `file=FALSE`.
`renormalize`	Should the data be renormalized during the filtering process and permutation? Should be `TRUE` when used on relative abundances, but must be `FALSE` if absolute abundances are used.
`iterations`	Integer of length one, the number of iterations to run
`ccrepe_args`	A named list of custom arguments to `ccrepe` if it is necessary to fine-tune the workings. This argument list will override the effects of the other arguments.
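
How `abundance_cutoff`, `renormalize` and `magnitude_factor` relate to each other can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: `filter_by_mean_abundance` is a hypothetical helper, the mean is assumed to be taken per OTU across samples, the cutoff is assumed to be inclusive, and renormalization is assumed to rescale each sample to sum to one.

```r
# Hypothetical helper (not part of micInt): drops OTUs whose mean abundance
# across samples is below the cutoff and optionally renormalizes each sample.
filter_by_mean_abundance <- function(abund, abundance_cutoff = 1e-04,
                                     renormalize = TRUE) {
  abund <- as.matrix(abund)
  if (!is.null(abundance_cutoff)) {
    # Assumption: OTUs are in rows, so the mean is taken row-wise
    keep <- rowMeans(abund) >= abundance_cutoff
    abund <- abund[keep, , drop = FALSE]
  }
  if (renormalize) {
    # Assumption: renormalizing means each sample (column) sums to one again
    abund <- sweep(abund, 2, colSums(abund), "/")
  }
  abund
}

# Made-up relative abundances: three OTUs (rows) in two samples (columns)
toy <- matrix(c(0.6, 0.39995, 0.00005,
                0.5, 0.49990, 0.00010),
              nrow = 3,
              dimnames = list(c("OTU_1", "OTU_2", "OTU_3"), c("S1", "S2")))

filtered <- filter_by_mean_abundance(toy)

# min_dataset is documented as the smallest non-zero entity in the refined
# table; the noise magnitude is magnitude_factor times that value.
min_dataset <- min(filtered[filtered > 0])
magnitude_factor <- 10
noise_magnitude <- magnitude_factor * min_dataset
```

With absolute abundances the same helper would be called with `renormalize = FALSE`, mirroring the requirement stated for the `renormalize` argument.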

Details

If the function is told to write output files and no prefix is given, the csv files will all share a common prefix of the form: q_crit=(critical q-value)_cutoff=(the mean abundance cutoff)_magfac=(the magnitude factor), where all numbers are in scientific notation. The sim.score name follows, then the postfix, and finally the csv extension. The postfix is empty by default.
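
The naming scheme described above can be illustrated with a small base R sketch. `build_file_name` is a hypothetical helper, not a function of the package; the exact scientific-notation formatting and the underscore placed before the similarity measure name are assumptions made for readability.

```r
# Hypothetical helper (not part of micInt): assembles a file name following
# the documented pattern prefix + sim.score name + postfix + ".csv".
build_file_name <- function(sim_name, q_crit = 0.05, abundance_cutoff = 1e-04,
                            magnitude_factor = 10, prefix = NULL,
                            postfix = "") {
  if (is.null(prefix)) {
    # Assumption: base R's scientific formatting approximates the package's
    sci <- function(x) format(x, scientific = TRUE)
    prefix <- paste0("q_crit=", sci(q_crit),
                     "_cutoff=", sci(abundance_cutoff),
                     "_magfac=", sci(magnitude_factor))
  }
  # Assumption: an underscore separates the prefix from the measure name
  paste0(prefix, "_", sim_name, postfix, ".csv")
}

default_name <- build_file_name("spearman")
custom_name <- build_file_name("pearson", prefix = "run1", postfix = "_v2")
```

Supplying `prefix` replaces the automatically generated part entirely, which is convenient when several runs with identical settings would otherwise write to the same files.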

In order for an OTU-table to be valid when the argument OTU_table is a data.frame, the following criteria must hold:

The data points (samples) are in columns, the abundances for each OTU are in rows.
The rows may only hold OTU abundances.
There may be as many metadata columns as you like. However, they all need to be declared in the metadataCols argument, and the column taxonomy has to be there in order for the output file to contain the taxonomy.
The row names of the table are the OTU names and the column names are the sample names

For phyloseq objects (both experiment level and otu_table), you do not need to care about this; it is handled automatically.
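
As an illustration of the data.frame criteria, the following base R sketch builds a toy table in the expected layout and checks it. All values and taxonomy strings are made up, and the checks are only an approximation of what a valid input looks like, not the package's own validation.

```r
# Toy OTU table: OTUs in rows, samples in columns, plus two metadata columns.
# check.names = FALSE keeps the space in the "OTU Id" column name.
otu <- data.frame(
  `OTU Id` = c("OTU_1", "OTU_2", "OTU_3"),
  sample_A = c(0.50, 0.30, 0.20),
  sample_B = c(0.10, 0.60, 0.30),
  sample_C = c(0.25, 0.25, 0.50),
  taxonomy = c("Bacteria; Proteobacteria",
               "Bacteria; Bacteroidetes",
               "Archaea; Euryarchaeota"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Row names are the OTU names, column names are the sample names
rownames(otu) <- otu[["OTU Id"]]

# Every metadata column must be declared; these are the documented defaults
metadataCols <- c("OTU Id", "taxonomy")
stopifnot(all(metadataCols %in% colnames(otu)))

# After removing the declared metadata columns, only abundances may remain
abundances <- otu[, setdiff(colnames(otu), metadataCols), drop = FALSE]
stopifnot(all(vapply(abundances, is.numeric, logical(1))))
```

A table laid out like `otu` could then be passed as `OTU_table` together with the matching `metadataCols` value.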

Value

A list of the variables requested from the parameter returnVariables.

Examples

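library(micInt)
# Load the seawater example dataset
data(seawater)
# Restrict the analysis to the Spearman and Pearson similarity measures
sim.scores <- similarity_measures(subset = c("spearman", "pearson"))
# Run in parallel on two cores with fewer iterations than the default of 1000
runAnalysis(OTU_table = seawater, sim.scores = sim.scores, parallel = TRUE, ncpus = 2,
            iterations = 100)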

AlmaasLab/micInt documentation built on April 1, 2022, 10:37 a.m.