plot_alpha_diversity: Generates relative abundance plots per feature annotated by...

View source: R/plot_alpha_diversity.R

plot_alpha_diversity    R Documentation

Generates relative abundance plots per feature annotated by the metadata using as input a SummarizedExperiment object

Description

Generates relative abundance plots per feature annotated by the metadata using as input a SummarizedExperiment object

Usage

plot_alpha_diversity(
  ExpObj = NULL,
  measures = c("Observed", "Chao1", "Shannon", "Simpson", "InvSimpson", "GeneCount"),
  stratify_by_kingdoms = TRUE,
  glomby = NULL,
  samplesToKeep = NULL,
  featuresToKeep = NULL,
  subsetby = NULL,
  compareby = NULL,
  compareby_order = NULL,
  colourby = NULL,
  shapeby = NULL,
  fillby = NULL,
  pairby = NULL,
  connectby = NULL,
  facetby = NULL,
  wrap_facet = FALSE,
  overlay_boxplot = FALSE,
  applyfilters = "light",
  featcutoff = NULL,
  GenomeCompletenessCutoff = NULL,
  PPM_normalize_to_bases_sequenced = FALSE,
  cdict = NULL,
  addtit = NULL,
  signiflabel = "p.format",
  max_pairwise_cats = 4,
  ignoreunclassified = TRUE,
  class_to_ignore = "N_A",
  returnstats = FALSE,
  ...
)

Arguments

ExpObj

JAMS-style SummarizedExperiment object

measures

Character vector giving the alpha diversity measures used to quantify the alpha diversity within each category. The default includes Observed, Chao1, Shannon, Simpson, InvSimpson, and GeneCount.
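
As a rough illustration of what some of these measures quantify, the sketch below computes Observed, Shannon, Simpson and InvSimpson for a single sample using their textbook definitions. The function name alpha_sketch and the example counts are hypothetical, and this is not necessarily how JAMS computes the values internally. Chao1 and GeneCount are left out.

```r
# Textbook definitions only; JAMS may use different conventions internally.
# alpha_sketch() is a hypothetical helper, not part of the JAMS package.
alpha_sketch <- function(counts) {
  counts <- counts[counts > 0]        # ignore features absent from the sample
  p <- counts / sum(counts)           # proportion of each feature
  c(
    Observed   = length(counts),      # number of features detected
    Shannon    = -sum(p * log(p)),    # Shannon entropy (natural log)
    Simpson    = 1 - sum(p^2),        # Gini-Simpson index
    InvSimpson = 1 / sum(p^2)         # inverse Simpson index
  )
}

even_sample   <- c(10, 10, 10, 10)    # four equally abundant features
skewed_sample <- c(37, 1, 1, 1)       # one dominant feature

alpha_even <- alpha_sketch(even_sample)
alpha_skew <- alpha_sketch(skewed_sample)
print(round(rbind(even = alpha_even, skewed = alpha_skew), 3))
```

Both samples have the same Observed richness, but the evenly distributed sample scores higher on the Shannon, Simpson and InvSimpson measures.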

stratify_by_kingdoms

Requires a logical value. If TRUE, will concatenate all of the taxonomical features to the kingdom level and create individual plots for each of the kingdoms for each measure specified. If FALSE, will forgo the kingdom concatenation. Default is TRUE.

glomby

String giving the taxonomic level at which to agglomerate counts. This argument should only be used with taxonomic SummarizedExperiment objects. When NULL (the default), there is no agglomeration.

samplesToKeep

Vector with sample names to keep. If NULL, all samples within the SummarizedExperiment object are kept. Default is NULL.

featuresToKeep

Vector with feature names to keep. If NULL, all features within the SummarizedExperiment object are kept. Default is NULL. Please note that when agglomerating features with the glomby argument (see above), feature names passed to featuresToKeep must be post-agglomeration feature names. For example, if glomby="Family", featuresToKeep must be family names, such as "f__Enterobacteriaceae", etc.

subsetby

String specifying the metadata variable name for subsetting samples. If passed, multiple plots will be drawn, one plot for samples within each different class contained within the variable. If NULL, data is not subset. Default is NULL.

compareby

String specifying the metadata variable name for grouping samples. This will define which metadata variable grouping to calculate PERMANOVA p-value. If not specified, and argument permanova is set to TRUE, (see permanova), the compareby argument will be set by colourby or shapeby. If these latter two are also NULL, and permanova is TRUE, permanova will be set to FALSE. Default is NULL.

compareby_order

String or vector specifying the order in which to compare by, if this order is different than alphabetical order of the compareby parameter. Default is NULL.

colourby

String specifying the metadata variable name for colouring the lines of the boxes for the samples. If NULL, all samples will be black. Default is NULL.

shapeby

String specifying the metadata variable name for attributing shapes to samples. If NULL, all samples will be a round dot (pch = 19). Default is NULL. If there are more than 27 classes within the variable, samples will be attributed letters (A-Z, then a-z) automatically.

fillby

String specifying the metadata variable with which to colour/fill in the boxes. If NULL, all boxes will be filled in white. Default is NULL.

pairby

.

connectby

String specifying the metadata variable name for drawing a line connecting samples belonging to the same class. If NULL, samples are not connected. Default is NULL.

facetby

.

wrap_facet

.

overlay_boxplot

Requires a logical value. If FALSE, will not overlay the boxplots on top of one another. If TRUE, the boxplots will all be plotted, one on top of another. Default is FALSE.

applyfilters

Optional string specifying filtration setting "combos", used as a shorthand for setting the featcutoff, GenomeCompletenessCutoff, minl2fc and minabscorrcoeff arguments in JAMS plotting functions. If NULL, none of these arguments are set unless specified individually. Permissible values for applyfilters are "light", "moderate" or "stringent". The actual values vary depending on whether the SummarizedExperiment object is taxonomical (LKT) or not. For a taxonomical SummarizedExperiment object, using "light" will set featcutoff=c(50, 5), GenomeCompletenessCutoff=c(5, 5), minl2fc=1, minabscorrcoeff=0.4; using "moderate" will set featcutoff=c(250, 15), GenomeCompletenessCutoff=c(10, 5), minl2fc=1, minabscorrcoeff=0.6; and using "stringent" will set featcutoff=c(2000, 15), GenomeCompletenessCutoff=c(30, 10), minl2fc=2, minabscorrcoeff=0.8. For non-taxonomical (i.e. functional) SummarizedExperiment objects, using "light" will set featcutoff=c(0, 0), minl2fc=1, minabscorrcoeff=0.4; using "moderate" will set featcutoff=c(5, 5), minl2fc=1, minabscorrcoeff=0.6; and using "stringent" will set featcutoff=c(50, 15), minl2fc=2.5, minabscorrcoeff=0.8. When using applyfilters, one can still set one or more of featcutoff, GenomeCompletenessCutoff, minl2fc and minabscorrcoeff, which will then take the user-set value in lieu of those set by the applyfilters shorthand. Default is "light".
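
The presets and the override rule described for applyfilters can be summarised as a lookup table. In the sketch below the numeric values are copied from that description. The list layout, the "taxonomical"/"functional" keys and the resolve_filters helper are hypothetical and only illustrate the precedence of user-set values; they are not the JAMS implementation.

```r
# Preset values copied from the applyfilters description.
# filter_presets and resolve_filters() are hypothetical, for illustration only.
filter_presets <- list(
  taxonomical = list(
    light     = list(featcutoff = c(50, 5),    GenomeCompletenessCutoff = c(5, 5),
                     minl2fc = 1, minabscorrcoeff = 0.4),
    moderate  = list(featcutoff = c(250, 15),  GenomeCompletenessCutoff = c(10, 5),
                     minl2fc = 1, minabscorrcoeff = 0.6),
    stringent = list(featcutoff = c(2000, 15), GenomeCompletenessCutoff = c(30, 10),
                     minl2fc = 2, minabscorrcoeff = 0.8)
  ),
  functional = list(
    light     = list(featcutoff = c(0, 0),   minl2fc = 1,   minabscorrcoeff = 0.4),
    moderate  = list(featcutoff = c(5, 5),   minl2fc = 1,   minabscorrcoeff = 0.6),
    stringent = list(featcutoff = c(50, 15), minl2fc = 2.5, minabscorrcoeff = 0.8)
  )
)

resolve_filters <- function(applyfilters, object_type, ...) {
  # With applyfilters = NULL, no preset is applied
  settings <- if (is.null(applyfilters)) {
    list()
  } else {
    filter_presets[[object_type]][[applyfilters]]
  }
  # Arguments the user set explicitly (non-NULL) replace the preset values
  user_set <- list(...)
  user_set <- user_set[!vapply(user_set, is.null, logical(1))]
  utils::modifyList(settings, user_set)
}

# "moderate" preset for a taxonomical object, with featcutoff overridden
resolved <- resolve_filters("moderate", "taxonomical", featcutoff = c(100, 10))
str(resolved)
```

Here featcutoff takes the user-set value c(100, 10), while GenomeCompletenessCutoff, minl2fc and minabscorrcoeff keep their "moderate" preset values.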

featcutoff

Requires a numeric vector of length 2 for specifying how to filter out features by relative abundance. The first value of the vector specifies the minimum relative abundance in Parts per Million (PPM) and the second value is the percentage of samples which must have at least that relative abundance. Thus, passing c(250, 10) to featcutoff would filter out any feature which does not have at least 250 PPM (= 0.025 percent) of relative abundance in at least 10 percent of all samples being plotted. Please note that when using the subsetby option (q.v.) to automatically plot multiple plots of sample subsets, the featcutoff parameters are applied within the subset. The default is c(0, 0), meaning no feature is filtered. If NULL is passed, then the value defaults to c(0, 0). See also applyfilters for a shorthand way of applying multiple filtration settings.
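
The featcutoff rule can be sketched in base R as follows. The helper passes_featcutoff and the PPM matrix are hypothetical and only illustrate the rule as described; the actual JAMS filtering code may differ in its details.

```r
# passes_featcutoff() is a hypothetical helper illustrating the described rule.
# ppm: features x samples matrix of relative abundances in PPM.
passes_featcutoff <- function(ppm, featcutoff = c(0, 0)) {
  if (is.null(featcutoff)) featcutoff <- c(0, 0)       # NULL behaves as c(0, 0)
  # Percentage of samples in which each feature reaches the minimum PPM
  pct_samples <- 100 * rowMeans(ppm >= featcutoff[1])
  pct_samples >= featcutoff[2]
}

# Made-up relative abundances for three features in four samples
ppm <- rbind(
  feat_A = c(300, 260,   0, 500),  # >= 250 PPM in 3 of 4 samples (75 percent)
  feat_B = c(  0,   0, 400,   0),  # >= 250 PPM in 1 of 4 samples (25 percent)
  feat_C = c(100, 200, 249,  10)   # never reaches 250 PPM
)
colnames(ppm) <- paste0("sample", 1:4)

# Keep features with at least 250 PPM in at least 50 percent of samples
keep <- passes_featcutoff(ppm, featcutoff = c(250, 50))
filtered <- ppm[keep, , drop = FALSE]
print(filtered)
```

With featcutoff = c(250, 50) only feat_A is kept, and with c(0, 0) or NULL no feature is filtered.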

GenomeCompletenessCutoff

Requires a numeric vector of length 2 for specifying how to filter out features by genome completeness. This is, of course, only applicable for taxonomic shotgun SummarizedExperiment objects. When passed to non-taxonomic shotgun SummarizedExperiment objects, GenomeCompletenessCutoff will be ignored. The first value of the vector specifies the minimum genome completeness in percentage and the second value is the percentage of samples which must have at least that genome completeness. Thus, passing c(50, 5) to GenomeCompletenessCutoff would filter out any taxonomic feature which does not have at least 50 percent of genome completeness in at least 5 percent of all samples being plotted. Please note that when using the subsetby option (q.v.) to automatically plot multiple plots of sample subsets, the GenomeCompletenessCutoff parameters are applied within the subset. The default is c(0, 0), meaning no feature is filtered. If NULL is passed, then the value defaults to c(0, 0). See also applyfilters for a shorthand way of applying multiple filtration settings.

PPM_normalize_to_bases_sequenced

Requires a logical value. Non-filtered JAMS feature counts tables (the BaseCounts assay within SummarizedExperiment objects) always include unclassified taxonomical features (for taxonomical SummarizedExperiment objects) or unknown/unattributed functional features (for non-taxonomical SummarizedExperiment objects), so the relative abundance for each feature (see normalization) will be calculated in Parts per Million (PPM) by dividing the number of bases covering each feature by the sum of each sample column **prior to any filtration**. Relative abundances are thus representative of the entirety of the genomic content for taxonomical objects, whereas for non-taxonomical objects, strictly speaking, it is the abundance of each feature relative to only the coding regions present in the metagenome, even if these are annotationally unattributed. In other words, intergenic regions are not taken into account. In order to relative-abundance-normalize a **non-taxonomical** SummarizedExperiment object with the total genomic sequencing content, including non-coding regions, set PPM_normalize_to_bases_sequenced = TRUE. Default is FALSE.
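
The two normalization denominators can be sketched in base R as follows. The counts matrix and the total_bases vector are made up for illustration, and where JAMS obtains the total number of bases sequenced per sample is not stated here, so total_bases should be read as an assumption.

```r
# Made-up base counts (features x samples), including the unclassified feature
base_counts <- rbind(
  LKT__A            = c(600, 100),
  LKT__B            = c(300, 400),
  LKT__Unclassified = c(100, 500)
)
colnames(base_counts) <- c("s1", "s2")

# Default behaviour as described: divide each feature by its sample's column
# sum, taken before any filtration, and scale to Parts per Million
ppm <- sweep(base_counts, 2, colSums(base_counts), "/") * 1e6

# With PPM_normalize_to_bases_sequenced = TRUE the denominator would instead be
# the total bases sequenced per sample (hypothetical values, assumed known)
total_bases <- c(s1 = 2000, s2 = 4000)
ppm_total <- sweep(base_counts, 2, total_bases, "/") * 1e6

print(ppm)
print(ppm_total)
```

Because the column sums are taken before filtering, removing features afterwards does not change the PPM values of the features that remain.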

addtit

Optional string with text to append to heatmap main title. Default is NULL.

signiflabel

String specifying the label used to determine if comparisons are significant or not. Default is p.format.

max_pairwise_cats

Numerical value specifying the maximum number of categories to be plotted. Default is 4. This means that if you give a category that would require more than four different boxes, it will not be plotted.

ignoreunclassified

Requires a logical value. If set to TRUE, for taxonomical SummarizedExperiment objects, the feature "LKT__Unclassified" will be omitted from being shown. In the case of non-taxonomical SummarizedExperiment objects, the completely unannotated features will be omitted. For example, for an ECNumber SummarizedExperiment object, genes *without* an Enzyme Commission Number annotation (feature "EC_none") will not be shown. Statistics are, however, computed taking the completely unclassified feature into account, so p-values will not change.

class_to_ignore

String or vector specifying any classes which should lead to samples being excluded from the comparison within the variable passed to compareby. Default is N_A. This means that within any metadata variable passed to compareby containing the "N_A" string within that specific variable, the sample will be dropped from that comparison.
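
The effect of class_to_ignore can be sketched on a metadata table as follows. The data frame, its column names and the helper drop_ignored are hypothetical. The sketch assumes an exact match between a sample's class and the strings in class_to_ignore, which may not be exactly how JAMS matches them.

```r
# Made-up sample metadata; "N_A" marks a sample with no Treatment class
metadata <- data.frame(
  Sample    = c("s1", "s2", "s3", "s4"),
  Treatment = c("Control", "Drug", "N_A", "Drug"),
  stringsAsFactors = FALSE
)

# drop_ignored() is a hypothetical helper: it removes samples whose class in
# the compareby variable exactly matches one of the classes to ignore
drop_ignored <- function(metadata, compareby, class_to_ignore = "N_A") {
  metadata[!(metadata[[compareby]] %in% class_to_ignore), , drop = FALSE]
}

kept <- drop_ignored(metadata, compareby = "Treatment")
print(kept)
```

Sample s3 is dropped from the Treatment comparison only. It would still take part in a comparison on another variable in which it has a valid class.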


johnmcculloch/JAMS_BW documentation built on March 29, 2024, 7:56 p.m.