View source: R/plot_alpha_diversity.R
plot_alpha_diversity | R Documentation |
Generates alpha diversity plots annotated by the metadata, using a SummarizedExperiment object as input.
plot_alpha_diversity(
ExpObj = NULL,
measures = c("Observed", "Chao1", "Shannon", "Simpson", "InvSimpson", "GeneCount"),
stratify_by_kingdoms = TRUE,
glomby = NULL,
samplesToKeep = NULL,
featuresToKeep = NULL,
subsetby = NULL,
compareby = NULL,
compareby_order = NULL,
colourby = NULL,
shapeby = NULL,
fillby = NULL,
pairby = NULL,
connectby = NULL,
facetby = NULL,
wrap_facet = FALSE,
overlay_boxplot = FALSE,
applyfilters = "light",
featcutoff = NULL,
GenomeCompletenessCutoff = NULL,
PPM_normalize_to_bases_sequenced = FALSE,
cdict = NULL,
addtit = NULL,
signiflabel = "p.format",
max_pairwise_cats = 4,
ignoreunclassified = TRUE,
class_to_ignore = "N_A",
returnstats = FALSE,
...
)
ExpObj |
JAMS-style SummarizedExperiment object |
measures |
Character vector giving the alpha diversity measures used to quantify the alpha diversity within each category. Default includes Observed, Chao1, Shannon, Simpson, InvSimpson, and GeneCount. |
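To make the measure names concrete, here is a minimal base R sketch of the textbook formulas for four of them, computed on a hypothetical count vector. It is an illustration only: the source does not say how JAMS computes these values internally, `alpha_measures()` is a made-up helper, and Chao1 and GeneCount are left out.

```r
# Hypothetical counts for one sample (feature -> count); not JAMS data.
counts <- c(taxonA = 50, taxonB = 30, taxonC = 20, taxonD = 0)

# alpha_measures() is an illustrative helper, not part of JAMS.
alpha_measures <- function(x) {
  x <- x[x > 0]     # features absent from the sample do not contribute
  p <- x / sum(x)   # relative abundance of each feature that is present
  c(
    Observed   = length(x),          # richness: number of features present
    Shannon    = -sum(p * log(p)),   # Shannon entropy, natural logarithm
    Simpson    = 1 - sum(p^2),       # Gini-Simpson index
    InvSimpson = 1 / sum(p^2)        # inverse Simpson index
  )
}

res <- alpha_measures(counts)
print(round(res, 3))
```

Higher values of each measure indicate a more diverse sample; Observed counts features only, whereas the other three also reflect how evenly the counts are spread.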
stratify_by_kingdoms |
Requires a logical value. If TRUE, will concatenate all of the taxonomical features to the kingdom level and create individual plots for each of the kingdoms for each measure specified. If FALSE, will forgo the kingdom concatenation. Default is TRUE. |
glomby |
String giving the taxonomic level at which to agglomerate counts. This argument should only be used with taxonomic SummarizedExperiment objects. When NULL (the default), there is no agglomeration. |
samplesToKeep |
Vector with sample names to keep. If NULL, all samples within the SummarizedExperiment object are kept. Default is NULL. |
featuresToKeep |
Vector with feature names to keep. If NULL, all features within the SummarizedExperiment object are kept. Default is NULL. Please note that when agglomerating features with the glomby argument (see above), feature names passed to featuresToKeep must be post-agglomeration feature names. For example, if glomby="Family", featuresToKeep must be family names, such as "f__Enterobacteriaceae", etc. |
subsetby |
String specifying the metadata variable name for subsetting samples. If passed, multiple plots will be drawn, one plot for samples within each different class contained within the variable. If NULL, data is not subset. Default is NULL. |
compareby |
String specifying the metadata variable name for grouping samples. This will define which metadata variable grouping to calculate PERMANOVA p-value. If not specified, and argument permanova is set to TRUE, (see permanova), the compareby argument will be set by colourby or shapeby. If these latter two are also NULL, and permanova is TRUE, permanova will be set to FALSE. Default is NULL. |
compareby_order |
String or vector specifying the order in which to compare by, if this order is different than alphabetical order of the compareby parameter. Default is NULL. |
colourby |
String specifying the metadata variable name for colouring the lines of the boxes for the samples. If NULL, all samples will be black. Default is NULL. |
shapeby |
String specifying the metadata variable name for attributing shapes to samples. If NULL, all samples will be a round dot (pch = 19). Default is NULL. If there are more than 27 classes within the variable, samples will be attributed letters (A-Z, then a-z) automatically. |
fillby |
String specifying the metadata variable with which to colour/fill in the boxes with. If NULL, all boxes will be filled in white. Default is NULL. |
pairby |
. |
connectby |
String specifying the metadata variable name for drawing a line connecting samples belonging to the same class. If NULL, samples are not connected. Default is NULL. |
facetby |
. |
wrap_facet |
. |
overlay_boxplot |
Requires a logical value. If TRUE, the boxplots are all plotted on top of one another. If FALSE, the boxplots are not overlaid. Default is FALSE. |
applyfilters |
Optional string specifying a filtration preset, used as shorthand for setting the featcutoff, GenomeCompletenessCutoff, minl2fc and minabscorrcoeff arguments in JAMS plotting functions. If NULL, none of these arguments is set unless passed explicitly. Permissible values for applyfilters are "light", "moderate" or "stringent". The actual values depend on whether the SummarizedExperiment object is taxonomical (LKT) or not. For a taxonomical SummarizedExperiment object, using "light" will set featcutoff=c(50, 5), GenomeCompletenessCutoff=c(5, 5), minl2fc=1, minabscorrcoeff=0.4; using "moderate" will set featcutoff=c(250, 15), GenomeCompletenessCutoff=c(10, 5), minl2fc=1, minabscorrcoeff=0.6; and using "stringent" will set featcutoff=c(2000, 15), GenomeCompletenessCutoff=c(30, 10), minl2fc=2, minabscorrcoeff=0.8. For non-taxonomical (i.e. functional) SummarizedExperiment objects, using "light" will set featcutoff=c(0, 0), minl2fc=1, minabscorrcoeff=0.4; using "moderate" will set featcutoff=c(5, 5), minl2fc=1, minabscorrcoeff=0.6; and using "stringent" will set featcutoff=c(50, 15), minl2fc=2.5, minabscorrcoeff=0.8. When using applyfilters, one can still set one or more of featcutoff, GenomeCompletenessCutoff, minl2fc and minabscorrcoeff; any value set by the user takes precedence over the one set by the applyfilters shorthand. Default is "light". |
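The precedence rule for applyfilters can be sketched in base R as a preset lookup followed by user overrides. The preset values are transcribed from the description above for taxonomical objects; the list layout and the helper `resolve_filters()` are hypothetical and are not taken from the JAMS source.

```r
# Presets for taxonomical (LKT) objects, as listed in the applyfilters
# description. The data structure itself is illustrative only.
taxonomic_presets <- list(
  light = list(featcutoff = c(50, 5), GenomeCompletenessCutoff = c(5, 5),
               minl2fc = 1, minabscorrcoeff = 0.4),
  moderate = list(featcutoff = c(250, 15), GenomeCompletenessCutoff = c(10, 5),
                  minl2fc = 1, minabscorrcoeff = 0.6),
  stringent = list(featcutoff = c(2000, 15), GenomeCompletenessCutoff = c(30, 10),
                   minl2fc = 2, minabscorrcoeff = 0.8)
)

# resolve_filters() is a hypothetical helper, not a JAMS function.
resolve_filters <- function(applyfilters = "light", ...) {
  user <- list(...)
  # Arguments left as NULL count as "not set by the user".
  user <- user[!vapply(user, is.null, logical(1))]
  # Start from the preset, or from nothing when applyfilters is NULL.
  settings <- if (is.null(applyfilters)) list() else taxonomic_presets[[applyfilters]]
  # Values set by the user replace the preset values of the same name.
  utils::modifyList(settings, user)
}

res <- resolve_filters("moderate", featcutoff = c(100, 10))
str(res)
```

Here only featcutoff is overridden; the remaining three settings keep their "moderate" values.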
featcutoff |
Requires a numeric vector of length 2 for specifying how to filter out features by relative abundance. The first value of the vector specifies the minimum relative abundance in Parts per Million (PPM) and the second value is the percentage of samples which must have at least that relative abundance. Thus, passing c(250, 10) to featcutoff would filter out any feature which does not have at least 250 PPM (= 0.025 percent) of relative abundance in at least 10 percent of all samples being plotted. Please note that when using the subsetby option (q.v.) to automatically plot multiple plots of sample subsets, the featcutoff parameters are applied within the subset. The default is c(0, 0), meaning no feature is filtered. If NULL is passed, then the value defaults to c(0, 0). See also applyfilters for a shorthand way of applying multiple filtration settings. |
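The featcutoff rule can be illustrated with a small base R sketch on a hypothetical PPM matrix. The helper `filter_by_featcutoff()` is an assumption made for illustration; it follows the description above but is not the JAMS implementation.

```r
# Hypothetical PPM matrix: rows are features, columns are samples.
ppm <- matrix(
  c(300, 400,   0, 260,   # featA: at least 250 PPM in 3 of 4 samples
     10,   0, 500,   0,   # featB: at least 250 PPM in 1 of 4 samples
      0,   5,   0,   0),  # featC: never reaches 250 PPM
  nrow = 3, byrow = TRUE,
  dimnames = list(c("featA", "featB", "featC"), paste0("S", 1:4))
)

# filter_by_featcutoff() is an illustrative helper, not part of JAMS.
filter_by_featcutoff <- function(ppm, featcutoff = c(0, 0)) {
  if (is.null(featcutoff)) featcutoff <- c(0, 0)  # NULL means no filtering
  # Percentage of samples in which each feature reaches the minimum PPM.
  pct_samples_passing <- 100 * rowMeans(ppm >= featcutoff[1])
  ppm[pct_samples_passing >= featcutoff[2], , drop = FALSE]
}

kept <- filter_by_featcutoff(ppm, featcutoff = c(250, 50))
print(rownames(kept))
```

With c(250, 50), only featA is retained, because it is the only feature reaching 250 PPM in at least half of the samples. With subsetby, the same rule would be applied to the columns of each subset separately.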
GenomeCompletenessCutoff |
Requires a numeric vector of length 2 for specifying how to filter out features by genome completeness. This is only applicable to taxonomic shotgun SummarizedExperiment objects. When passed to non-taxonomic shotgun SummarizedExperiment objects, GenomeCompletenessCutoff will be ignored. The first value of the vector specifies the minimum genome completeness in percentage and the second value is the percentage of samples which must have at least that genome completeness. Thus, passing c(50, 5) to GenomeCompletenessCutoff would filter out any taxonomic feature which does not have at least 50 percent of genome completeness in at least 5 percent of all samples being plotted. Please note that when using the subsetby option (q.v.) to automatically plot multiple plots of sample subsets, the GenomeCompletenessCutoff parameters are applied within the subset. The default is c(0, 0), meaning no feature is filtered. If NULL is passed, then the value defaults to c(0, 0). See also applyfilters for a shorthand way of applying multiple filtration settings. |
PPM_normalize_to_bases_sequenced |
Requires a logical value. Non-filtered JAMS feature counts tables (the BaseCounts assay within SummarizedExperiment objects) always include unclassified taxonomical features (for taxonomical SummarizedExperiment objects) or unknown/unattributed functional features (for non-taxonomical SummarizedExperiment objects), so the relative abundance for each feature (see normalization) will be calculated in Parts per Million (PPM) by dividing the number of bases covering each feature by the sum of each sample column **prior to any filtration**. Relative abundances are thus representative of the entirety of the genomic content for taxonomical objects, whereas for non-taxonomical objects, strictly speaking, it is the abundance of each feature relative to only the coding regions present in the metagenome, even if these are annotationally unattributed. In other words, intergenic regions are not taken into account. In order to relative-abundance-normalize a **non-taxonomical** SummarizedExperiment object with the total genomic sequencing content, including non-coding regions, set PPM_normalize_to_bases_sequenced = TRUE. Default is FALSE. |
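The two denominators described for PPM normalization can be contrasted in a short base R sketch. All numbers are made up, `to_ppm()` is a hypothetical helper, and the `bases_sequenced` vector is an assumption: the source does not say where JAMS stores the total number of bases sequenced per sample.

```r
# Hypothetical unfiltered base counts: rows are features, columns are samples.
base_counts <- matrix(
  c(6e5, 2e5,
    3e5, 2e5,
    1e5, 6e5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("featA", "featB", "Unclassified"), c("S1", "S2"))
)

# to_ppm() is an illustrative helper, not part of JAMS.
# By default each column is divided by its own sum before any filtering.
to_ppm <- function(counts, denominators = colSums(counts)) {
  sweep(counts, 2, denominators, "/") * 1e6
}

# Default behaviour: relative to the column sums of the unfiltered table.
ppm_default <- to_ppm(base_counts)

# Alternative: relative to the total bases sequenced per sample
# (assumed values, standing in for PPM_normalize_to_bases_sequenced = TRUE).
bases_sequenced <- c(S1 = 2e6, S2 = 1e6)
ppm_total <- to_ppm(base_counts, bases_sequenced)

print(ppm_default)
print(ppm_total)
```

With the default denominator every column sums to one million PPM. When the total bases sequenced exceed the column sum, as for S1 here, every feature in that sample gets a proportionally lower PPM value.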
addtit |
Optional string with text to append to the plot main title. Default is NULL. |
signiflabel |
String specifying the label used to determine if comparisons are significant or not. Default is "p.format". |
max_pairwise_cats |
Numerical value specifying the maximum number of categories to be plotted. Default is 4. This means that if you give a category that would require more than four different boxes, it will not be plotted. |
ignoreunclassified |
Requires a logical value. If set to TRUE, for taxonomical SummarizedExperiment objects, the feature "LKT__Unclassified" will be omitted from being shown. In the case of non-taxonomical SummarizedExperiment objects, the completely unannotated features will be omitted. For example, for an ECNumber SummarizedExperiment object, genes *without* an Enzyme Commission Number annotation (feature "EC_none") will not be shown. Statistics are, however, computed taking the completely unclassified feature into account, so p-values will not change. |
class_to_ignore |
String or vector specifying any classes which should lead to samples being excluded from the comparison within the variable passed to compareby. Default is "N_A". This means that any sample whose value for the metadata variable passed to compareby is "N_A" will be dropped from that comparison. |
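The effect of class_to_ignore can be sketched in base R on a hypothetical metadata table. The column names and the helper `drop_ignored()` are made up for illustration, and exact matching of the class label is an assumption rather than documented JAMS behaviour.

```r
# Hypothetical sample metadata; not JAMS data.
metadata <- data.frame(
  Sample    = c("S1", "S2", "S3", "S4"),
  Treatment = c("Control", "Drug", "N_A", "Drug"),
  stringsAsFactors = FALSE
)

# drop_ignored() is an illustrative helper, not part of JAMS.
drop_ignored <- function(metadata, compareby, class_to_ignore = "N_A") {
  # Keep samples whose value for the compareby variable is not an ignored class.
  keep <- !(metadata[[compareby]] %in% class_to_ignore)
  metadata[keep, , drop = FALSE]
}

kept <- drop_ignored(metadata, compareby = "Treatment")
print(kept$Sample)
```

Sample S3 is excluded from the Treatment comparison only; under the description above it would still take part in a comparison on another variable for which it has a valid class.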