microtable | R Documentation |
microtable
object to store and manage all the basic files.This class is a wrapper for a series of operations on the input files and basic manipulations,
including microtable object creation, data trimming, data filtering, rarefaction based on Paul et al. (2013) <doi:10.1371/journal.pone.0061217>, taxonomic abundance calculation,
alpha and beta diversity calculation based on the An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035> and
Lozupone et al. (2005) <doi:10.1128/AEM.71.12.8228-8235.2005> and other basic operations.
Online tutorial: https://chiliubio.github.io/microeco_tutorial/
Download tutorial: https://github.com/ChiLiubio/microeco_tutorial/releases
microtable.
new()
microtable$new( otu_table, sample_table = NULL, tax_table = NULL, phylo_tree = NULL, rep_fasta = NULL, auto_tidy = FALSE )
otu_table
data.frame; The feature abundance table; rownames are features (e.g. OTUs/ASVs/species/genes); column names are samples.
sample_table
data.frame; default NULL; The sample information table; rownames are samples; columns are sample metadata; If not provided, the function can generate a table automatically according to the sample names in otu_table.
tax_table
data.frame; default NULL; The taxonomic information table; rownames are features; column names are taxonomic classes.
phylo_tree
phylo; default NULL; The phylogenetic tree that must be read with the read.tree
function of ape package.
rep_fasta
DNAStringSet
, list
or DNAbin
format; default NULL; The sequences.
The sequences should be read with the readDNAStringSet
function in Biostrings
package (DNAStringSet class),
read.fasta
function in seqinr
package (list class),
or read.FASTA
function in ape
package (DNAbin class).
auto_tidy
default FALSE; Whether tidy the files in the microtable
object automatically.
If TRUE, the function can invoke the tidy_dataset
function.
an object of class microtable
with the following components:
sample_table
The sample information table.
otu_table
The feature table.
tax_table
The taxonomic table.
phylo_tree
The phylogenetic tree.
rep_fasta
The sequence.
taxa_abund
default NULL; use cal_abund
function to calculate.
alpha_diversity
default NULL; use cal_alphadiv
function to calculate.
beta_diversity
default NULL; use cal_betadiv
function to calculate.
data(otu_table_16S) data(taxonomy_table_16S) data(sample_info_16S) data(phylo_tree_16S) m1 <- microtable$new(otu_table = otu_table_16S) m1 <- microtable$new(sample_table = sample_info_16S, otu_table = otu_table_16S, tax_table = taxonomy_table_16S, phylo_tree = phylo_tree_16S) # trim the files in the dataset m1$tidy_dataset()
filter_pollution()
Filter the features considered pollution in microtable$tax_table
.
This operation will remove any line of the microtable$tax_table
containing any the word in taxa parameter regardless of word case.
microtable$filter_pollution(taxa = c("mitochondria", "chloroplast"))
taxa
default c("mitochondria", "chloroplast")
; filter mitochondria and chloroplast, or others as needed.
None
m1$filter_pollution(taxa = c("mitochondria", "chloroplast"))
filter_taxa()
Filter the feature with low abundance and/or low occurrence frequency.
microtable$filter_taxa(rel_abund = 0, freq = 1, include_lowest = TRUE)
rel_abund
default 0; the relative abundance threshold, such as 0.0001.
freq
default 1; the occurrence frequency threshold. For example, the number 2 represents filtering the feature that occurs less than 2 times. A number smaller than 1 is also allowable. For instance, the number 0.1 represents filtering the feature that occurs in less than 10% samples.
include_lowest
default TRUE; whether include the feature with the threshold.
None
\donttest{ d1 <- clone(m1) d1$filter_taxa(rel_abund = 0.0001, freq = 0.2) }
rarefy_samples()
Rarefy communities to make all samples have same count number.
microtable$rarefy_samples(method = c("rarefy", "SRS")[1], sample.size, ...)
method
default c("rarefy", "SRS")[1]; "rarefy" represents the classical resampling like rrarefy
function of vegan
package.
"SRS" is scaling with ranked subsampling method based on the SRS package provided by Lukas Beule and Petr Karlovsky (2020) <DOI:10.7717/peerj.9593>.
sample.size
default NULL; libray size. If not provided, use the minimum number across all samples.
For "SRS" method, this parameter is passed to Cmin
parameter of SRS
function of SRS package.
...
parameters pass to norm
function of trans_norm
class.
None; rarefied dataset.
\donttest{ m1$rarefy_samples(sample.size = min(m1$sample_sums())) }
tidy_dataset()
Trim all the data in the microtable
object to make taxa and samples consistent. The results are intersections across data.
microtable$tidy_dataset(main_data = FALSE)
main_data
default FALSE; if TRUE, only basic data in microtable
object is trimmed. Otherwise, all data,
including taxa_abund
, alpha_diversity
and beta_diversity
, are all trimed.
None. The data in the object are tidied up.
If tax_table
is in object, its row names are totally same with the row names of otu_table
.
m1$tidy_dataset(main_data = TRUE)
add_rownames2taxonomy()
Add the rownames of microtable$tax_table
as its last column.
This is especially useful when the rownames of microtable$tax_table
are required as a taxonomic level
for the taxonomic abundance calculation and biomarker idenfification.
microtable$add_rownames2taxonomy(use_name = "OTU")
use_name
default "OTU"; The column name used in the tax_table
.
NULL, a new tax_table stored in the object.
\donttest{ m1$add_rownames2taxonomy() }
sample_sums()
Sum the species number for each sample.
microtable$sample_sums()
species number of samples.
\donttest{ m1$sample_sums() }
taxa_sums()
Sum the species number for each taxa.
microtable$taxa_sums()
species number of taxa.
\donttest{ m1$taxa_sums() }
sample_names()
Show sample names.
microtable$sample_names()
sample names.
\donttest{ m1$sample_names() }
taxa_names()
Show taxa names of tax_table.
microtable$taxa_names()
taxa names.
\donttest{ m1$taxa_names() }
rename_taxa()
Rename the features, including the rownames of otu_table
, rownames of tax_table
, tip labels of phylo_tree
and rep_fasta
.
microtable$rename_taxa(newname_prefix = "ASV_")
newname_prefix
default "ASV_"; the prefix of new names; new names will be newname_prefix + numbers according to the rownames order of otu_table
.
None; renamed dataset.
\donttest{ m1$rename_taxa() }
merge_samples()
Merge samples according to specific group to generate a new microtable
.
microtable$merge_samples(group)
group
a column name in sample_table
of microtable
object.
a new merged microtable object.
\donttest{ m1$merge_samples("Group") }
merge_taxa()
Merge taxa according to specific taxonomic rank to generate a new microtable
.
microtable$merge_taxa(taxa = "Genus")
taxa
default "Genus"; the specific rank in tax_table
.
a new merged microtable
object.
\donttest{ m1$merge_taxa(taxa = "Genus") }
save_table()
Save each basic data in microtable object as local file.
microtable$save_table(dirpath = "basic_files", sep = ",", ...)
dirpath
default "basic_files"; directory to save the tables, phylogenetic tree and sequences in microtable object. It will be created if not found.
sep
default ","; the field separator string, used to save tables. Same with sep
parameter in write.table
function.
default ','
correspond to the file name suffix 'csv'. The option '\t'
correspond to the file name suffix 'tsv'. For other options, suffix are all 'txt'.
...
parameters passed to write.table
.
\dontrun{ m1$save_table() }
cal_abund()
Calculate the taxonomic abundance at each taxonomic level or selected levels.
microtable$cal_abund( select_cols = NULL, rel = TRUE, merge_by = "|", split_group = FALSE, split_by = "&", split_column = NULL, split_special_char = "&&" )
select_cols
default NULL; numeric vector (column sequences) or character vector (column names of microtable$tax_table
);
applied to select columns to calculate abundances according to ordered hierarchical levels.
This parameter is very useful when only part of the columns are needed to calculate abundances.
rel
default TRUE; if TRUE, relative abundance is used; if FALSE, absolute abundance (i.e. raw values) will be summed.
merge_by
default "|"; the symbol to merge and concatenate taxonomic names of different levels.
split_group
default FALSE; if TRUE, split the rows to multiple rows according to one or more columns in tax_table
when there is multiple mapping information.
split_by
default "&"; Separator delimiting collapsed values; only available when split_group = TRUE
.
split_column
default NULL; one column name used for the splitting in tax_table for each abundance calculation;
only available when split_group = TRUE
. If not provided, the function will split each column that containing the split_by
character.
split_special_char
default "&&"; special character that will be used forcibly to split multiple mapping information in tax_table
by default
no matter split_group
setting.
taxa_abund
list in object.
\donttest{ m1$cal_abund() }
save_abund()
Save taxonomic abundance as local file.
microtable$save_abund( dirpath = "taxa_abund", merge_all = FALSE, rm_un = FALSE, rm_pattern = "__$", sep = ",", ... )
dirpath
default "taxa_abund"; directory to save the taxonomic abundance files. It will be created if not found.
merge_all
default FALSE; Whether merge all tables into one. The merged file format is generally called 'mpa' style.
rm_un
default FALSE; Whether remove unclassified taxa in which the name ends with '__' generally.
rm_pattern
default "__$"; The pattern searched through the merged taxonomic names. See also pattern
parameter in grepl
function.
Only available when rm_un = TRUE
. The default "__$" means removing the names end with '__'.
sep
default ","; the field separator string. Same with sep
parameter in write.table
function.
default ','
correspond to the file name suffix 'csv'. The option '\t'
correspond to the file name suffix 'tsv'. For other options, suffix are all 'txt'.
...
parameters passed to write.table
.
\dontrun{ m1$save_abund(dirpath = "taxa_abund") m1$save_abund(merge_all = TRUE, rm_un = TRUE, sep = "\t") }
cal_alphadiv()
Calculate alpha diversity.
microtable$cal_alphadiv(measures = NULL, PD = FALSE)
measures
default NULL; one or more indexes in c("Observed", "Coverage", "Chao1", "ACE", "Shannon", "Simpson", "InvSimpson", "Fisher", "Pielou")
;
The default NULL represents that all the measures are calculated. 'Shannon', 'Simpson' and 'InvSimpson' are calculated based on vegan::diversity
function;
'Chao1' and 'ACE' depend on the function vegan::estimateR
.
'Fisher' index relies on the function vegan::fisher.alpha
.
"Observed" means the observed species number in a community, i.e. richness.
"Coverage" represents good's coverage. It is defined:
Coverage = 1 - \frac{f1}{n}
where n is the total abundance of a sample, and f1 is the number of singleton (species with abundance 1) in the sample. "Pielou" denotes the Pielou evenness index. It is defined:
J = \frac{H'}{\ln(S)}
where H' is Shannon index, and S is the species number.
PD
default FALSE; whether Faith's phylogenetic diversity is calculated. The calculation depends on the function picante::pd
.
Note that the phylogenetic tree (phylo_tree
object in the data) is required for PD.
alpha_diversity stored in the object. The se.chao1 and se.ACE are the standard erros of Chao1 and ACE, respectively.
\donttest{ m1$cal_alphadiv(measures = NULL, PD = FALSE) class(m1$alpha_diversity) }
save_alphadiv()
Save alpha diversity table to the computer.
microtable$save_alphadiv(dirpath = "alpha_diversity")
dirpath
default "alpha_diversity"; directory name to save the alpha_diversity.csv file.
cal_betadiv()
Calculate beta diversity dissimilarity matrix, such as Bray-Curtis, Jaccard, and UniFrac. See An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035> and Lozupone et al. (2005) <doi:10.1128/AEM.71.12.8228–8235.2005>.
microtable$cal_betadiv(method = NULL, unifrac = FALSE, binary = FALSE, ...)
method
default NULL; a character vector with one or more elements; c("bray", "jaccard")
is used when method = NULL
;
See the method
parameter in vegdist
function for more available options, such as 'aitchison' and 'robust.aitchison'.
unifrac
default FALSE; whether UniFrac indexes (weighted and unweighted) are calculated. Phylogenetic tree is necessary when unifrac = TRUE
.
binary
default FALSE; Whether convert abundance to binary data (presence/absence) when method
is not "jaccard".
TRUE is used for "jaccard" automatically.
...
parameters passed to vegdist
function of vegan package.
beta_diversity list stored in the object.
\donttest{ m1$cal_betadiv(unifrac = FALSE) class(m1$beta_diversity) }
save_betadiv()
Save beta diversity matrix to the computer.
microtable$save_betadiv(dirpath = "beta_diversity")
dirpath
default "beta_diversity"; directory name to save the beta diversity matrix files.
print()
Print the microtable object.
microtable$print()
clone()
The objects of this class are cloneable with this method.
microtable$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `microtable$new`
## ------------------------------------------------
data(otu_table_16S)
data(taxonomy_table_16S)
data(sample_info_16S)
data(phylo_tree_16S)
m1 <- microtable$new(otu_table = otu_table_16S)
m1 <- microtable$new(sample_table = sample_info_16S, otu_table = otu_table_16S,
tax_table = taxonomy_table_16S, phylo_tree = phylo_tree_16S)
# trim the files in the dataset
m1$tidy_dataset()
## ------------------------------------------------
## Method `microtable$filter_pollution`
## ------------------------------------------------
m1$filter_pollution(taxa = c("mitochondria", "chloroplast"))
## ------------------------------------------------
## Method `microtable$filter_taxa`
## ------------------------------------------------
d1 <- clone(m1)
d1$filter_taxa(rel_abund = 0.0001, freq = 0.2)
## ------------------------------------------------
## Method `microtable$rarefy_samples`
## ------------------------------------------------
m1$rarefy_samples(sample.size = min(m1$sample_sums()))
## ------------------------------------------------
## Method `microtable$tidy_dataset`
## ------------------------------------------------
m1$tidy_dataset(main_data = TRUE)
## ------------------------------------------------
## Method `microtable$add_rownames2taxonomy`
## ------------------------------------------------
m1$add_rownames2taxonomy()
## ------------------------------------------------
## Method `microtable$sample_sums`
## ------------------------------------------------
m1$sample_sums()
## ------------------------------------------------
## Method `microtable$taxa_sums`
## ------------------------------------------------
m1$taxa_sums()
## ------------------------------------------------
## Method `microtable$sample_names`
## ------------------------------------------------
m1$sample_names()
## ------------------------------------------------
## Method `microtable$taxa_names`
## ------------------------------------------------
m1$taxa_names()
## ------------------------------------------------
## Method `microtable$rename_taxa`
## ------------------------------------------------
m1$rename_taxa()
## ------------------------------------------------
## Method `microtable$merge_samples`
## ------------------------------------------------
m1$merge_samples("Group")
## ------------------------------------------------
## Method `microtable$merge_taxa`
## ------------------------------------------------
m1$merge_taxa(taxa = "Genus")
## ------------------------------------------------
## Method `microtable$save_table`
## ------------------------------------------------
## Not run:
m1$save_table()
## End(Not run)
## ------------------------------------------------
## Method `microtable$cal_abund`
## ------------------------------------------------
m1$cal_abund()
## ------------------------------------------------
## Method `microtable$save_abund`
## ------------------------------------------------
## Not run:
m1$save_abund(dirpath = "taxa_abund")
m1$save_abund(merge_all = TRUE, rm_un = TRUE, sep = "\t")
## End(Not run)
## ------------------------------------------------
## Method `microtable$cal_alphadiv`
## ------------------------------------------------
m1$cal_alphadiv(measures = NULL, PD = FALSE)
class(m1$alpha_diversity)
## ------------------------------------------------
## Method `microtable$cal_betadiv`
## ------------------------------------------------
m1$cal_betadiv(unifrac = FALSE)
class(m1$beta_diversity)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.