microtable: Create 'microtable' object to store and manage all the basic...

microtableR Documentation

Create microtable object to store and manage all the basic files.

Description

This class is a wrapper for a series of operations on the input files and basic manipulations, including microtable object creation, data trimming, data filtering, rarefaction based on Paul et al. (2013) <doi:10.1371/journal.pone.0061217>, taxonomic abundance calculation, alpha and beta diversity calculation based on the An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035> and Lozupone et al. (2005) <doi:10.1128/AEM.71.12.8228-8235.2005> and other basic operations.

Online tutorial: https://chiliubio.github.io/microeco_tutorial/
Download tutorial: https://github.com/ChiLiubio/microeco_tutorial/releases

Format

microtable.

Methods

Public methods


Method new()

Usage
microtable$new(
  otu_table,
  sample_table = NULL,
  tax_table = NULL,
  phylo_tree = NULL,
  rep_fasta = NULL,
  auto_tidy = FALSE
)
Arguments
otu_table

data.frame; The feature abundance table; rownames are features (e.g. OTUs/ASVs/species/genes); column names are samples.

sample_table

data.frame; default NULL; The sample information table; rownames are samples; columns are sample metadata; If not provided, the function can generate a table automatically according to the sample names in otu_table.

tax_table

data.frame; default NULL; The taxonomic information table; rownames are features; column names are taxonomic classes.

phylo_tree

phylo; default NULL; The phylogenetic tree that must be read with the read.tree function of ape package.

rep_fasta

DNAStringSet, list or DNAbin format; default NULL; The sequences. The sequences should be read with the readDNAStringSet function in Biostrings package (DNAStringSet class), read.fasta function in seqinr package (list class), or read.FASTA function in ape package (DNAbin class).

auto_tidy

default FALSE; Whether tidy the files in the microtable object automatically. If TRUE, the function can invoke the tidy_dataset function.

Returns

an object of class microtable with the following components:

sample_table

The sample information table.

otu_table

The feature table.

tax_table

The taxonomic table.

phylo_tree

The phylogenetic tree.

rep_fasta

The sequence.

taxa_abund

default NULL; use cal_abund function to calculate.

alpha_diversity

default NULL; use cal_alphadiv function to calculate.

beta_diversity

default NULL; use cal_betadiv function to calculate.

Examples
data(otu_table_16S)
data(taxonomy_table_16S)
data(sample_info_16S)
data(phylo_tree_16S)
m1 <- microtable$new(otu_table = otu_table_16S)
m1 <- microtable$new(sample_table = sample_info_16S, otu_table = otu_table_16S, 
  tax_table = taxonomy_table_16S, phylo_tree = phylo_tree_16S)
# trim the files in the dataset
m1$tidy_dataset()

Method filter_pollution()

Filter the features considered pollution in microtable$tax_table. This operation will remove any line of the microtable$tax_table containing any the word in taxa parameter regardless of word case.

Usage
microtable$filter_pollution(taxa = c("mitochondria", "chloroplast"))
Arguments
taxa

default c("mitochondria", "chloroplast"); filter mitochondria and chloroplast, or others as needed.

Returns

None

Examples
m1$filter_pollution(taxa = c("mitochondria", "chloroplast"))

Method filter_taxa()

Filter the feature with low abundance and/or low occurrence frequency.

Usage
microtable$filter_taxa(rel_abund = 0, freq = 1, include_lowest = TRUE)
Arguments
rel_abund

default 0; the relative abundance threshold, such as 0.0001.

freq

default 1; the occurrence frequency threshold. For example, the number 2 represents filtering the feature that occurs less than 2 times. A number smaller than 1 is also allowable. For instance, the number 0.1 represents filtering the feature that occurs in less than 10% samples.

include_lowest

default TRUE; whether include the feature with the threshold.

Returns

None

Examples
\donttest{
d1 <- clone(m1)
d1$filter_taxa(rel_abund = 0.0001, freq = 0.2)
}

Method rarefy_samples()

Rarefy communities to make all samples have same count number.

Usage
microtable$rarefy_samples(method = c("rarefy", "SRS")[1], sample.size, ...)
Arguments
method

default c("rarefy", "SRS")[1]; "rarefy" represents the classical resampling like rrarefy function of vegan package. "SRS" is scaling with ranked subsampling method based on the SRS package provided by Lukas Beule and Petr Karlovsky (2020) <DOI:10.7717/peerj.9593>.

sample.size

default NULL; libray size. If not provided, use the minimum number across all samples. For "SRS" method, this parameter is passed to Cmin parameter of SRS function of SRS package.

...

parameters pass to norm function of trans_norm class.

Returns

None; rarefied dataset.

Examples
\donttest{
m1$rarefy_samples(sample.size = min(m1$sample_sums()))
}

Method tidy_dataset()

Trim all the data in the microtable object to make taxa and samples consistent. The results are intersections across data.

Usage
microtable$tidy_dataset(main_data = FALSE)
Arguments
main_data

default FALSE; if TRUE, only basic data in microtable object is trimmed. Otherwise, all data, including taxa_abund, alpha_diversity and beta_diversity, are all trimed.

Returns

None. The data in the object are tidied up. If tax_table is in object, its row names are totally same with the row names of otu_table.

Examples
m1$tidy_dataset(main_data = TRUE)

Method add_rownames2taxonomy()

Add the rownames of microtable$tax_table as its last column. This is especially useful when the rownames of microtable$tax_table are required as a taxonomic level for the taxonomic abundance calculation and biomarker idenfification.

Usage
microtable$add_rownames2taxonomy(use_name = "OTU")
Arguments
use_name

default "OTU"; The column name used in the tax_table.

Returns

NULL, a new tax_table stored in the object.

Examples
\donttest{
m1$add_rownames2taxonomy()
}

Method sample_sums()

Sum the species number for each sample.

Usage
microtable$sample_sums()
Returns

species number of samples.

Examples
\donttest{
m1$sample_sums()
}

Method taxa_sums()

Sum the species number for each taxa.

Usage
microtable$taxa_sums()
Returns

species number of taxa.

Examples
\donttest{
m1$taxa_sums()
}

Method sample_names()

Show sample names.

Usage
microtable$sample_names()
Returns

sample names.

Examples
\donttest{
m1$sample_names()
}

Method taxa_names()

Show taxa names of tax_table.

Usage
microtable$taxa_names()
Returns

taxa names.

Examples
\donttest{
m1$taxa_names()
}

Method rename_taxa()

Rename the features, including the rownames of otu_table, rownames of tax_table, tip labels of phylo_tree and rep_fasta.

Usage
microtable$rename_taxa(newname_prefix = "ASV_")
Arguments
newname_prefix

default "ASV_"; the prefix of new names; new names will be newname_prefix + numbers according to the rownames order of otu_table.

Returns

None; renamed dataset.

Examples
\donttest{
m1$rename_taxa()
}

Method merge_samples()

Merge samples according to specific group to generate a new microtable.

Usage
microtable$merge_samples(group)
Arguments
group

a column name in sample_table of microtable object.

Returns

a new merged microtable object.

Examples
\donttest{
m1$merge_samples("Group")
}

Method merge_taxa()

Merge taxa according to specific taxonomic rank to generate a new microtable.

Usage
microtable$merge_taxa(taxa = "Genus")
Arguments
taxa

default "Genus"; the specific rank in tax_table.

Returns

a new merged microtable object.

Examples
\donttest{
m1$merge_taxa(taxa = "Genus")
}

Method save_table()

Save each basic data in microtable object as local file.

Usage
microtable$save_table(dirpath = "basic_files", sep = ",", ...)
Arguments
dirpath

default "basic_files"; directory to save the tables, phylogenetic tree and sequences in microtable object. It will be created if not found.

sep

default ","; the field separator string, used to save tables. Same with sep parameter in write.table function. default ',' correspond to the file name suffix 'csv'. The option '\t' correspond to the file name suffix 'tsv'. For other options, suffix are all 'txt'.

...

parameters passed to write.table.

Examples
\dontrun{
m1$save_table()
}

Method cal_abund()

Calculate the taxonomic abundance at each taxonomic level or selected levels.

Usage
microtable$cal_abund(
  select_cols = NULL,
  rel = TRUE,
  merge_by = "|",
  split_group = FALSE,
  split_by = "&",
  split_column = NULL,
  split_special_char = "&&"
)
Arguments
select_cols

default NULL; numeric vector (column sequences) or character vector (column names of microtable$tax_table); applied to select columns to calculate abundances according to ordered hierarchical levels. This parameter is very useful when only part of the columns are needed to calculate abundances.

rel

default TRUE; if TRUE, relative abundance is used; if FALSE, absolute abundance (i.e. raw values) will be summed.

merge_by

default "|"; the symbol to merge and concatenate taxonomic names of different levels.

split_group

default FALSE; if TRUE, split the rows to multiple rows according to one or more columns in tax_table when there is multiple mapping information.

split_by

default "&"; Separator delimiting collapsed values; only available when split_group = TRUE.

split_column

default NULL; one column name used for the splitting in tax_table for each abundance calculation; only available when split_group = TRUE. If not provided, the function will split each column that containing the split_by character.

split_special_char

default "&&"; special character that will be used forcibly to split multiple mapping information in tax_table by default no matter split_group setting.

Returns

taxa_abund list in object.

Examples
\donttest{
m1$cal_abund()
}

Method save_abund()

Save taxonomic abundance as local file.

Usage
microtable$save_abund(
  dirpath = "taxa_abund",
  merge_all = FALSE,
  rm_un = FALSE,
  rm_pattern = "__$",
  sep = ",",
  ...
)
Arguments
dirpath

default "taxa_abund"; directory to save the taxonomic abundance files. It will be created if not found.

merge_all

default FALSE; Whether merge all tables into one. The merged file format is generally called 'mpa' style.

rm_un

default FALSE; Whether remove unclassified taxa in which the name ends with '__' generally.

rm_pattern

default "__$"; The pattern searched through the merged taxonomic names. See also pattern parameter in grepl function. Only available when rm_un = TRUE. The default "__$" means removing the names end with '__'.

sep

default ","; the field separator string. Same with sep parameter in write.table function. default ',' correspond to the file name suffix 'csv'. The option '\t' correspond to the file name suffix 'tsv'. For other options, suffix are all 'txt'.

...

parameters passed to write.table.

Examples
\dontrun{
m1$save_abund(dirpath = "taxa_abund")
m1$save_abund(merge_all = TRUE, rm_un = TRUE, sep = "\t")
}

Method cal_alphadiv()

Calculate alpha diversity.

Usage
microtable$cal_alphadiv(measures = NULL, PD = FALSE)
Arguments
measures

default NULL; one or more indexes in c("Observed", "Coverage", "Chao1", "ACE", "Shannon", "Simpson", "InvSimpson", "Fisher", "Pielou"); The default NULL represents that all the measures are calculated. 'Shannon', 'Simpson' and 'InvSimpson' are calculated based on vegan::diversity function; 'Chao1' and 'ACE' depend on the function vegan::estimateR. 'Fisher' index relies on the function vegan::fisher.alpha. "Observed" means the observed species number in a community, i.e. richness. "Coverage" represents good's coverage. It is defined:

Coverage = 1 - \frac{f1}{n}

where n is the total abundance of a sample, and f1 is the number of singleton (species with abundance 1) in the sample. "Pielou" denotes the Pielou evenness index. It is defined:

J = \frac{H'}{\ln(S)}

where H' is Shannon index, and S is the species number.

PD

default FALSE; whether Faith's phylogenetic diversity is calculated. The calculation depends on the function picante::pd. Note that the phylogenetic tree (phylo_tree object in the data) is required for PD.

Returns

alpha_diversity stored in the object. The se.chao1 and se.ACE are the standard erros of Chao1 and ACE, respectively.

Examples
\donttest{
m1$cal_alphadiv(measures = NULL, PD = FALSE)
class(m1$alpha_diversity)
}

Method save_alphadiv()

Save alpha diversity table to the computer.

Usage
microtable$save_alphadiv(dirpath = "alpha_diversity")
Arguments
dirpath

default "alpha_diversity"; directory name to save the alpha_diversity.csv file.


Method cal_betadiv()

Calculate beta diversity dissimilarity matrix, such as Bray-Curtis, Jaccard, and UniFrac. See An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035> and Lozupone et al. (2005) <doi:10.1128/AEM.71.12.8228–8235.2005>.

Usage
microtable$cal_betadiv(method = NULL, unifrac = FALSE, binary = FALSE, ...)
Arguments
method

default NULL; a character vector with one or more elements; c("bray", "jaccard") is used when method = NULL; See the method parameter in vegdist function for more available options, such as 'aitchison' and 'robust.aitchison'.

unifrac

default FALSE; whether UniFrac indexes (weighted and unweighted) are calculated. Phylogenetic tree is necessary when unifrac = TRUE.

binary

default FALSE; Whether convert abundance to binary data (presence/absence) when method is not "jaccard". TRUE is used for "jaccard" automatically.

...

parameters passed to vegdist function of vegan package.

Returns

beta_diversity list stored in the object.

Examples
\donttest{
m1$cal_betadiv(unifrac = FALSE)
class(m1$beta_diversity)
}

Method save_betadiv()

Save beta diversity matrix to the computer.

Usage
microtable$save_betadiv(dirpath = "beta_diversity")
Arguments
dirpath

default "beta_diversity"; directory name to save the beta diversity matrix files.


Method print()

Print the microtable object.

Usage
microtable$print()

Method clone()

The objects of this class are cloneable with this method.

Usage
microtable$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples


## ------------------------------------------------
## Method `microtable$new`
## ------------------------------------------------

data(otu_table_16S)
data(taxonomy_table_16S)
data(sample_info_16S)
data(phylo_tree_16S)
m1 <- microtable$new(otu_table = otu_table_16S)
m1 <- microtable$new(sample_table = sample_info_16S, otu_table = otu_table_16S, 
  tax_table = taxonomy_table_16S, phylo_tree = phylo_tree_16S)
# trim the files in the dataset
m1$tidy_dataset()

## ------------------------------------------------
## Method `microtable$filter_pollution`
## ------------------------------------------------

m1$filter_pollution(taxa = c("mitochondria", "chloroplast"))

## ------------------------------------------------
## Method `microtable$filter_taxa`
## ------------------------------------------------


d1 <- clone(m1)
d1$filter_taxa(rel_abund = 0.0001, freq = 0.2)


## ------------------------------------------------
## Method `microtable$rarefy_samples`
## ------------------------------------------------


m1$rarefy_samples(sample.size = min(m1$sample_sums()))


## ------------------------------------------------
## Method `microtable$tidy_dataset`
## ------------------------------------------------

m1$tidy_dataset(main_data = TRUE)

## ------------------------------------------------
## Method `microtable$add_rownames2taxonomy`
## ------------------------------------------------


m1$add_rownames2taxonomy()


## ------------------------------------------------
## Method `microtable$sample_sums`
## ------------------------------------------------


m1$sample_sums()


## ------------------------------------------------
## Method `microtable$taxa_sums`
## ------------------------------------------------


m1$taxa_sums()


## ------------------------------------------------
## Method `microtable$sample_names`
## ------------------------------------------------


m1$sample_names()


## ------------------------------------------------
## Method `microtable$taxa_names`
## ------------------------------------------------


m1$taxa_names()


## ------------------------------------------------
## Method `microtable$rename_taxa`
## ------------------------------------------------


m1$rename_taxa()


## ------------------------------------------------
## Method `microtable$merge_samples`
## ------------------------------------------------


m1$merge_samples("Group")


## ------------------------------------------------
## Method `microtable$merge_taxa`
## ------------------------------------------------


m1$merge_taxa(taxa = "Genus")


## ------------------------------------------------
## Method `microtable$save_table`
## ------------------------------------------------

## Not run: 
m1$save_table()

## End(Not run)

## ------------------------------------------------
## Method `microtable$cal_abund`
## ------------------------------------------------


m1$cal_abund()


## ------------------------------------------------
## Method `microtable$save_abund`
## ------------------------------------------------

## Not run: 
m1$save_abund(dirpath = "taxa_abund")
m1$save_abund(merge_all = TRUE, rm_un = TRUE, sep = "\t")

## End(Not run)

## ------------------------------------------------
## Method `microtable$cal_alphadiv`
## ------------------------------------------------


m1$cal_alphadiv(measures = NULL, PD = FALSE)
class(m1$alpha_diversity)


## ------------------------------------------------
## Method `microtable$cal_betadiv`
## ------------------------------------------------


m1$cal_betadiv(unifrac = FALSE)
class(m1$beta_diversity)


microeco documentation built on Oct. 30, 2024, 9:12 a.m.