Utilities to preprocess the taxa table and make it easier to visualise.
subsetTaxaTable(taxa.table, taxa.group = "assigned", rank = "kingdom",
include = TRUE, ignore.case = TRUE)
subsetCM(community.matrix, taxa.table, taxa.group = NA, rank = NA,
include = TRUE, ignore.case = TRUE, verbose = TRUE, drop.taxa = TRUE,
merged.by = "row.names", ...)
prepTaxonomy(taxa.table, col.ranks = c("kingdom", "phylum", "class", "order",
"family"), txt.unclassified = "unclassified", verbose = TRUE,
pattern = "(\\s\\[|\\()(\\=|\\.|\\,|\\s|\\w|\\?)*(\\]|\\))")
mergeCMTaxa(community.matrix, taxa.table, classifier = c("MEGAN", "RDP"),
min.conf = 0.8, has.total = 1, sort = TRUE, preprocess = TRUE,
verbose = TRUE, mv.row.names = T,
pattern = "(\\s\\[|\\()(\\=|\\.|\\,|\\s|\\w|\\?)*(\\]|\\))",
col.ranks = c("kingdom", "phylum", "class", "order", "family"))
assignTaxaByRank(cm.taxa, unclassified = 0, aggre.FUN = sum,
pattern = "(\\s\\[|\\()(\\=|\\.|\\,|\\s|\\w|\\?)*(\\]|\\))")
summaryTaxaAssign(ta.list, ta.OTU.list = list(), exclude.rank = c(-1),
exclude.unclassified = TRUE, sort.rank = getRanks())
combineTaxaAssign(ta.list, keywords = c("Eukaryota"), ignore.case = TRUE,
replace.to = c(), min.row.comb = 2)
summaryRank(ta.list, rank = "kingdom", exclude.unclassified = TRUE)
groupsTaxaMembers(taxa.assign, cm.taxa, rank = "phylum",
rm.unclassified = TRUE, regex1 = "(\\|[0-9]+)", regex2 = "",
ignore.case = TRUE, verbose = TRUE)
taxa.table |
A data frame to contain taxonomic classifications of OTUs.
Columns are taxonomy at the rank or lineage, rows are OTUs which need to
match rows from community matrix. Use |
taxa.group |
The taxonomic group. The value can be 'all', 'assigned', or the name(s) of a taxonomic group. Group 'all' includes everything. Group 'assigned' removes all uncertain classifications, including 'root', 'cellular organisms', 'No hits' and 'Not assigned'. Alternatively, any high-ranking taxonomy in your taxonomy file can be used as a group, or as multiple groups separated by "|", such as 'BACTERIA', 'Proteobacteria', etc. All groups have to be in the same rank column in the file. The default removes all uncertain classifications, even when group(s) are assigned. |
rank |
The rank to specify which column name in |
include |
Define whether include or exclude given |
ignore.case |
If TRUE, as default, case insensitive for taxon names. |
community.matrix |
Community matrix (OTU table), where rows are
OTUs or individual species and columns are sites or samples. See |
verbose |
More details. Default to TRUE. |
drop.taxa |
TRUE, as default, to drop all taxonomy columns,
and only keep |
col.ranks |
A vector or string of column name(s) of taxonomic ranks in the taxa table,
which determine the aggregated abundance matrix. They have to be the full set or a subset of
|
txt.unclassified |
The key word to represent unclassified taxonomy. |
pattern |
The pattern for |
classifier |
The classifier is used to generate |
min.conf |
The confidence threshold to drop rows < min.conf. |
has.total |
If 0, then only return abundance by samples (columns) of community matrix. If 1, then only return total abundance. If 2, then return abundance by samples (columns) and total. Default to 1. |
sort |
Sort the taxonomy rank by rank. Default to TRUE. |
preprocess |
If TRUE, as default, replace
"root|cellular organisms|No hits|Not assigned|unclassified sequences" from MEGAN result,
or mark OTUs as 'unclassified' in RDP result whose confidence < |
mv.row.names |
Default to TRUE to move the column 'Row.names'
created by |
cm.taxa |
The data frame combined community matrix with
taxonomic classifications generated by Note: From 1 to |
unclassified |
An integer that controls how "unclassified" taxa are handled. Default to 0, which keeps all "unclassified" rows but moves them to the last rows. If 1, remove the row whose taxon name is exactly "unclassified" (see Details). If 2, remove the row whose taxon name is exactly "unclassified", and also merge all remaining "unclassified ???" into "unclassified rank", such as "unclassified family". If 3, remove every row containing "unclassified". If 4, do nothing. |
aggre.FUN |
A function for |
ta.list, ta.OTU.list |
The list of taxonomic assignments
created by |
exclude.rank |
The first n elements (ranks) to exclude from the summary, default to -1, which is normally the kingdom. |
exclude.unclassified |
Default to TRUE, not to count the taxonomy having the "unclassified" keyword. |
sort.rank |
The order used to sort the summary dataframe by "rank" column,
default to |
keywords |
The vector of keywords for |
replace.to |
The new names are used for combined rows,
which should be either empty or the same length of the vector |
min.row.comb |
The minimum number of rows from |
taxa.assign |
The data frame of taxonomic assignments with abundance
at the |
rm.unclassified |
Drop all unclassified rows (OTUs). Default to TRUE. |
regex1, regex2 |
Use for |
ignore.case |
Default to TRUE, same to |
rank |
The rank given to select the list of taxa assignments
produced by |
subsetTaxaTable keeps or excludes a subset of the given taxa table at the given rank.
subsetCM returns a subset of the community matrix for taxa.group at the given rank column in taxa.table. It is also an alternative to mergeCMTaxa when only a simple merge is required. If either taxa.group or rank is NA (the default), the whole taxa.table is used; otherwise the subset of taxa.table is taken by subsetTaxaTable.
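The include/exclude behaviour described above can be sketched in base R. This is only an illustration under stated assumptions: subset_by_group is a hypothetical helper, not the package's implementation, and the small taxa table is made up.

```r
# Hypothetical taxa table: rows are OTUs, columns are ranks.
taxa.table <- data.frame(
  kingdom = c("Bacteria", "Bacteria", "Fungi", "Not assigned"),
  phylum  = c("Proteobacteria", "Firmicutes", "Ascomycota", "Not assigned"),
  row.names = c("OTU1", "OTU2", "OTU3", "OTU4"),
  stringsAsFactors = FALSE
)

# Illustrative helper (an assumption, not the package's code): keep or drop
# rows whose value in the `rank` column matches the `taxa.group` pattern.
subset_by_group <- function(tt, taxa.group, rank, include = TRUE,
                            ignore.case = TRUE) {
  hit <- grepl(taxa.group, tt[[rank]], ignore.case = ignore.case)
  if (!include) hit <- !hit
  tt[hit, , drop = FALSE]
}

# Case-insensitive match on the kingdom column.
tt.bac <- subset_by_group(taxa.table, "BACTERIA", "kingdom")

# Several groups separated by "|", excluded rather than kept.
tt.rest <- subset_by_group(taxa.table, "Proteobacteria|Ascomycota",
                           "phylum", include = FALSE)
```

Because "|" is the regular-expression alternation operator, a multi-group string works as a single pattern here, which is one plausible reason the groups must share a rank column.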
prepTaxonomy replaces repeated high-rank taxa with unclassified high rank in MEGAN results, or replaces blank values with unclassified in RDP results, so that the names in the taxonomy table taxa.table (which can be cm.taxa) look tidy. The col.ranks vector has to contain rank column names in taxa.table.
mergeCMTaxa creates a data frame cm.taxa that combines the community matrix with the taxonomic classification table. The 1st column is "row.names", which holds the OTUs/individuals, the next "ncol.cm" columns are abundance, which can be sample-based or total, and the last "length(col.ranks)" columns are the ranks.
All sequences either classified as "root|cellular organisms|No hits|Not assigned|unclassified sequences" by BLAST + MEGAN, or with confidence < min.conf threshold from RDP, are changed to "unclassified" and moved to the last rows.
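As a rough illustration of the merge step, base R's merge() can join two tables on their row names. The data below are hypothetical, and this sketch does not reproduce the preprocessing, sorting, or totals that mergeCMTaxa performs.

```r
# Hypothetical community matrix: rows are OTUs, columns are samples.
community.matrix <- data.frame(
  site1 = c(10, 0, 3),
  site2 = c(2, 5, 0),
  row.names = c("OTU1", "OTU2", "OTU3")
)

# Hypothetical taxa table with matching OTU row names.
taxa.table <- data.frame(
  kingdom = c("Bacteria", "Bacteria", "Fungi"),
  phylum  = c("Proteobacteria", "Firmicutes", "Ascomycota"),
  row.names = c("OTU1", "OTU2", "OTU3"),
  stringsAsFactors = FALSE
)

# Joining on row names; merge() puts the shared names in a leading
# column called "Row.names", followed by abundance and then rank columns.
merged <- merge(community.matrix, taxa.table, by = "row.names")
colnames(merged)
```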
assignTaxaByRank provides a list of taxonomic assignments with abundance from the community matrix at different rank levels, where row names are the taxonomy at that rank and columns are the sample names (and may include the total). The function iterates through col.ranks and aggregates abundance into taxonomy based on each rank in col.ranks.
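The per-rank aggregation can be sketched with base R's aggregate(). This is a minimal sketch under assumptions: the cm.taxa data frame is made up, it has a single "total" abundance column, and handling of "unclassified" rows is omitted.

```r
# Hypothetical merged table: one abundance column plus two rank columns.
cm.taxa <- data.frame(
  total   = c(10, 5, 3, 2),
  kingdom = c("Bacteria", "Bacteria", "Fungi", "Fungi"),
  phylum  = c("Proteobacteria", "Firmicutes", "Ascomycota", "Ascomycota"),
  row.names = paste0("OTU", 1:4),
  stringsAsFactors = FALSE
)
col.ranks <- c("kingdom", "phylum")

# For each rank, sum the abundance of all OTUs sharing a taxon name and
# store the result with taxa as row names.
ta.list <- lapply(col.ranks, function(rank) {
  agg <- aggregate(cm.taxa["total"],
                   by = list(taxon = cm.taxa[[rank]]),
                   FUN = sum)
  data.frame(total = agg$total, row.names = as.character(agg$taxon))
})
names(ta.list) <- col.ranks

ta.list[["kingdom"]]
```

Passing a different function in place of sum would correspond to changing aggre.FUN.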
summaryTaxaAssign summarises the number of reads, OTUs, and taxonomy from the result of assignTaxaByRank.
combineTaxaAssign combines the totals of the taxonomy whose row names match each of the given keywords, using the taxonomy assignment in the list from assignTaxaByRank.
At the moment the function only works for taxonomy assignments that have a single column "total".
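One way to picture the combining step for a single keyword is shown below. It is a sketch under assumptions, with made-up data and a made-up replacement name "Plant"; it is not the package's code and ignores min.row.comb.

```r
# Hypothetical assignment at one rank with a single "total" column.
ta <- data.frame(
  total = c(7, 3, 20, 1),
  row.names = c("Streptophyta", "Viridiplantae", "Bacteria", "Fungi")
)

# Rows whose names match the keyword pattern are collapsed into one row.
keyword <- "Streptophyta|Viridiplantae"
hit <- grepl(keyword, rownames(ta), ignore.case = TRUE)

combined <- rbind(
  ta[!hit, , drop = FALSE],
  data.frame(total = sum(ta$total[hit]), row.names = "Plant")
)
combined
```

The overall total is unchanged by the combination; only the number of rows shrinks.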
summaryRank directly converts the result of assignTaxaByRank at a given rank into a data frame as the summary.
groupsTaxaMembers groups the members (rows, i.e. OTUs) from cm.taxa for each taxon in taxa.assign at the given rank, and returns a list of members (OTUs) grouped by taxonomy. By default all unclassified members (OTUs) are dropped. It is impossible to trace members back after assignTaxaByRank, so this function has only one option other than the default, which assigns the remaining members (OTUs) not picked up by other taxa to "unclassified". The result relies on using the identical cm.taxa in both assignTaxaByRank and groupsTaxaMembers.
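The grouping idea can be sketched with base R's split(). The cm.taxa below is hypothetical, and the sketch only covers the default of dropping unclassified members.

```r
# Hypothetical merged table with OTUs as row names.
cm.taxa <- data.frame(
  total  = c(10, 5, 3, 2),
  phylum = c("Proteobacteria", "Firmicutes", "Proteobacteria", "unclassified"),
  row.names = paste0("OTU", 1:4),
  stringsAsFactors = FALSE
)

# Drop unclassified OTUs, then group the remaining OTU names by phylum.
keep <- cm.taxa$phylum != "unclassified"
taxa.members <- split(rownames(cm.taxa)[keep], cm.taxa$phylum[keep])
taxa.members
```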
ncol.cm and col.ranks are attributes of cm.taxa generated by mergeCMTaxa. ncol.cm indicates how many columns in cm.taxa are abundance. col.ranks records which rank columns are in cm.taxa, and is also an input of mergeCMTaxa.
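Attributes of this kind can be set and read with base R's attr(). The sketch below builds a stand-in cm.taxa by hand and assumes the OTU names are stored as row names, so that the first ncol.cm columns are abundance; the real object returned by mergeCMTaxa may be laid out differently.

```r
# Hand-made stand-in for cm.taxa (an assumption, not mergeCMTaxa output).
cm.taxa <- data.frame(
  site1   = c(10, 0, 3),
  site2   = c(2, 5, 0),
  kingdom = c("Bacteria", "Bacteria", "Fungi"),
  phylum  = c("Proteobacteria", "Firmicutes", "Ascomycota"),
  row.names = c("OTU1", "OTU2", "OTU3"),
  stringsAsFactors = FALSE
)
attr(cm.taxa, "ncol.cm") <- 2
attr(cm.taxa, "col.ranks") <- c("kingdom", "phylum")

# Use the attributes to separate abundance columns from rank columns.
ncol.cm   <- attr(cm.taxa, "ncol.cm")
abundance <- cm.taxa[, seq_len(ncol.cm), drop = FALSE]
ranks     <- cm.taxa[, attr(cm.taxa, "col.ranks"), drop = FALSE]
```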
tt.sub <- subsetTaxaTable(tt.megan, taxa.group="Proteobacteria", rank="phylum")
tt.sub <- subsetTaxaTable(tt.megan, taxa.group="Cnidaria|Brachiopoda|Echinodermata|Porifera", rank="phylum", include=FALSE)
sub.cm <- subsetCM(cm, tt, taxa.group="BACTERIA", rank="kingdom")
tt <- prepTaxonomy(taxa.table, col.ranks=c("kingdom", "phylum", "class"))
cm.taxa <- mergeCMTaxa(community.matrix, tt.megan)
ta.megan <- assignTaxaByRank(cm.taxa)
cm.taxa <- mergeCMTaxa(community.matrix, tt.rdp, classifier="RDP", has.total=0)
ta.rdp <- assignTaxaByRank(cm.taxa, unclassified=2)
colSums(ta.rdp[["phylum"]])
summary.ta.df <- summaryTaxaAssign(ta.list, ta.OTU.list)
combined.ta.list <- combineTaxaAssign(ta.list, c("Fungi", "Eukaryota", "Streptophyta|Viridiplantae", "Bacteria"))
combined.ta.list <- combineTaxaAssign(ta.list, c("Streptophyta|Viridiplantae"), replace.to=c("Plant"))
summary.kingdom.df <- summaryRank(ta.list, rank="kingdom")
taxa.members <- groupsTaxaMembers(ta.rdp[["phylum"]], tt.rdp)
taxa.members <- groupsTaxaMembers(ta.rdp[["family"]], tt.rdp, rank="family")