In MiscMetabar: Miscellaneous Functions for Metabarcoding Analysis

postcluster_pq

R Documentation

Recluster sequences of an object of class `physeq` or a list of DNA sequences

Description

This function uses the `merge_taxa_vec()` function to merge taxa into clusters.

Usage

postcluster_pq(
  physeq = NULL,
  dna_seq = NULL,
  nproc = 1,
  method = "clusterize",
  id = 0.97,
  vsearchpath = find_vsearch(),
  tax_adjust = 0,
  rank_propagation = FALSE,
  vsearch_cluster_method = "--cluster_size",
  vsearch_args = "--strand both",
  keep_temporary_files = FALSE,
  swarmpath = "swarm",
  d = 1,
  swarm_args = "--fastidious",
  mmseqs2path = find_mmseqs2(),
  mmseqs2_cluster_method = "easy-cluster",
  mmseqs2_args = "",
  method_clusterize = "overlap",
  ...
)

asv2otu(
  physeq = NULL,
  dna_seq = NULL,
  nproc = 1,
  method = "clusterize",
  id = 0.97,
  vsearchpath = find_vsearch(),
  tax_adjust = 0,
  rank_propagation = FALSE,
  vsearch_cluster_method = "--cluster_size",
  vsearch_args = "--strand both",
  keep_temporary_files = FALSE,
  swarmpath = "swarm",
  d = 1,
  swarm_args = "--fastidious",
  mmseqs2path = find_mmseqs2(),
  mmseqs2_cluster_method = "easy-cluster",
  mmseqs2_args = "",
  method_clusterize = "overlap",
  ...
)

Arguments

`physeq`	(required) a `phyloseq-class` object obtained using the `phyloseq` package.
`dna_seq`	A character vector of DNA sequences, which may be used directly in place of the `physeq` argument. When `physeq` is set, the DNA sequences are taken from `physeq@refseq`.
`nproc`	(default: 1) Number of CPUs/processors to use for the clustering.
`method`	(default: "clusterize") Set the clustering method. `clusterize` uses the `DECIPHER::Clusterize()` function; `vsearch` uses the vsearch software (https://github.com/torognes/vsearch) with the argument `--cluster_size` by default (see `vsearch_cluster_method`) and `--strand both` (see `vsearch_args`); `swarm` uses the swarm software (https://github.com/torognes/swarm); `mmseqs2` uses the MMseqs2 software (https://github.com/soedinglab/MMseqs2) with `easy-cluster` by default (see `mmseqs2_cluster_method`).
`id`	(default: 0.97) level of identity to cluster
`vsearchpath`	(default: `find_vsearch()`) path to vsearch
`tax_adjust`	(default: 0) See the man page of `merge_taxa_vec()` for more details. To conserve the taxonomic ranks of the most abundant taxon (ASV, OTU, ...), set `tax_adjust` to 0 (default). For the moment, only `tax_adjust = 0` is robust.
`rank_propagation`	(logical, default: FALSE) Should NA values be propagated from lower taxonomic ranks to upper ranks? See the man page of `merge_taxa_vec()` for more details.
`vsearch_cluster_method`	(default: "--cluster_size") See other possible methods in the vsearch manual (e.g. `--cluster_size` or `--cluster_fast`). `--cluster_fast`: cluster the fasta sequences in filename, automatically sorting by decreasing sequence length beforehand. `--cluster_size`: cluster the fasta sequences in filename, automatically sorting by decreasing sequence abundance beforehand.
`vsearch_args`	(default: "--strand both") a character vector of length one defining other parameters to be passed on to vsearch.
`keep_temporary_files`	(logical, default: FALSE) Do we keep temporary files? These are temp.fasta (refseq in fasta or dna_seq sequences), cluster.fasta (centroids if method = "vsearch") and temp.uc (clusters if method = "vsearch").
`swarmpath`	(default: swarm) path to swarm
`d`	(default: 1) maximum number of differences allowed between two amplicons, meaning that two amplicons will be grouped if they have `d` (or fewer) differences
`swarm_args`	(default: "--fastidious") a character vector of length one defining other parameters to be passed on to swarm. See other possible options in the SWARM pdf manual.
`mmseqs2path`	(default: `find_mmseqs2()`) path to MMseqs2
`mmseqs2_cluster_method`	(default: `"easy-cluster"`) Either `"easy-cluster"` or `"easy-linclust"`. See `mmseqs2_clustering()`.
`mmseqs2_args`	(default: `""`) Additional arguments passed to the MMseqs2 clustering command.
`method_clusterize`	(default: "overlap") the value of the `method` argument passed to `DECIPHER::Clusterize()`
`...`	Additional arguments passed on to `DECIPHER::Clusterize()`
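
To make the `d` and `id` arguments more concrete, here is a minimal base R sketch. The helper `count_differences()` is hypothetical (it is not part of MiscMetabar) and only counts mismatches between equal-length sequences, ignoring insertions and deletions. The clustering tools themselves work on alignments and use their own identity definitions, so this is a simplification rather than a description of what they compute.

```r
# Hypothetical helper (not part of MiscMetabar): count mismatching
# positions between two sequences of equal length. Indels are ignored.
count_differences <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

seq_a <- "ACGTACGTAC"
seq_b <- "ACGTACGTAT" # differs from seq_a at position 10
seq_c <- "ACGAACGTAT" # differs from seq_a at positions 4 and 10

n_ab <- count_differences(seq_a, seq_b)
n_ac <- count_differences(seq_a, seq_c)

# With d = 1, two amplicons are close enough to be grouped
# only if they have at most one difference.
d <- 1
within_d_ab <- n_ab <= d
within_d_ac <- n_ac <= d

# A naive identity (share of matching positions) to compare with `id`.
identity_ab <- 1 - n_ab / nchar(seq_a)
reaches_id_ab <- identity_ab >= 0.97
```

On sequences this short a single mismatch already drops the naive identity well below 0.97, which is why identity thresholds are usually reasoned about on full-length amplicons.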

Details

This function uses the `merge_taxa_vec()` function to merge taxa into clusters. By default, `tax_adjust = 0`. See the man page of `merge_taxa_vec()`.
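
Because taxa are merged into clusters, the returned object should contain no more taxa than the input. The following sketch checks this with the default `clusterize` method; it assumes that MiscMetabar, DECIPHER and phyloseq are installed and that the `data_fungi_mini` dataset used in the Examples is available. The variable names are illustrative only.

```r
if (requireNamespace("MiscMetabar", quietly = TRUE) &&
  requireNamespace("DECIPHER", quietly = TRUE)) {
  library(MiscMetabar)

  # Recluster at 97% identity (the default) with DECIPHER::Clusterize()
  clustered <- postcluster_pq(data_fungi_mini, id = 0.97)

  # Compare the number of taxa before and after merging
  n_before <- phyloseq::ntaxa(data_fungi_mini)
  n_after <- phyloseq::ntaxa(clustered)
  message(n_before, " taxa before clustering, ", n_after, " after")
}
```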

Value

A new object of class `physeq`, or a list of clusters if the `dna_seq` argument was used.
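
As an illustration of the second return form, the sketch below passes a character vector through `dna_seq` instead of a `physeq` object. It assumes MiscMetabar, DECIPHER and phyloseq are installed, and it extracts the sequences from `data_fungi_mini` only for convenience. The exact structure of the returned clusters may depend on the chosen method, so the code inspects the result rather than assuming a layout.

```r
if (requireNamespace("MiscMetabar", quietly = TRUE) &&
  requireNamespace("DECIPHER", quietly = TRUE)) {
  library(MiscMetabar)

  # Named character vector of sequences taken from the refseq slot
  seqs <- as.character(phyloseq::refseq(data_fungi_mini))

  # Cluster the raw sequences; physeq is left at its default (NULL)
  clusters <- postcluster_pq(dna_seq = seqs)

  # Inspect the result without assuming its exact structure
  str(clusters, max.level = 1)
}
```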

Author(s)

Adrien Taudière

References

VSEARCH can be downloaded from https://github.com/torognes/vsearch. More information in the associated publication https://pubmed.ncbi.nlm.nih.gov/27781170.

Examples


if (requireNamespace("DECIPHER")) {
  postcluster_pq(data_fungi_mini)
}

## Not run: 
if (requireNamespace("DECIPHER")) {
  postcluster_pq(data_fungi_mini, method_clusterize = "longest")

  if (MiscMetabar::is_swarm_installed()) {
    d_swarm <- postcluster_pq(data_fungi_mini, method = "swarm")
  }
  if (MiscMetabar::is_vsearch_installed()) {
    d_vs <- postcluster_pq(data_fungi_mini, method = "vsearch")
  }
  if (MiscMetabar::is_mmseqs2_installed()) {
    d_mm <- postcluster_pq(data_fungi_mini, method = "mmseqs2")
  }
}

## End(Not run)
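
A further sketch, using only arguments documented above, shows how the vsearch options might be changed from their defaults. It assumes that vsearch is installed and discoverable by `find_vsearch()`; the object name `d_vs_fast` and the 0.98 threshold are arbitrary choices for illustration.

```r
if (requireNamespace("MiscMetabar", quietly = TRUE) &&
  MiscMetabar::is_vsearch_installed()) {
  library(MiscMetabar)

  # Sort by decreasing sequence length (--cluster_fast) instead of
  # abundance, and use a stricter identity threshold than the default.
  d_vs_fast <- postcluster_pq(
    data_fungi_mini,
    method = "vsearch",
    id = 0.98,
    vsearch_cluster_method = "--cluster_fast",
    vsearch_args = "--strand both"
  )
}
```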

MiscMetabar documentation built on June 8, 2026, 5:07 p.m.