prune_groups: Prune group sizes
In alexpiper/taxreturn: An R package for retrieving and curating public DNA barcode reference data

Prune group sizes

Reduces each group of sequences that share the same taxonomic annotation to at most max_group_size sequences.

prune_groups(
  x,
  max_group_size = 5,
  dedup = TRUE,
  discardby = "length",
  prefer = NULL,
  quiet = FALSE
)

`x`	A DNAbin or DNAStringSet object
`max_group_size`	The maximum number of sequences with the same taxonomic annotation to keep
`dedup`	Whether sequences with an identical taxonomic name and nucleotide sequence should be discarded first
`discardby`	How sequences should be discarded from groups larger than max_group_size. Options are "length" (the default), which discards sequences from shortest to longest until the group contains no more than max_group_size sequences, and "random", which discards randomly chosen sequences until the group contains no more than max_group_size sequences.
`prefer`	A vector of sequence names to prefer when subsampling groups with discardby = "random", or when breaking ties between sequences of equal length with discardby = "length". For example, high-quality in-house sequences.
`quiet`	Whether to suppress progress messages printed to the console.
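
The arguments above can be illustrated with a minimal base R sketch of the documented discardby = "length" behaviour. This is not the taxreturn implementation: prune_sketch, the refs data frame and its id, taxon and seq columns are hypothetical names chosen for the example, and the real function works on DNAbin or DNAStringSet objects rather than a data frame.

```r
# Hypothetical helper, NOT the taxreturn implementation: it only mirrors the
# documented behaviour of discardby = "length" on a plain data frame.
prune_sketch <- function(df, max_group_size = 5, dedup = TRUE, prefer = NULL) {
  # Optionally drop records with an identical taxon and identical sequence
  if (dedup) {
    df <- df[!duplicated(df[, c("taxon", "seq")]), , drop = FALSE]
  }
  # Rank records within each taxon: longest first, with names listed in
  # `prefer` winning ties between sequences of equal length
  preferred <- df$id %in% prefer
  df <- df[order(df$taxon, -nchar(df$seq), !preferred), , drop = FALSE]
  # Keep the first max_group_size records of each taxon
  rank_in_group <- ave(seq_len(nrow(df)), df$taxon, FUN = seq_along)
  df[rank_in_group <= max_group_size, , drop = FALSE]
}

# Made-up example records (assumed data, for illustration only)
refs <- data.frame(
  id    = c("a1", "a2", "a3", "a4", "a5", "b1"),
  taxon = c(rep("Apis mellifera", 5), "Bombus terrestris"),
  seq   = c("ACGTACGT", "ACGT", "ACGTAC", "ACGTTT", "ACGTACGT", "ACGTA"),
  stringsAsFactors = FALSE
)

kept <- prune_sketch(refs, max_group_size = 2, prefer = "a4")
print(kept$id)
```

In this sketch a5 is removed as an exact duplicate of a1, a2 is discarded as the shortest sequence in its group, and a4 is kept over a3 because both have the same length and a4 is listed in prefer.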

alexpiper/taxreturn documentation built on Sept. 14, 2024, 7:56 p.m.
