prune_groups: Prune group sizes

View source: R/filter_sequences.R

prune_groups    R Documentation

Prune group sizes

Description

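Reduce the number of sequences sharing the same taxonomic annotation so that each group contains at most max_group_size sequences.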

Usage

prune_groups(
  x,
  max_group_size = 5,
  dedup = TRUE,
  discardby = "length",
  prefer = NULL,
  quiet = FALSE
)

Arguments

x

A DNAbin or DNAStringSet object

max_group_size

The maximum number of sequences with the same taxonomic annotation to keep

dedup

Whether sequences with an identical taxonomic name and an identical nucleotide sequence should be discarded first
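
The deduplication step described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the data frame seqs and its columns (name, taxon, sequence) are hypothetical stand-ins for a DNAbin or DNAStringSet object.

```r
# Hypothetical toy data standing in for annotated sequences.
seqs <- data.frame(
  name     = c("s1", "s2", "s3", "s4"),
  taxon    = c("A", "A", "A", "B"),
  sequence = c("ACGT", "ACGT", "ACGA", "ACGT"),
  stringsAsFactors = FALSE
)

# A row is a duplicate only if both its taxon and its sequence
# match an earlier row; the first occurrence is kept.
is_dup  <- duplicated(seqs[, c("taxon", "sequence")])
deduped <- seqs[!is_dup, , drop = FALSE]

print(deduped$name)
```

Here s2 is dropped because it repeats both the taxon and the bases of s1, while s4 is kept because its taxon differs.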

discardby

How sequences are discarded from groups larger than max_group_size. Options are "length" (the default), which discards sequences from shortest to longest until the group no longer exceeds max_group_size, and "random", which discards randomly chosen sequences until the group no longer exceeds max_group_size.
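
The two discard strategies can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper names prune_by_length and prune_at_random, and the toy data frame with name, taxon and length columns, are hypothetical.

```r
# Hypothetical toy data: sequence names, taxon labels, and lengths.
seqs <- data.frame(
  name   = c("s1", "s2", "s3", "s4", "s5"),
  taxon  = c("A", "A", "A", "B", "B"),
  length = c(650, 300, 500, 640, 200),
  stringsAsFactors = FALSE
)
max_group_size <- 2

# "length"-style pruning: within each taxon keep the longest
# sequences, which is equivalent to discarding the shortest first.
prune_by_length <- function(df, max_group_size) {
  kept <- lapply(split(df, df$taxon), function(g) {
    g <- g[order(g$length, decreasing = TRUE), , drop = FALSE]
    head(g, max_group_size)
  })
  out <- do.call(rbind, kept)
  rownames(out) <- NULL
  out
}

# "random"-style pruning: within each taxon keep a random subset
# of at most max_group_size sequences.
prune_at_random <- function(df, max_group_size) {
  kept <- lapply(split(df, df$taxon), function(g) {
    keep_idx <- sample.int(nrow(g), min(nrow(g), max_group_size))
    g[sort(keep_idx), , drop = FALSE]
  })
  out <- do.call(rbind, kept)
  rownames(out) <- NULL
  out
}

by_length <- prune_by_length(seqs, max_group_size)
set.seed(1)  # only so the random example is repeatable
at_random <- prune_at_random(seqs, max_group_size)

print(by_length$name)
print(table(at_random$taxon))
```

In the length-based result, taxon A loses its shortest member (s2), and taxon B is untouched because it already fits within max_group_size.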

prefer

A vector of sequence names to retain preferentially. When discardby = "random", these sequences are preferred during subsampling; when discardby = "length", they are preferred when breaking ties between sequences of the same length. This is useful, for instance, for high-quality in-house sequences.
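
The tie-breaking role of prefer under length-based pruning can be sketched in base R. This is a hedged illustration, not the package's implementation: the sequence names and the data frame layout are hypothetical.

```r
# Hypothetical toy group: three sequences tie at the maximum length.
seqs <- data.frame(
  name   = c("ref1", "inhouse1", "ref2", "ref3"),
  taxon  = "A",
  length = c(650, 650, 650, 400),
  stringsAsFactors = FALSE
)
prefer <- c("inhouse1")
max_group_size <- 2

preferred <- seqs$name %in% prefer

# Sort longest first; among sequences of equal length, preferred
# names sort ahead of the rest (FALSE orders before TRUE).
ord  <- order(-seqs$length, !preferred)
kept <- head(seqs[ord, , drop = FALSE], max_group_size)

print(kept$name)
```

Without prefer, the in-house sequence would compete on equal terms with the other 650-base sequences; with it, the in-house sequence is retained first and the remaining slot goes to one of the tied references.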

quiet

Whether to suppress progress messages printed to the console.


alexpiper/taxreturn documentation built on Sept. 14, 2024, 7:56 p.m.