dbMaxGIPerSpec: Limit Numbers of Sequences per Species

View source: R/dbMaxGIPerSpec.R

Description

Some species (e.g. model organisms) have thousands or more sequences of a single locus on GenBank. Much of this genetic information is redundant in a phylogenetic context, so megaptera internally limits the number of sequences per species to 10. The user can change this limit with megapteraPars or dbMaxGIPerSpec; the latter allows more fine-tuning (see Details).

Usage

dbMaxGIPerSpec(megProj, max.gi.per.spec, prefer = "longest", taxon)

Arguments

megProj

An object of class megapteraProj.

max.gi.per.spec

An integer giving the maximum number of sequences per species to be used by the pipeline.

prefer

A character string indicating the criterion by which the sequences will be chosen; available are "longest" (default), "shortest", "most frequent length" and "random".

taxon

A character string giving one or more taxon names for which the number of sequences will be limited. If left missing, all taxon names found in the database are handled.
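
The four values of prefer can be illustrated with a small, self-contained sketch. It is not the package's implementation: select_by_length is a hypothetical helper that works on a plain vector of sequence lengths for one species, and the tie-breaking shown here (original order) is an assumption.

```r
# Hypothetical helper, NOT part of megaptera: return the indices of at most
# `max.n` sequences, chosen from a vector of sequence lengths `len`
# according to `prefer`.
select_by_length <- function(len, max.n, prefer = "longest") {
  prefer <- match.arg(prefer, c("longest", "shortest",
                                "most frequent length", "random"))
  n <- min(max.n, length(len))
  idx <- switch(prefer,
    "longest" = order(len, decreasing = TRUE),
    "shortest" = order(len),
    "most frequent length" = {
      freq <- table(len)
      # rank every sequence by how often its length occurs
      order(-as.integer(freq[as.character(len)]))
    },
    "random" = sample(seq_along(len))
  )
  head(idx, n)
}

# Lengths (in bp) of six made-up sequences of one species
len <- c(650L, 658L, 658L, 402L, 658L, 1200L)

len[select_by_length(len, 2, "longest")]
len[select_by_length(len, 2, "shortest")]
len[select_by_length(len, 3, "most frequent length")]
```

With max.n larger than the number of available sequences, the helper simply returns all of them, which mirrors the idea that the limit is an upper bound only.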

Details

After calling dbMaxGIPerSpec, the pipeline must be rerun from stepC onwards.

To do: describe parameters!

See Also

megapteraPars


heibl/megaptera documentation built on Jan. 17, 2021, 3:34 a.m.