megapteraPars: Create an Object of Class "megapteraPars"

Description

S4 Class for parameters of a megaptera project pipeline, as stored in megapteraProj.

Usage

megapteraPars(...)

Arguments

...

Arguments in tag = value form. The tags must come from the names of the parameters described in the ‘Pipeline Parameters’ section.
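As a minimal sketch of the tag = value convention, the following base R helper checks a set of tags against the parameter names listed in the ‘Pipeline Parameters’ section. The helper check_par_tags is hypothetical and not part of megaptera; the package's own validation may work differently.

```r
# Parameter names as listed in the 'Pipeline Parameters' section of this page.
known_pars <- c("data.path", "gb.seq.download", "debug.level", "parallel",
                "cpus", "cluster.type", "update.seqs", "retmax",
                "max.gi.per.spec", "max.bp", "reference.max.dist",
                "min.seqs.reference", "fract.miss", "block.max.dist",
                "min.n.seq", "max.mad", "gb1", "gb2", "gb3", "gb4", "gb5")

# Hypothetical helper (not part of megaptera): returns the tags that are
# not known parameter names; unnamed arguments are reported as "".
check_par_tags <- function(...) {
  args <- list(...)
  tags <- names(args)
  if (is.null(tags)) tags <- rep("", length(args))
  tags[!(tags %in% known_pars)]
}

check_par_tags(debug.level = 3, retmax = 200)    # no unknown tags
check_par_tags(debug.level = 3, verbose = TRUE)  # flags "verbose"
```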

Details

The pipeline's verbosity can be fine-tuned with debug.level:

0 No progress or diagnostic messages
1 Messages on screen
2 Messages logged to file
3 Messages on screen and logged to file
4 Same as 3; in addition, the current data is saved as an .rda object in case of a foreseeable error
5 Same as 4; in addition, the current data is always saved
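The levels above can be read as combinations of four behaviours. The following sketch encodes that reading in base R. The helper debug_behaviour and its field names are hypothetical, chosen for illustration only, and do not reflect how megaptera implements logging internally.

```r
# Hypothetical helper (not part of megaptera): translate a debug.level
# between 0 and 5 into the behaviours listed in the table above.
debug_behaviour <- function(level) {
  stopifnot(is.numeric(level), length(level) == 1, level %in% 0:5)
  list(
    screen        = level %in% c(1, 3, 4, 5),  # messages printed on screen
    logfile       = level >= 2,                # messages logged to file
    save_on_error = level >= 4,                # data saved on a foreseeable error
    save_always   = level == 5                 # data always saved
  )
}

str(debug_behaviour(3))
```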

Pipeline Parameters

data.path

A character string giving the path to the directory where all data and results will be stored (see megapteraInit).

gb.seq.download

A character string defining how sequences should be downloaded from GenBank Nucleotide; can be "eutils" or "ftp".

debug.level

Numeric, a number between 0 and 5, determining the pipeline's verbosity (see Details). (default: 1)

parallel

Logical: if TRUE, several steps in the pipeline will be run in parallel, otherwise all steps are serial.

cpus

Numeric, the number of CPUs to use when parallel is TRUE.

cluster.type

A character string giving the type of cluster to use when parallel is TRUE.

update.seqs

Currently unused.

retmax

Numeric, giving the batch size when downloading sequences from the Entrez History server (default: 500).
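To illustrate what a batch size means in practice, this sketch splits a vector of identifiers into chunks of at most retmax elements. The helper batch_ids and the placeholder identifiers are assumptions for illustration; the actual download code in megaptera is not shown on this page.

```r
# Hypothetical helper (not part of megaptera): split a vector of sequence
# identifiers into batches of at most 'retmax' elements, preserving order.
batch_ids <- function(ids, retmax = 500) {
  stopifnot(retmax >= 1)
  unname(split(ids, ceiling(seq_along(ids) / retmax)))
}

ids <- seq_len(1200)       # placeholder identifiers
batches <- batch_ids(ids)  # default batch size of 500
lengths(batches)           # 500 500 200
```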

max.gi.per.spec

Numeric, giving the maximum number of sequences that will be used per species. Can be used to prevent model organisms (e.g., rice, Drosophila, ...) from cluttering up the pipeline with thousands of sequences (default: 1000).
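A per-species cap of this kind might look as follows in base R. This is only a sketch: the helper cap_per_species, the column names, and the choice to keep the first sequences encountered are all assumptions, since this page does not say which sequences the pipeline retains.

```r
# Hypothetical helper (not part of megaptera): keep at most
# 'max.gi.per.spec' rows per species. 'seqs' is assumed to be a data frame
# with a 'species' column; the first rows of each species are kept.
cap_per_species <- function(seqs, max.gi.per.spec = 1000) {
  # Running index of each row within its species
  rank_in_species <- ave(seq_len(nrow(seqs)), seqs$species, FUN = seq_along)
  seqs[rank_in_species <= max.gi.per.spec, , drop = FALSE]
}

seqs <- data.frame(
  species = c("Oryza sativa", "Oryza sativa", "Oryza sativa", "Zea mays"),
  acc     = c("a1", "a2", "a3", "b1")  # placeholder accession labels
)
cap_per_species(seqs, max.gi.per.spec = 2)  # drops the third rice sequence
```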

max.bp

Numeric, the maximum length of DNA sequences in base pairs to be included in the alignment. The upper limit depends on the alignment program and the specific alignment and can only be found by trial and error (default: 5000).

reference.max.dist

Currently unused.

min.seqs.reference

Currently unused.

fract.miss

Numeric, ranging between 0 and 1. To avoid long stretches covered by only a few sequences at the beginning and end of an alignment block, a minimum required number of sequences can be set as a fraction of the total number of sequences in this alignment block. Has been superseded by the gb.* parameters.
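The idea can be sketched in base R as trimming leading and trailing alignment columns that fall below the required coverage. The helper trim_ends, the character-matrix representation with "-" as the gap symbol, and the rounding with ceiling() are assumptions made for illustration, not megaptera's implementation.

```r
# Hypothetical helper (not part of megaptera): trim leading and trailing
# columns in which fewer than fract.miss * (number of sequences) sequences
# have a nucleotide. 'aln' is assumed to be a character matrix
# (rows = sequences) with "-" as the gap symbol.
trim_ends <- function(aln, fract.miss) {
  min_seqs <- ceiling(fract.miss * nrow(aln))   # rounding is an assumption
  covered  <- colSums(aln != "-") >= min_seqs
  if (!any(covered)) return(aln[, 0, drop = FALSE])
  keep <- seq(min(which(covered)), max(which(covered)))
  aln[, keep, drop = FALSE]
}

aln <- rbind(
  s1 = c("A", "C", "G", "T", "A", "-"),
  s2 = c("-", "C", "G", "T", "A", "-"),
  s3 = c("-", "-", "G", "T", "-", "-"),
  s4 = c("-", "C", "G", "T", "A", "C")
)
trim_ends(aln, fract.miss = 0.5)  # keeps columns 2 to 5
```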

block.max.dist

Numeric, ranging between 0 and 1. block.max.dist gives the maximum genetic distance (measured as the fraction of divergent nucleotide positions) allowed in a sequence alignment block. The alignment of an individual marker is iteratively broken into smaller blocks until this condition is met.
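The distance measure described here, the fraction of divergent nucleotide positions, can be sketched in base R as follows. The helpers p_dist and max_pairwise_dist are hypothetical, and ignoring positions with gaps is an assumption; how megaptera treats gaps and ambiguous bases is not stated on this page.

```r
# Hypothetical helper (not part of megaptera): fraction of divergent
# positions between two aligned sequences, ignoring positions where
# either sequence has a gap ("-").
p_dist <- function(a, b) {
  stopifnot(length(a) == length(b))
  comparable <- a != "-" & b != "-"
  if (!any(comparable)) return(NA_real_)
  mean(a[comparable] != b[comparable])
}

# Largest pairwise distance in a character-matrix alignment
# (rows = sequences, at least two rows).
max_pairwise_dist <- function(aln) {
  pairs <- combn(nrow(aln), 2)
  d <- apply(pairs, 2, function(p) p_dist(aln[p[1], ], aln[p[2], ]))
  max(d, na.rm = TRUE)
}

aln <- rbind(
  s1 = c("A", "C", "G", "T"),
  s2 = c("A", "C", "G", "A"),
  s3 = c("A", "T", "C", "A")
)
max_pairwise_dist(aln)        # 0.75 (s1 versus s3)
max_pairwise_dist(aln) > 0.5  # TRUE: exceeds a block.max.dist of 0.5
```

Under this reading, a block whose largest pairwise distance exceeds block.max.dist would be a candidate for further splitting.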

min.n.seq

Numeric, the minimum number of sequences required for an alignment block. Alignment blocks with fewer than min.n.seq sequences are dropped from the output.

max.mad

Numeric, giving the threshold value for the assessment of saturation: alignments with a median average distance (MAD) of max.mad or greater will be broken into blocks. The default value was estimated by simulation by Smith et al. (2009).

gb1

Parameters for masking of alignment blocks with gblocks.

gb2

Parameters for masking of alignment blocks with gblocks.

gb3

Parameters for masking of alignment blocks with gblocks.

gb4

Parameters for masking of alignment blocks with gblocks.

gb5

Parameters for masking of alignment blocks with gblocks.

References

Smith, S.A., J.M. Beaulieu, and M.J. Donoghue. 2009. Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches. BMC Evolutionary Biology 9:37.

See Also

megapteraProj for creating a megaptera project.


heibl/megaptera documentation built on Jan. 17, 2021, 3:34 a.m.