guidance: GUIDetree-based AligNment ConfidencE

View source: R/guidance.R

Description

Assesses the reliability of a multiple sequence alignment (MSA) using GUIDANCE (Penn et al. 2010).

Usage

guidance(sequences, msa.program = "mafft", exec, bootstrap = 100,
  col.cutoff = "auto", seq.cutoff = "auto", mask.cutoff = "auto",
  parallel = FALSE, ncore = "auto", method = "auto", alt.msas.file)

Arguments

sequences

An object of class DNAbin or AAbin containing unaligned sequences of DNA or amino acids.

msa.program

A character string giving the name of the MSA program, currently one of c("mafft", "muscle", "clustalo", "clustalw2"); MAFFT is the default.

exec

A character string giving the path to the executable of the alignment program.

bootstrap

An integer giving the number of perturbed MSAs.

col.cutoff

numeric between 0 and 1; specifies a cutoff to remove unreliable columns below the cutoff; either user supplied or "auto" (0.73)

seq.cutoff

numeric between 0 and 1; specifies a cutoff to remove unreliable sequences below the cutoff; either user supplied or "auto" (0.5)

mask.cutoff

residues with a score below the cutoff are masked ('N' for DNA, 'X' for AA); either user supplied or "auto" (0.5)

parallel

logical; if TRUE, the computation runs in parallel and the number of cores is set with ncore

ncore

number of cores (default is the maximum number of local threads)

method

further arguments passed to mafft, default is "auto"
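To illustrate what the three cutoff arguments do, here is a minimal sketch on toy data. It is not the code used inside guidance(): the alignment, the score objects (col_score, seq_score, res_score), and the order of the steps are all assumptions made for illustration, as is treating "below the cutoff" as a strict comparison.

```r
# Hypothetical inputs, not produced by guidance(): a toy DNA alignment
# stored as a character matrix, plus made-up reliability scores.
msa <- rbind(
  s1 = c("A", "C", "G", "T"),
  s2 = c("A", "C", "G", "T"),
  s3 = c("T", "C", "A", "T")
)
col_score <- c(0.95, 0.60, 0.80, 0.99)        # one score per column
seq_score <- c(s1 = 0.9, s2 = 0.8, s3 = 0.4)  # one score per sequence
res_score <- matrix(0.9, nrow = 3, ncol = 4, dimnames = dimnames(msa))
res_score["s2", 3] <- 0.2                     # one unreliable residue

# The "auto" values listed in the argument descriptions
col.cutoff  <- 0.73
seq.cutoff  <- 0.5
mask.cutoff <- 0.5

# Mask single residues scoring below mask.cutoff ('N' because this is DNA)
masked <- msa
masked[res_score < mask.cutoff] <- "N"

# Keep only the sequences and columns that reach their cutoffs
filtered <- masked[seq_score >= seq.cutoff,
                   col_score >= col.cutoff,
                   drop = FALSE]
print(filtered)
```

In this toy case sequence s3 and the second column are dropped, and the low-scoring residue of s2 is replaced by 'N'.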

Details

Calculates column confidence (and other scores) by comparing alternative MSAs generated from alternative guide trees, which are derived from bootstrap MSAs (Felsenstein 1985). The basic comparison between the bootstrap MSAs and the reference (base) MSA checks whether the residue pairs of a column are aligned identically in all alternative MSAs and in the base MSA (see compareMSAs).
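The residue-pair comparison described above can be sketched in base R. This is a simplified illustration, not the implementation of compareMSAs: the helper names (residue_columns, column_score_sketch) are hypothetical, MSAs are assumed to be character matrices with "-" for gaps, and the score is simply the fraction of residue pairs in a base column that stay aligned across the alternative MSAs.

```r
# For each sequence, return the alignment columns of its residues:
# element k is the column that holds the k-th residue.
residue_columns <- function(msa) {
  lapply(seq_len(nrow(msa)), function(i) which(msa[i, ] != "-"))
}

# Per base column: fraction of residue pairs aligned identically
# in the alternative MSAs (hypothetical helper, for illustration only).
column_score_sketch <- function(base, alternatives) {
  alt_pos <- lapply(alternatives, residue_columns)
  # Residue index (1, 2, ...) of every cell in the base MSA, NA for gaps
  res_idx <- t(apply(base != "-", 1, function(x) ifelse(x, cumsum(x), NA)))
  pairs <- combn(nrow(base), 2)
  sapply(seq_len(ncol(base)), function(col) {
    hits <- logical(0)
    for (p in seq_len(ncol(pairs))) {
      i <- pairs[1, p]
      j <- pairs[2, p]
      ri <- res_idx[i, col]
      rj <- res_idx[j, col]
      if (is.na(ri) || is.na(rj)) next  # gap: no residue pair in this column
      for (alt in alt_pos) {
        # TRUE if both residues share a column in the alternative MSA
        hits <- c(hits, alt[[i]][ri] == alt[[j]][rj])
      }
    }
    if (length(hits) == 0) NA else mean(hits)
  })
}

# Toy data: a base MSA and two alternatives (the second shifts one residue)
base <- rbind(c("A", "C", "G", "T"),
              c("A", "-", "G", "T"),
              c("A", "C", "-", "T"))
alt2 <- base
alt2[2, ] <- c("A", "G", "-", "T")
scores <- column_score_sketch(base, list(base, alt2))
print(scores)
```

Only the third column scores below 1 here, because the shifted residue in the second alternative breaks the single residue pair of that column.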

Value

A list containing the following scores and alignments:

mean_scores residue pair score and mean column score

column_score

residue_column_score GUIDANCE score

residue_pair_residue_score

residual_pair_sequence_pair_score

residual_pair_sequence_score

residue_pair_score

base_msa

guidance_msa the base_msa with unreliable residues, columns, and sequences below the cutoffs removed

Author(s)

Franz-Sebastian Krah

Christoph Heibl

References

Felsenstein J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791

Penn et al. (2010). An alignment confidence score capturing robustness to guide tree uncertainty. Molecular Biology and Evolution 27:1759–1767

G. Landan and D. Graur (2008). Local reliability measures from sets of co-optimal multiple sequence alignments. Pacific Symposium on Biocomputing 13:15–24

See Also

compareMSAs, guidance2, HoT

Examples

## Not run: 
# First run GUIDANCE on the example data using MAFFT.
# The exec paths below are machine specific; adjust them to your installation.
file <- system.file("extdata", "BB50009.fasta", package = "rpg")
aa_seq <- read.fas(file, type = "AA")
g_msa <- guidance(sequences = aa_seq,
                  msa.program = "mafft",
                  exec = "/usr/local/bin/mafft",
                  bootstrap = 100,
                  parallel = FALSE,
                  method = "retree 1")
h.p <- confidence.heatmap(g_msa, title = "GUIDANCE BB50009 benchmark",
                          legend = TRUE, guidance_score = TRUE)
h.p

# Run it again with MUSCLE
g_msa_m <- guidance(sequences = aa_seq,
                    msa.program = "muscle",
                    exec = "/Applications/muscle",
                    bootstrap = 100,
                    parallel = FALSE,
                    method = "retree 1")
h.p <- confidence.heatmap(g_msa_m, title = "GUIDANCE BB50009 benchmark",
                          legend = TRUE, guidance_score = TRUE)
h.p

## Plot both for comparison
h.p.mafft <- confidence.heatmap(g_msa, title = "MAFFT",
                                legend = FALSE, guidance_score = FALSE)
h.p.muscle <- confidence.heatmap(g_msa_m, title = "MUSCLE",
                                 legend = FALSE, guidance_score = FALSE)
library(cowplot)
plot_grid(h.p.mafft, h.p.muscle, ncol = 1, nrow = 2)

## End(Not run)

heibl/rpg documentation built on May 17, 2019, 3:23 p.m.