Multiple Sequence Alignment with ClustalOmega

Description

This function calls the multiple sequence alignment algorithm ClustalOmega.

Usage

    msaClustalOmega(inputSeqs, cluster="default",
                    gapOpening="default", gapExtension="default",
                    maxiters="default", substitutionMatrix="default",
                    type="default", order=c("aligned", "input"),
                    verbose=FALSE, help=FALSE, ...)

Arguments

inputSeqs

input sequences; see msa. In the original ClustalOmega implementation, this parameter is called infile.

cluster

cluster size to be used; the default is 100. In the original ClustalOmega implementation, this parameter is called cluster-size.

gapOpening,gapExtension

ClustalOmega currently does not allow gap penalties to be adjusted; these arguments exist only for future extensions and for consistency with msa and the other algorithms. Setting them to values other than "default" results in a warning.

maxiters

maximum number of iterations; the default value is 0 (no limitation). In the original ClustalOmega implementation, this parameter is called iterations.

substitutionMatrix

name of the substitution matrix for scoring matches and mismatches; can be one of "BLOSUM30", "BLOSUM40", "BLOSUM50", "BLOSUM65", "BLOSUM80", and "Gonnet". This parameter is a new feature: the original ClustalOmega implementation does not allow for using a custom substitution matrix.

type

type of the input sequences inputSeqs; see msa.

order

how the sequences should be ordered in the output object (see msa); in the original ClustalOmega implementation, this parameter is called output-order.

verbose

if TRUE, the algorithm displays detailed information and progress messages.

help

if TRUE, information about algorithm-specific parameters is displayed. In this case, no multiple sequence alignment is performed and the function quits after displaying the additional help information.

...

further parameters specific to ClustalOmega; an overview of the parameters available in this interface is shown when calling msaClustalOmega with help=TRUE. For more details, see also the documentation of ClustalOmega.

Details

This function provides the ClustalOmega multiple alignment algorithm as an R function. It can be used for various types of sequence data (see the inputSeqs argument above). Parameters that are common to all multiple sequence alignment algorithms provided by the msa package are explicitly provided by the function and named in the same way for all algorithms. Most other parameters that are specific to ClustalOmega can be passed to ClustalOmega via additional arguments (see argument help above).
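To find out which ClustalOmega-specific parameters can be passed via additional arguments, the function can be called with help=TRUE. The following is a minimal sketch that assumes the msa package is installed and reuses the example FASTA file shipped with it; the variable names mySeqs and res are chosen for illustration only.

```r
library(msa)

## example protein sequences shipped with the msa package
filepath <- system.file("examples", "exampleAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## with help=TRUE, no alignment is computed; the function only displays
## the algorithm-specific parameters and returns an invisible NULL
res <- msaClustalOmega(mySeqs, help=TRUE)
is.null(res)
```

The names shown in the help output are the ones to use for the additional arguments.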

Since ClustalOmega only allows for using built-in amino acid substitution matrices, it is hardly useful for multiple alignments of nucleotide sequences.

For a note on the order of output sequences and direct reading from FASTA files, see msa.

Value

Depending on the type of sequences for which it was called, msaClustalOmega returns a MsaAAMultipleAlignment, MsaDNAMultipleAlignment, or MsaRNAMultipleAlignment object. If called with help=TRUE, msaClustalOmega returns an invisible NULL.
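The class of the returned object can be checked directly. The sketch below assumes the msa package is installed and uses the amino acid example file that comes with it, so the result should belong to the amino acid alignment class; the variable names are illustrative.

```r
library(msa)

## example protein sequences shipped with the msa package
filepath <- system.file("examples", "exampleAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## align with default settings and keep the result
aln <- msaClustalOmega(mySeqs)

## inspect the class of the returned alignment object
class(aln)
is(aln, "MsaAAMultipleAlignment")
```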

Author(s)

Enrico Bonatesta and Christoph Horejs-Kainrath <msa@bioinf.jku.at>

References

http://www.bioinf.jku.at/software/msa

U. Bodenhofer, E. Bonatesta, C. Horejs-Kainrath, and S. Hochreiter (2015). msa: an R package for multiple sequence alignment. Bioinformatics 31(24):3997-3999. DOI: 10.1093/bioinformatics/btv494.

http://www.clustal.org/omega/README

Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Soeding, J., Thompson, J. D., and Higgins, D. G. (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7:539. DOI: 10.1038/msb.2011.75.

See Also

msa, MsaAAMultipleAlignment, MsaDNAMultipleAlignment, MsaRNAMultipleAlignment, MsaMetaData

Examples

## load the msa package (this also attaches Biostrings)
library(msa)

## read sequences
filepath <- system.file("examples", "exampleAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## call msaClustalOmega with default values
msaClustalOmega(mySeqs)

## call msaClustalOmega with custom parameters
msaClustalOmega(mySeqs, auto=FALSE, cluster=120, dealign=FALSE,
                useKimura=FALSE, order="input", verbose=FALSE)
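As a further illustration, the substitutionMatrix and order arguments described above can be combined. This is a sketch under the assumption that the matrix name "BLOSUM30" is accepted by the installed version of msa, as listed in the Arguments section; the variable names are illustrative.

```r
library(msa)

## example protein sequences shipped with the msa package
filepath <- system.file("examples", "exampleAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## use one of the documented substitution matrices and keep the
## sequences in the order in which they were supplied
aln <- msaClustalOmega(mySeqs, substitutionMatrix="BLOSUM30",
                       order="input")

## compare the sequence names of the alignment with those of the input
rownames(aln)
names(mySeqs)
```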