Multiple alignment using Infernal

Description

Computing a multiple sequence alignment using the Infernal software.

Usage

cmalign(in.file, out.file, CM.file, threads = 1)

Arguments

in.file

Name of FASTA-file with input sequences.

out.file

Name of file to store the result.

CM.file

Name of file with covariance model.

threads

Number of CPUs to use.

Details

The Infernal software (Nawrocki & Eddy, 2013) must be installed and available on the system. To test this, type system("cmalign -h") in the Console; it should print the cmalign help text. For more details on Infernal, see http://eddylab.org/infernal/.
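The availability check can also be done without running the program. The following is a minimal sketch using base R only; the helper name has_executable is hypothetical and not part of this package.

```r
# Hypothetical helper (not part of microseq): TRUE if an executable
# with the given name is found on the system PATH.
has_executable <- function(exe) {
  path <- Sys.which(exe)                 # "" (or NA) when nothing is found
  unname(!is.na(path) & nzchar(path))
}

if (has_executable("cmalign")) {
  message("cmalign found at: ", Sys.which("cmalign"))
} else {
  message("cmalign not found; install Infernal and make sure it is on the PATH.")
}
```

Sys.which only locates the executable; running system("cmalign -h") remains the more direct test that the program actually starts.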

This function is most typically used to align 16S rRNA sequences.

The cmalign function produces a multiple alignment, as e.g. muscle does, but uses a covariance model to do so. A covariance model here describes how bases at different positions are related over long ranges because of the folding of the sequence. You can therefore only use this function to align sequences for which you have such a covariance model. Such models are available for a number of RNA families, see below.

The argument CM.file is the name of a file with a valid covariance model, e.g. one downloaded from the Rfam database (http://rfam.xfam.org/). See the Examples section for the 16S model supplied with this package.
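Since both the input sequences and the model are passed as file names, it may help to confirm that the files exist before starting an alignment. The sketch below uses base R only; the function check_cmalign_inputs and the file names in the usage lines are hypothetical and not part of this package.

```r
# Hypothetical pre-flight check (not part of microseq): returns a character
# vector of problems, which is empty when both files exist.
check_cmalign_inputs <- function(in.file, CM.file) {
  problems <- character(0)
  if (!file.exists(in.file)) {
    problems <- c(problems, paste("Missing FASTA file:", in.file))
  }
  if (!file.exists(CM.file)) {
    problems <- c(problems, paste("Missing model file:", CM.file))
  }
  problems
}

# Usage with assumed file names; report problems instead of aligning.
problems <- check_cmalign_inputs("my_sequences.fasta", "my_model.cm")
if (length(problems) > 0) message(paste(problems, collapse = "\n"))
```

This only checks that the files are present; it does not verify that the model file is a valid Infernal model.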

Value

The result is written to the file specified in out.file.

Author(s)

Lars Snipen.

References

E.P. Nawrocki and S.R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches, Bioinformatics 29:2933-2935 (2013).

See Also

msaTrim.

Examples

## Not run: 
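# Example files shipped in the extdata folder of the microseq package
in.file <- file.path(path.package("microseq"), "extdata", "16S.fasta")
cm.file <- file.path(path.package("microseq"), "extdata", "ssu_bacteria.cm")
# Align the sequences and write the result to msa_infernal.fasta
cmalign(in.file, "msa_infernal.fasta", cm.file)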

## End(Not run)