This method computes a vector of conservation scores from a multiple alignment or a previously computed consensus matrix.
an object of class
substitution matrix (see details below).
score to use for aligning gaps versus gaps (see details below).
when the method is called for a
The method takes a
MultipleAlignment object or a
previously computed consensus matrix and computes the sum of pairwise
scores for all positions of the alignment. For computing these scores,
it is compulsory to specify a substitution/scoring matrix. This matrix
must be provided as a
matrix object. This can either be
one of the ready-made matrices provided by the Biostrings package
(e.g. BLOSUM62) or any other hand-crafted
matrix. In the latter case, the following restrictions apply:
The matrix must be square.
For reasonable results, the matrix should be symmetric (note that this is not checked).
Rows and columns must be named and the order of letters/symbols in row names and column names must be identical.
All letters/symbols occurring in the multiple alignment, including gaps ‘-’, must also be found in the row/column names of the substitution matrix. For consistency with the matrices from the Biostrings package, the row/column corresponding to gap penalties may also be labeled ‘*’ instead of ‘-’.
So, nucleotide substitution matrices created by
nucleotideSubstitutionMatrix can be used for multiple
alignments of nucleotide sequences, but must be
completed with gap penalty rows and columns (see example below).
If the consensus matrix of a multiple alignment of nucleotide sequences contains rows labeled ‘+’ and/or ‘.’, these rows are ignored.
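The restrictions on hand-crafted matrices listed above can be checked before calling the method. The following is a minimal base R sketch; `checkSubstitutionMatrix` is a hypothetical helper written for illustration and is not part of the msa package. It warns rather than fails on asymmetry, since symmetry is only recommended.

```r
# Hypothetical helper (not part of msa): checks a hand-crafted substitution
# matrix against the restrictions listed above.
checkSubstitutionMatrix <- function(mat, symbols) {
    stopifnot(is.matrix(mat), is.numeric(mat))
    if (nrow(mat) != ncol(mat))
        stop("matrix must be square")
    if (is.null(rownames(mat)) || is.null(colnames(mat)))
        stop("rows and columns must be named")
    if (!identical(rownames(mat), colnames(mat)))
        stop("row names and column names must be identical, in the same order")
    # a row/column labeled '*' is accepted as the gap row/column
    labels <- rownames(mat)
    labels[labels == "*"] <- "-"
    absent <- setdiff(symbols, labels)
    if (length(absent) > 0)
        stop("symbols missing from matrix: ", paste(absent, collapse=", "))
    # symmetry is only recommended, so warn instead of failing
    if (!isSymmetric(unname(mat)))
        warning("matrix is not symmetric")
    invisible(TRUE)
}

# toy DNA matrix (illustrative values): match 4, mismatch -1, gap penalty -8
nuc <- c("A", "C", "G", "T")
mat <- matrix(-1, 4, 4, dimnames=list(nuc, nuc))
diag(mat) <- 4
mat <- cbind(rbind(mat, "-"=-8), "-"=-8)

ok <- checkSubstitutionMatrix(mat, c(nuc, "-"))
```

Passing the matrix without its gap row and column (for example `mat[1:4, 1:4]`) together with a symbol set that contains ‘-’ makes the helper stop with an error naming the missing symbol.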
gapVsGap can be used to control how
pairs of gaps are scored. If
gapVsGap=NULL (default), the
corresponding diagonal entry of the substitution matrix is used as is.
In the BLOSUM matrices, this is usually a positive value, which may
not make sense under all circumstances. Therefore, the parameter
gapVsGap can be set to an alternative value, e.g. 0 for
ignoring gap-gap pairs.
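To make the effect of gapVsGap concrete, the sketch below computes a sum of pairwise scores per column from a consensus (count) matrix in base R. `sumOfPairs` is a hypothetical helper, not the package's implementation: it assumes that every unordered pair of distinct sequences is counted once, that the substitution matrix is symmetric, and that the gap row is labeled ‘-’. The scaling used by msa itself may differ.

```r
# Hypothetical helper (not part of msa): sum of pairwise scores for each
# column of a consensus (count) matrix. Assumes unordered pairs of distinct
# sequences, a symmetric matrix, and a gap row/column labeled "-".
sumOfPairs <- function(conMat, subMat, gapVsGap=NULL) {
    subMat <- subMat[rownames(conMat), rownames(conMat)]
    if (!is.null(gapVsGap))
        subMat["-", "-"] <- gapVsGap   # override the gap-gap diagonal entry
    apply(conMat, 2, function(n) {
        # all ordered pairs (including self-pairs) give t(n) %*% S %*% n;
        # subtract the self-pairs, then halve to count unordered pairs
        (sum(n * (subMat %*% n)) - sum(n * diag(subMat))) / 2
    })
}

# toy substitution matrix (illustrative values): match 4, mismatch -1, gap -8
sym <- c("A", "C", "G", "T", "-")
subMat <- matrix(-1, 5, 5, dimnames=list(sym, sym))
diag(subMat) <- 4
subMat["-", ] <- -8
subMat[, "-"] <- -8

# three sequences, two columns: column 1 is A/A/A, column 2 is A/-/-
conMat <- matrix(c(3, 0, 0, 0, 0,
                   1, 0, 0, 0, 2), nrow=5, dimnames=list(sym, NULL))

asIs    <- sumOfPairs(conMat, subMat)              # gap-gap entry used as is
ignored <- sumOfPairs(conMat, subMat, gapVsGap=0)  # gap-gap pairs score 0
```

In the second column, the single gap-gap pair contributes -8 when the diagonal entry is used as is, so the column score changes from -24 to -16 once gapVsGap is set to 0; the fully conserved first column scores 12 in both cases.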
The method, in any case, returns a vector of scores that is as long
as the alignment/consensus matrix has columns. The names of the vector
entries are the corresponding positions of the consensus sequence of
the alignment. How this consensus sequence is computed can be
controlled with additional arguments that are passed on to the
The function returns a vector as long as the alignment/consensus matrix has columns. The vector is named with the consensus sequence (see details above).
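The naming scheme of the returned vector can be illustrated with plain R. The values and the consensus string below are made up for illustration and are not output from msa.

```r
# toy values, not output from msa
scores <- c(12, -24, 7)
consensus <- "A-G"   # hypothetical consensus sequence, one character per column

# each entry is named with the consensus character at its position
names(scores) <- strsplit(consensus, "")[[1]]

# column indices sorted from most to least conserved
ranked <- order(scores, decreasing=TRUE)
```

Because the same letter usually occurs at several positions of a consensus sequence, the names are generally not unique, so it is safer to index such a vector by position than by name.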
Ulrich Bodenhofer
U. Bodenhofer, E. Bonatesta, C. Horejs-Kainrath, and S. Hochreiter (2015). msa: an R package for multiple sequence alignment. Bioinformatics 31(24):3997-3999. DOI: 10.1093/bioinformatics/btv494.
## load the msa package
library(msa)

## read sequences
filepath <- system.file("examples", "HemoglobinAA.fasta", package="msa")
mySeqs <- readAAStringSet(filepath)

## perform multiple alignment
myAlignment <- msa(mySeqs)

## compute consensus scores using the BLOSUM62 matrix
data(BLOSUM62)
msaConservationScore(myAlignment, BLOSUM62)

## compute consensus scores using the BLOSUM62 matrix
## without scoring gap-gap pairs and using a different consensus sequence
msaConservationScore(myAlignment, BLOSUM62, gapVsGap=0,
                     type="upperlower")

## compute a consensus matrix first
conMat <- consensusMatrix(myAlignment)
data(PAM250)
msaConservationScore(conMat, PAM250, gapVsGap=0)

## DNA example
filepath <- system.file("examples", "exampleDNA.fasta", package="msa")
mySeqs <- readDNAStringSet(filepath)

## perform multiple alignment
myAlignment <- msa(mySeqs)

## create substitution matrix with gap penalty -8
mat <- nucleotideSubstitutionMatrix(4, -1)
mat <- cbind(rbind(mat, "-"=-8), "-"=-8)

## compute consensus scores using this matrix
msaConservationScore(myAlignment, mat, gapVsGap=0)