Compute the Sequence Entropy for an Alignment

Description

Computes the Shannon entropy of each column of an alignment. Characters to be omitted from the computation, such as gap symbols, can be specified. The joint entropy of column pairs is computed using get.entropy2p().

Usage

get.entropy(aln, bool = FALSE, gapchar = "NOGAPCHAR",
            verbose = FALSE)

get.entropy2p(aln, bool = FALSE, gapchar = "NOGAPCHAR",
              verbose = FALSE)

Arguments

aln

alignment matrix

bool

logical; if TRUE, gaps (the characters given in gapchar) are ignored when computing the entropy of each column of the alignment

gapchar

character vector containing the unique set of characters representing gaps in the amino acid sequence

verbose

logical, TRUE for getting output messages

Details

The Shannon (1948) entropy of each alignment column X is computed as follows:

H(X) = -sum_x p(x) log_2(p(x))

The joint entropy is computed for every possible column pair:

H(X,Y) = -sum_(x,y) p(x,y) log_2(p(x,y))

where X and Y are two columns of the alignment.
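
The two formulas above can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helpers column_entropy() and joint_entropy() are hypothetical names, p(x) is assumed to be the relative frequency of character x in a column, and the way gap characters are dropped here is an assumption that may differ from what get.entropy() does internally.

```r
# Hypothetical helper (not part of the package): Shannon entropy in bits
# of a single alignment column, optionally dropping gap characters.
column_entropy <- function(column, gapchar = character(0)) {
  column <- column[!(column %in% gapchar)]   # assumed gap handling
  if (length(column) == 0) return(0)
  p <- table(column) / length(column)        # relative frequency of each character
  -sum(p * log2(p))
}

# Hypothetical helper: joint entropy in bits of two columns x and y,
# based on the relative frequency of each observed character pair.
joint_entropy <- function(x, y) {
  p <- table(paste(x, y, sep = "\t")) / length(x)
  -sum(p * log2(p))
}

aln <- matrix(c("M", "H", "X", "P", "V",
                "-", "H", "X", "L", "V",
                "M", "L", "X", "P", "V"), 3, byrow = TRUE)

h   <- apply(aln, 2, column_entropy)          # one entropy value per column
h12 <- joint_entropy(aln[, 1], aln[, 2])      # joint entropy of columns 1 and 2
```

A fully conserved column such as the third one ("X", "X", "X") has entropy 0 under this definition, while columns 1 and 2 together contain three distinct character pairs, so their joint entropy in this sketch is log_2(3).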

Value

Return value for get.entropy() is a vector containing the entropy for each column.
Return value for get.entropy2p() is a matrix containing the joint entropies in the lower triangle.
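
Because the joint entropies are stored in the lower triangle, the value for a column pair (i, j) with i > j would be read at position [i, j]. The sketch below shows one way to list the lower-triangle entries as (column pair, value) rows using base R. The matrix h2p is a stand-in with placeholder values, not output of get.entropy2p(), and whether the package also fills the diagonal is not stated here.

```r
# Stand-in for a 4-column result; the values are placeholders, not real entropies.
h2p <- matrix(0, 4, 4)
h2p[lower.tri(h2p)] <- seq_len(6)

# Row/column indices of the lower triangle (i > j), in column-major order.
idx <- which(lower.tri(h2p), arr.ind = TRUE)

# One row per column pair, with the value stored at h2p[i, j].
pairs <- data.frame(col_i = idx[, "row"],
                    col_j = idx[, "col"],
                    H     = h2p[lower.tri(h2p)])
```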

Author(s)

Franziska Hoffgaard

References

Shannon (1948) The Bell System Technical Journal 27, 379–423.

See Also

get.mie

Examples

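# Toy alignment: 3 sequences (rows) by 5 columns
aln <- matrix(c("M", "H", "X", "P", "V",
                "-", "H", "X", "L", "V",
                "M", "L", "X", "P", "V"), 3, byrow = TRUE)

# Column entropies, ignoring the gap character "-"
h1 <- get.entropy(aln, bool = TRUE, gapchar = "-")
# Column entropies with the default settings
h2 <- get.entropy(aln)

# Joint entropies for all column pairs
h3 <- get.entropy2p(aln)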