EntropyMSA: An entropy correction function

Description Usage Arguments Examples

View source: R/EntropyMSA.R

Description

This function estimates entropy in an infite size population from a sample.

Usage

1
2
3
EntropyMSA(MSA, gap = 1, n = 1, no.gap.firstseq = FALSE,
  entropy.correct = FALSE, totalCharacters = 21, write.align = FALSE,
  randomise = FALSE)

Arguments

MSA

The input is a multiple sequence alignment in FASTA format.

gap

This is for gap removal in an alignment column. The default accepts 100 percentage of gaps. Users can change the numeric value provided to this parameter from 0 to 1 to change the gaps that can be tolerated per alignment column. 0 for 0 percentage gaps, 1 for 100 percentage gaps.

n

By default, n = 1, which means only a single character per alignment column per sequence. For example, users may change to n = 9 for nonamer scoring instead of monomer.

no.gap.firstseq

This allows the user to set gap removal based on a reference sequence that must be placed as the first sequence in the multiple alignment file.

entropy.correct

By default, there is no entropy correction. If users wish to perform a correction, they should set entropy.correct = TRUE.

totalCharacters

This depends on whether the standard 20 amino acids plus a gap scoring is used or not. Users may set this to 20 if they wish to ignore gap character.

write.align

After filtering for gaps either with no.gap.firstseq and gap parameters, the actual multiple alignment used differ. Users may choose to output the final alignment to find out which alignment columns correspond to the entropy values.

randomise

This parameter allows users to try other corrected entropy values. By default, set.seed is used in entropy correction and hence, same values are produced, which made it easier for the empiral study to determine the effects of correction in various sample and population sizes.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#for monomer and a comparison of corrected versus uncorrected
file1 <- system.file("extdata","exampleMSA1.fasta",package="EntropyCorrect")
H.uncor <- EntropyMSA(file1, no.gap.firstseq = TRUE)
H.cor1 <- EntropyMSA(file1, no.gap.firstseq = TRUE, entropy.correct = TRUE)
H.cor2 <- EntropyMSA(file1, no.gap.firstseq = TRUE, entropy.correct = TRUE, randomise = TRUE)
layout(matrix(c(1,2),2,1,byrow = TRUE), widths = 7, 
heights = c(8, 2), respect = FALSE)
par(mar = c(0, 4, 3, 2))
barplot(H.uncor,border="white", ylab = "Entropy",space=0,yaxt="n",ylim = c(0,4))
axis(side=2, at=c(0,0.5, 1,1.5,2,2.5,3.0,3.5,4))
axis(side=3, at=(seq(0,length(H.uncor),by=5)-0.5),labels=seq(0,length(H.uncor),by=5))
lines(1:length(H.cor1),H.cor1,col="red")
#lines(1:length(H.cor1),H.cor1,col="black",lty=5) #alternative presentation
#lines(1:length(H.cor2),H.cor2,col="blue") #alternative line
box()
mtext("Alignment position", side=3, line=2)
par(mar = c(2, 4, 0, 2))
plot(c(0, length(H.cor1)), c(0, 100), type= "n", xlab = "", 
ylab = "",yaxt="n",xaxt="n",bty="n")
rect(22, 30, 101, 80,density=10)
rect(106, 30, 230, 80,density=20, angle=135)
lines(c(0,22),c(55,55),lwd=2)
lines(c(101,106),c(55,55),lwd=2)
lines(c(230,243),c(55,55),lwd=2)
text(65, 10, "N terminal")
text(165, 10, "C terminal")
#for nonamer and a comparison of corrected versus uncorrected
H9 <- EntropyMSA(file1, n = 9)
H9.cor <- EntropyMSA(file1, n = 9, entropy.correct = TRUE)
#for nonamer, a simple example to understand entropy calculation
file2 <- system.file("extdata","test.fasta",package="EntropyCorrect")
H9.new <- EntropyMSA(file2,n=9)

lloydlow/EntropyCorrect documentation built on May 21, 2019, 7:34 a.m.