These functions calculate Shannon entropy and related quantities, including diversity, specificity, and specialization. They can be used to quantify gene expression profiles.

```
entropy(vector)
entropyDiversity(mat, norm=FALSE)
entropySpecificity(mat, norm=FALSE)
sampleSpecialization(mat, norm=TRUE)
```

`vector`
A vector of numbers or characters. The discrete probability of each item is calculated and the Shannon entropy is returned.

`mat`
A matrix (usually an expression matrix), with genes (features) in rows and samples in columns.

`norm`
Logical value. If set to

Shannon entropy can be used as a measure of gene expression specificity, as well as of tissue diversity and specialization. See the reference below.

We use `2` as the base for the entropy calculation, because in this base the unit of entropy is the *bit*.
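To make the base-2 convention concrete, here is a minimal base-R sketch of Shannon entropy in bits. It assumes that the probability of each item is its relative frequency in the vector. The helper name `shannonEntropyBits` is hypothetical and is not part of the package; the package's own `entropy` may handle edge cases such as missing values differently.

```r
# Hypothetical helper for illustration only; not the package's implementation.
shannonEntropyBits <- function(x) {
  # Relative frequency of each distinct item in x
  p <- as.numeric(table(x)) / length(x)
  # The base-2 logarithm expresses the entropy in bits
  -sum(p * log2(p))
}

shannonEntropyBits(1:9)                    # nine equally likely items: log2(9)
shannonEntropyBits(rep(1, 9))              # a single distinct item: 0
shannonEntropyBits(c("a", "a", "b", "b"))  # two equally likely items: 1 bit
```

With a different logarithm base the values would only be rescaled by a constant factor; base 2 is what makes the unit a bit.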

`entropy` returns one entropy value. `entropyDiversity` and `sampleSpecialization` return a vector as long as the number of columns of the input matrix. `entropySpecificity` returns a vector as long as the number of rows of the input matrix, namely the specificity scores of the genes.
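The return shapes described above can be illustrated with a base-R sketch of the three matrix measures. It assumes the unnormalized definitions from the reference cited below (Martinez and Reyes-Valdes, 2008): diversity is the entropy of the relative gene frequencies within a sample, specificity is the average information a gene carries about sample identity, and specialization is the frequency-weighted mean specificity within a sample. All function names here are hypothetical, and the package's implementation, in particular the effect of `norm`, may differ.

```r
# Hypothetical helpers illustrating the definitions; not the package's code.

# x * log2(x), using the convention 0 * log2(0) = 0
xlog2x <- function(x) ifelse(x > 0, x * log2(x), 0)

# p_ij: relative frequency of gene i within sample j (columns sum to 1)
relFreq <- function(mat) sweep(mat, 2, colSums(mat), "/")

# Diversity: one value per sample (column)
diversitySketch <- function(mat) -colSums(xlog2x(relFreq(mat)))

# Specificity: one value per gene (row)
specificitySketch <- function(mat) {
  p <- relFreq(mat)
  ratio <- p / rowMeans(p)  # p_ij / p_i, where p_i is the mean frequency of gene i
  rowMeans(xlog2x(ratio))
}

# Specialization: one value per sample (column)
specializationSketch <- function(mat) {
  colSums(relFreq(mat) * specificitySketch(mat))
}

myMat <- rbind(c(3, 4, 5), c(6, 6, 6), c(0, 2, 4))
length(diversitySketch(myMat))       # equals ncol(myMat)
length(specificitySketch(myMat))     # equals nrow(myMat)
length(specializationSketch(myMat))  # equals ncol(myMat)
```

In this sketch a gene that is zero in every sample yields `NA` for specificity, because its mean frequency is zero; real data may need such rows filtered out first.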

Jitao David Zhang

Martinez and Reyes-Valdes (2008) Defining diversity, specialization, and gene specificity in transcriptomes through information theory. PNAS 105(28):9709–9714

```
## Entropy of a vector
myVec0 <- 1:9
entropy(myVec0) ## log2(9)
myVec1 <- rep(1, 9)
entropy(myVec1) ## 0, because all items are identical

## Small matrix: genes in rows, samples in columns
myMat <- rbind(c(3,4,5), c(6,6,6), c(0,2,4))
entropySpecificity(myMat)
entropySpecificity(myMat, norm=TRUE)
entropyDiversity(myMat)
entropyDiversity(myMat, norm=TRUE)
sampleSpecialization(myMat)
sampleSpecialization(myMat, norm=TRUE)

## Random matrix with 50 genes and 20 samples
myRandomMat <- matrix(runif(1000), ncol=20)
entropySpecificity(myRandomMat)
entropySpecificity(myRandomMat, norm=TRUE)
entropyDiversity(myRandomMat)
entropyDiversity(myRandomMat, norm=TRUE)
sampleSpecialization(myRandomMat)
sampleSpecialization(myRandomMat, norm=TRUE)
```

