bnrf1: BNRF1 Gene DNA sequences: Epstein-Barr and Herpes
In VLMC: Variable Length Markov Chains ('VLMC') Models

bnrf1

R Documentation

Description

Two gene DNA sequences, each stored as a “discrete time series”:

bnrf1EB: the BNRF1 gene from the Epstein-Barr virus,
bnrf1HV: the BNRF1 gene from the herpes virus.

Usage

data(bnrf1)

Format

The EB sequence has 3954 nucleotides and the HV sequence has 3741. Both are R factors with the four levels c("a","c","g","t").
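
As a minimal sketch of what such a factor looks like, the following builds a short made-up sequence with the same four levels. The object name toy and its contents are hypothetical and are not taken from the bnrf1 data; the real objects behave the same way but are much longer.

```r
# Hypothetical toy sequence, NOT part of the real bnrf1 data
toy <- factor(c("a", "t", "g", "g", "c", "a"),
              levels = c("a", "c", "g", "t"))

levels(toy)       # the four nucleotide levels, in the order given above
length(toy)       # number of nucleotides in the sequence
as.integer(toy)   # integer codes 1..4, following the level order
table(toy)        # nucleotide counts
```

Because both sequences share the same level set, they can be compared element by element with ==, which the matching code in the Examples relies on.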

Author(s)

Martin Maechler (original packaging for R).

Source

See the references; the data used to be at https://anson.ucdavis.edu/~shumway/tsa.html, and are now available in CRAN package astsa, e.g., bnrf1ebv.

References

Shumway, R. and Stoffer, D. (2000) Time Series Analysis and its Applications. Springer Texts in Statistics.

Examples

data(bnrf1)
bnrf1EB[1:500]
table(bnrf1EB)
table(bnrf1HV)
n <- length(bnrf1HV)
table(t = bnrf1HV[-1], "t-1" = bnrf1HV[-n])

plot(as.integer(bnrf1EB[1:500]), type = "b")


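## Simplistic gene matching: for each shift i = 0..200, the percentage of
## positions at which HV agrees with the EB sequence shifted by i
percent.eq <- sapply(0:200, function(i)
    100 * sum(bnrf1EB[(1+i):(n+i)] == bnrf1HV) / n)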
plot.ts(percent.eq)
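
The lag-one count table in the examples above can be turned into estimated first-order transition probabilities. The sketch below uses a short made-up sequence x (hypothetical, standing in for bnrf1HV) so that it runs without the package; replacing x with bnrf1HV should give the corresponding estimates for the real data.

```r
# Hypothetical toy sequence standing in for bnrf1HV
x <- factor(c("a", "c", "a", "c", "g", "t", "a", "c"),
            levels = c("a", "c", "g", "t"))
n <- length(x)

# Counts: rows = current symbol (t), columns = previous symbol (t-1)
counts <- table(t = x[-1], "t-1" = x[-n])

# Normalise each column to sum to 1: estimated P(current | previous)
trans <- prop.table(counts, margin = 2)
round(trans, 2)
```

Note that a column would come out as NaN if some nucleotide never occurred as a previous symbol; this does not happen in the toy sequence.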

VLMC documentation built on Sept. 11, 2024, 5:28 p.m.