EM_random: Wrapper function with random starting values for the EM...
In johannabertl/SelectionMix: Negative binomial mixture model

Description Usage Arguments Details Value Author(s) Examples

The function EM is called multiple times, each time with a random starting value

1 2	EM_random(alpha.interval, beta.interval, pvec, precision, nstart, x.syn, x.non, cvec, iter, epsilon = NULL, ui, ci, parallel = F)

`pvec`	vector of length k-1
`precision`	positive integer to determine the variance of the multinomial draws (see details).
`nstart`	Integer. Number of random starting values
`parallel`	Logical. If true, the function mclapply is used for parallelization. Only makes sense when multiple cores are available. The other parameters are the same as in `rnegbinmix::EM`.

The starting values for the parameters alpha and beta are drawn from a uniform distribution on two user-defined intervals. The starting values for p1, ..., pk-1 are drawn from a multinomial distribution. The variance of the multinomial distribution is determined by the parameter precision: a multinomial sample of the size precision is drawn with the function rmultinom, afterwards the multinomial sample is scaled to sum up to one. The larger precision, the smaller the variance. To make sure that no entry of the vector is zero, a pseudo count of 0.05 is added to each component before scaling.

A list consisting of nstart objects, each one a list just like the output of EM (trajectory of theta, Q and convergence code), see rnegbinmix::EM.

Johanna Bertl

# Simulate dataset of synonymous and non-synonymous mutations

c1 = 0.5; c2 = 10
p1 = 0.2; p2 = 0.8
alpha = 10
beta = 5

mutations = rnegbinmix_syn_nonsyn(n=3000, alpha=alpha, beta=beta, c=c(c1, c2), p = c(p1, p2))

# EM algorithm

# Constraints: ui %*% theta - ci >= 0
# alpha > 0, beta > 0, 0 <= p1 <=1

ui = rbind(c(1, 0, 0), c(0, 1, 0), c(0, 0, 1), c(0, 0, -1))
ci = c(0, 0, 0, -1)

# Starting values
alpha.interval = c(1, 20)
beta.interval = c(1, 10)
pvec = p1
precision=20
nstart = 5

# Run EM algorithm
EMres = EM_random(alpha.interval = alpha.interval, beta.interval = beta.interval, pvec = pvec, precision = precision, nstart=nstart, x.syn = mutations$Syn, x.non = mutations$Non, cvec=c(c1, c2), iter=25, ui = ui, ci=ci, parallel=T)

# Plot results
par(mfrow=c(3,2))
 for(i in 1:5) matplot(EMres[[i]]$theta, t="l")
par(mfrow=c(1,1))