Description Usage Arguments Details Value Warning Note Author(s) References See Also Examples

View source: R/family.mixture.R

Estimates the five parameters of a mixture of two univariate normal distributions by maximum likelihood estimation.

1 2 3 |

`lphi,lmu,lsd` |
Link functions for the parameters |

`iphi` |
Initial value for |

`imu1, imu2` |
Optional initial value for |

`isd1, isd2` |
Optional initial value for |

`qmu` |
Vector with two values giving the probabilities relating to the sample
quantiles for obtaining initial values for |

`eq.sd` |
Logical indicating whether the two standard deviations should be
constrained to be equal. If |

`nsimEIM` |
See |

`zero` |
May be an integer vector
specifying which linear/additive predictors are modelled as
intercept-only. If given, the value or values can be from the
set |

The probability density function can be loosely written as

*
f(y) = phi * N(mu1, sd1) + (1-phi) * N(mu2, sd2)*

where *phi* is the probability an observation belongs
to the first group.
The parameters *mu1* and *mu2* are the means, and
*sd1* and *sd2* are the standard deviations.
The parameter *phi* satisfies *0 < phi < 1*.
The mean of *Y* is
*phi*mu1 + (1-phi)*mu2*
and this is returned as the fitted values.
By default, the five linear/additive predictors are
*
(logit(phi), mu1, log(sd1), mu2, log(sd2))^T*.
If `eq.sd = TRUE`

then *sd1=sd2*
is enforced.

An object of class `"vglmff"`

(see `vglmff-class`

).
The object is used by modelling functions such as `vglm`

,
and `vgam`

.

Numerical problems can occur and
half-stepping is not uncommon.
If failure to converge occurs, try inputting better initial values,
e.g., by using `iphi`

,
`qmu`

,
`imu1`

,
`imu2`

,
`isd1`

,
`isd2`

,
etc.

This VGAM family function is experimental and should be used with care.

Fitting this model successfully to data can be difficult due to
numerical problems and ill-conditioned data. It pays to fit the
model several times with different initial values and check that
the best fit looks reasonable. Plotting the results is recommended.
This function works better as *mu1* and *mu2*
become more different.

Convergence can be slow, especially when the two component
distributions are not well separated.
The default control argument `trace = TRUE`

is to encourage
monitoring convergence.
Having `eq.sd = TRUE`

often makes the overall optimization problem
easier.

T. W. Yee

McLachlan, G. J. and Peel, D. (2000).
*Finite Mixture Models*.
New York: Wiley.

Everitt, B. S. and Hand, D. J. (1981).
*Finite Mixture Distributions*.
London: Chapman & Hall.

`uninormal`

,
`Normal`

,
`mix2poisson`

.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | ```
## Not run: mu1 <- 99; mu2 <- 150; nn <- 1000
sd1 <- sd2 <- exp(3)
(phi <- logitlink(-1, inverse = TRUE))
mdata <- data.frame(y = ifelse(runif(nn) < phi, rnorm(nn, mu1, sd1),
rnorm(nn, mu2, sd2)))
fit <- vglm(y ~ 1, mix2normal(eq.sd = TRUE), data = mdata)
# Compare the results
cfit <- coef(fit)
round(rbind('Estimated' = c(logitlink(cfit[1], inverse = TRUE),
cfit[2], exp(cfit[3]), cfit[4]),
'Truth' = c(phi, mu1, sd1, mu2)), digits = 2)
# Plot the results
xx <- with(mdata, seq(min(y), max(y), len = 200))
plot(xx, (1-phi) * dnorm(xx, mu2, sd2), type = "l", xlab = "y",
main = "Orange = estimate, blue = truth",
col = "blue", ylab = "Density")
phi.est <- logitlink(coef(fit)[1], inverse = TRUE)
sd.est <- exp(coef(fit)[3])
lines(xx, phi*dnorm(xx, mu1, sd1), col = "blue")
lines(xx, phi.est * dnorm(xx, Coef(fit)[2], sd.est), col = "orange")
lines(xx, (1-phi.est) * dnorm(xx, Coef(fit)[4], sd.est), col = "orange")
abline(v = Coef(fit)[c(2,4)], lty = 2, col = "orange")
abline(v = c(mu1, mu2), lty = 2, col = "blue")
## End(Not run)
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.