Home

/

CRAN

/

moc

/

moc: Fit a General Nonlinear Multivariate Mixture Model (MOC)

moc: Fit a General Nonlinear Multivariate Mixture Model (MOC)
In moc: General Nonlinear Multivariate Finite Mixtures

Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/moc.R

moc fits user-specified mixture models with one, two and three parameters distributions to multivariate data that can be of discrete or continuous type and a mix of both types. The likelihood for the vector of observations or repeated measurements for subject i has the form

f( y[i] | z[i], x[i]) = \Sum_k p_k(z[i],x[i]) h_k( y[i] | x[i])

Here, p_k() represent the mixture probability function and h_k() the conditional joint density of the observations y[i] given the covariates x[i] and mixture k. The user supplies either the joint or marginal conditional density(ies) of the components of y[i]. In the latter case, the joint conditional density is constructed by taking the product of the marginal densities (assuming conditional independence of the components).

The update.moc function allows to update or modify an already fitted moc object and to put constraint on its parameters.

moc(y, density = NULL, joint = FALSE, groups = 1,
    gmu = NULL, gshape = NULL, gextra = NULL,
    gmixture = inv.glogit, expected = NULL,
    pgmu = NULL, pgshape = NULL, pgextra = NULL, pgmix = NULL,
    check.length = TRUE, scale.weight = FALSE, wt = 1, data = NULL,
    ndigit = 10, gradtol = 0.0001, steptol = gradtol,
    iterlim = 100, print.level = 1, ...)

## S3 method for class 'moc'
update(object, groups = 1:object$groups, parm = object$coef,
           what = NULL, evaluate = FALSE, ...)

`y`	A matrix or data frame giving the vectors of observations (of length nvar) for each subject.
`object`	A `moc` object to update.
`density`	A function returning the conditional joint or marginal density of the observations and calling the location, shape and extra functions.
`joint`	Specify if the density gives the joint or common marginal density of the vector of observations. When using a joint density remember that this density will receive its parameters as matrices.
`groups`	Number of mixtures.
`gmu, gshape, gextra`	User-specified `lists` of functions returning the location, shape and extra density parameters for each mixture group and observation as a function of the parameters `pgmu`, `pgshape`, `pgextra` and covariates. These functions should return values in the proper range expected by the density function.
`gmixture`	A user-specified function of `pgmix`, giving the regression function of the mixture probabilities. The default is the inverse generalized logit with respect to the first group.
`expected`	A `list` of functions returning the expected response value that depends on the combined parameters `c(pgmu, pgshape, pgextra, pgmix)` for each mixture groups. Defaults to `gmu`.
`pgmu, pgshape, pgextra, pgmix`	Vector of initial estimates for the parameters of the location, shape, extra and mixture functions. Parameters always assume real values from (`-Inf`,`Inf`).
`wt`	Vector of subjects sampling weights. Currently the program uses standard sample-weighted \log(Likelihood) assuming fixed weights.
`scale.weight`	Logical value specifying if the vector of weights `wt` should be rescaled to sum to the sample size.
`check.length`	Logical value specifying check of rows length returned by the functions in `gextra` against the number of variables in y. Especially useful when the `density` requires more parameters than the number of variables like covariance parameters for multivariate normal.
`data`	An optional `data frame` or `list` containing some or all variables and functions required to fit the model. Due to changes from CRAN, `moc` may have difficulties finding all the objects defined in `data`. You might prefer `attach`ing it before running `moc` and its methods.
`ndigit, gradtol, steptol, iterlim, print.level, ...`	Arguments controlling `nlm`.
`parm`	new parameter starting values for update.moc, you can put constraints and fix values with argument `what`. It should be of the same length as the number of parameters in the `moc` object.
`what`	vector of integer values telling what to do with the parameters: 0 values correspond to free parameters. Negative values are fixed, parameters corresponding to the same negative numbers are fixed to the same values given in `parm`. Positive values are used for free parameters but the same positive values are constrained to be equal.
`evaluate`	boolean indicating if evaluation of the updated model should be performed (TRUE) or simply return a `call` (FALSE) for the new model.

The procedure minimizes the resulting -\log(Likelihood) without constraints, the parameters are all assumed to be real numbers. Thus the user should supply appropriate link functions and parameterize the density and parameters functions accordingly (see the examples). By default missing values in the response variables y are assumed to be missing at random, that is the likelihood for the subset of valid observations is just the marginal likelihood for this subset in each mixture. Specific treatment of missing values in the response variables can be achieved by handling them explicitly in the functions density, gmixture, gmu, gshape and gextra. The function density can return NA and yields the default treatment of missing values in the response. The functions gmixture, gmu, gshape and gextra, cannot return NA thus missing values in the covariates should be treated explicitly by these functions.

The lists of functions gmu, gshape, gextra returns the location, shape and extra parameters to the density for each observation and mixture group as a function of pgmu, pgshape and pgextra and covariates. Each function should return a vector of length nvar or a matrix of such vectors (one vector for each subject). The first function in the list is for the first group, the second function for the second group and so on. The functions in the same list share the same parameters but the different lists have different parameters (see the examples).

Setting the attributes parameters for functions gmu, gshape, gextra and gmixture will generate parameter labels in the printout of the object.

The residuals, fitted values and posterior probabilities are obtained through the use of the methods residuals, fitted and post.

A list of class moc is returned that contains all of the relevant information calculated, including error code generated by nlm. The printed output includes -2 * log(Likelihood), the corresponding df, AIC, BIC, entropy and ICL-BIC (entropy corrected BIC, see AIC.moc), mean mixture probabilities, mean expected and observed values for each mixture group, the maximum likelihood estimates and standard errors.

The expected function is used to compute the fitted values and response residuals (not deviance). It is especially useful when the expected value differs from the location parameters as for censored normal or zero inflated Poisson distributions.

The method of fixed sample-weight provides design-consistent parameters estimates. However, for the moment the program does not provide any methods to include sampling variances resulting from weights estimation. If the user wants to incorporate weights estimation sampling variances it could be achieved, for example, by including moc model estimation in a jackknife loop.

Be aware that degrees of freedom (df) for mixture models may be useless (if not meaningless) and likelihood-ratio of apparently nested models may not converge to a Chi-Square with corresponding df.

Bernard Boulerice <bernard.boulerice.bb@gmail.com>

McLachlan, G. and Peel, D. (2000) Finite mixture models, Wiley-Interscience, New York.

Lindsay, B. G. (1983) The Geometry of Mixture Likelihoods: A General Theory, Annals of Statistics, 11, pp. 86–94.

Biernacki, C., Celeux, G., Govaert, G. (2000) Assessing a Mixture Model with the Integrated Completed Likelihood, IEEE Transaction on Pattern Analysis and Machine Learning, 22, pp. 719–725.

Lindsay, B. G. and Roeder, K. (1992) Residual diagnostics for mixture models, Journal of the American Statistical Association, 87, pp. 785–794.

print.moc, plot.moc, residuals.moc, plot.residuals.moc, fitted.moc, post.moc, AIC.moc, logLik.moc, obsfit.moc, nlm

data(moc.dat)

cnorm.dat<-list()   #This is used as a container for functions and data

# Censored Normal (marginal density)

cnorm<-function(x,mu,sig,min,max)
{mi<-(x == min)*1
ma<-(x == max)*1
mi*pnorm((min-mu)/sig)+ma*(1-pnorm((max-mu)/sig))+
(1-mi-ma)*dnorm((x-mu)/sig)/sig}

# For this data set the range of the dependent variables is [0,14]

cnorm.dat$cnorm1<-function(x,mu,sig,...) {cnorm(x,mu,sig,0,14)}

# We have 4 observations

cnorm.dat$gmu1<- list(
  Group1 = function(pmu) {t(1)%*%rep(pmu[1],4)},
  Group2 = function(pmu) {t(1)%*%rep(pmu[2],4)},
  Group3 = function(pmu) {t(1)%*%rep(pmu[3],4)})

attr(cnorm.dat$gmu1,"parameters")<-c("  cons1","  cons2","  cons3")

# Expected value of a general censored normal

cmean<-function(mu,sig,min,max) {
max-(max-mu)*pnorm((max-mu)/sig)+(min-mu)*pnorm((min-mu)/sig)-
sig*(dnorm((max-mu)/sig)-dnorm((min-mu)/sig)) }

# Homogeneous variances

cnorm.dat$gshape1<- list(
  Group1 = function(psh) {t(1)%*%rep(exp(psh[1]),4)},
  Group2 = function(psh) {t(1)%*%rep(exp(psh[1]),4)},
  Group3 = function(psh) {t(1)%*%rep(exp(psh[1]),4)})

attr(cnorm.dat$gshape1,"parameters")<-c("  log(std.dev)")

cnorm.dat$cmean1<- list(
  Group1 = function(p) {cmean(cnorm.dat$gmu1[[1]](p[1:3]),cnorm.dat$gshape1[[1]](p[4]),0,14) },
  Group2 = function(p) {cmean(cnorm.dat$gmu1[[2]](p[1:3]),cnorm.dat$gshape1[[2]](p[4]),0,14) },
  Group3 = function(p) {cmean(cnorm.dat$gmu1[[3]](p[1:3]),cnorm.dat$gshape1[[3]](p[4]),0,14) })

moc1<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1,gshape=gshape1,
expected=cmean1,pgmu=c(0.5, 2, 5),pgshape=c(0.7),pgmix=c(-0.6, -2.0),
data=cnorm.dat,gradtol=1E-4)

print(moc1)

## Not run: 
# Heterogeneous variances across mixture groups

cnorm.dat$gshape2<-list(
  Group1 = function(psh) {t(1)%*%rep(exp(psh[1]),4)},
  Group2 = function(psh) {t(1)%*%rep(exp(psh[2]),4)},
  Group3 = function(psh) {t(1)%*%rep(exp(psh[3]),4)})

cnorm.dat$cmean2<-list(
  Group1 = function(p) {cmean(cnorm.dat$gmu1[[1]](p[1:3]),cnorm.dat$gshape2[[1]](p[4:6]),0,14) },
  Group2 = function(p) {cmean(cnorm.dat$gmu1[[2]](p[1:3]),cnorm.dat$gshape2[[2]](p[4:6]),0,14) },
  Group3 = function(p) {cmean(cnorm.dat$gmu1[[3]](p[1:3]),cnorm.dat$gshape2[[3]](p[4:6]),0,14) })

moc2<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1,gshape=gshape2,
expected=cmean2,pgmu=moc1$coef[1:3],pgshape=c(rep(moc1$coef[4],3)),
pgmix=moc1$coef[5:6],data=cnorm.dat,gradtol=1E-4)


## End(Not run)
# Heterogeneous variances across time

cnorm.dat$gshape3<-list(
  Group1 = function(psh) {exp(t(1)%*%psh[1:4])},
  Group2 = function(psh) {exp(t(1)%*%psh[1:4])},
  Group3 = function(psh) {exp(t(1)%*%psh[1:4])})

cnorm.dat$cmean3<-list(
  Group1 = function(p) {cmean(cnorm.dat$gmu1[[1]](p[1:3]),cnorm.dat$gshape3[[1]](p[4:7]),0,14)},
  Group2 = function(p) {cmean(cnorm.dat$gmu1[[2]](p[1:3]),cnorm.dat$gshape3[[2]](p[4:7]),0,14)},
  Group3 = function(p) {cmean(cnorm.dat$gmu1[[3]](p[1:3]),cnorm.dat$gshape3[[3]](p[4:7]),0,14)})

moc3<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1,gshape=gshape3,
expected=cmean3,pgmu=moc1$coef[1:3],pgshape=c(rep(moc1$coef[4],4)),
pgmix=moc1$coef[5:6],data=cnorm.dat,gradtol=1E-4)

print(moc3)

cnorm.dat$ages<-cbind(1.7,3,4.2,5.6)

## Not run: 
# Last group is a linear function of time

cnorm.dat$gmu1t<-list(
  Group1 = function(pmu) {pmu[1]*cnorm.dat$ages^0},
  Group2 = function(pmu) {pmu[2]+pmu[3]*cnorm.dat$ages},
  Group3 = function(pmu) {pmu[4]*cnorm.dat$ages^0})

cnorm.dat$cmean1t<-list(
  Group1 = function(p) {cmean(cnorm.dat$gmu1t[[1]](p[1:4]),cnorm.dat$gshape1[[1]](p[5]),0,14)},
  Group2 = function(p) {cmean(cnorm.dat$gmu1t[[2]](p[1:4]),cnorm.dat$gshape1[[2]](p[5]),0,14)},
  Group3 = function(p) {cmean(cnorm.dat$gmu1t[[3]](p[1:4]),cnorm.dat$gshape1[[3]](p[5]),0,14)})

moc4<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1t,gshape=gshape1,
expected=cmean1t,pgmu=append(moc1$coef[1:3],0,after=2),
pgshape=c(moc1$coef[4]),pgmix=moc1$coef[5:6],data=cnorm.dat,gradtol=1E-4)


# Zero inflated Poisson log-linear in time for the third group
# Be careful dpois requires integer values

zip<- function(x,la,shape=1,extra)
{ mix<- exp(extra)/(1+exp(extra))
  mix*(x == 0)+(1-mix)*dpois(x,la) }


## End(Not run)


gmup<-list(
  Group1 = function(pmu) {exp(pmu[1]*cnorm.dat$ages^0)},
  Group2 = function(pmu) {exp(pmu[2]+pmu[3]*cnorm.dat$ages)},
  Group3 = function(pmu) {exp(pmu[4]*cnorm.dat$ages^0)})

## Not run: 
zipfit<-list(
  Group1 = function(p) { gmup[[1]](p)/(1+exp(p[5]))},
  Group2 = function(p) { gmup[[2]](p)/(1+exp(p[5]))},
  Group3 = function(p) { gmup[[3]](p)/(1+exp(p[5]))})

gextrap<-list(
  Group1 = function(pxt) {t(1)%*%rep(pxt[1],4)},
  Group2 = function(pxt) {t(1)%*%rep(pxt[1],4)},
  Group3 = function(pxt) {t(1)%*%rep(pxt[1],4)})

moc5<-
moc(moc.dat[,1:4],density=zip,groups=3,gmu=gmup,gextra=gextrap,
expected = zipfit,pgmu=c(-0.6, 0.64,0, 1.6),pgextra=c(-3),
pgmix=c(-0.7, -2), gradtol=1E-4)


## End(Not run)

# Standard Poisson with mixture depending on time independent
# dichotomous covariate
# Be aware that dpoiss require integer values

dumm<-moc.dat[,5]-1
gmixt<-function(pm){
mix<-cbind(1,dumm)%*%matrix(pm[1:4],2,2)
cbind(1,exp(mix))/(1+apply(exp(mix),1,sum))}

poiss<-function(x,la,...) {dpois(x,la)}

moc6<-
moc(moc.dat[,1:4],density=poiss,groups=3,gmu=gmup,gmixture=gmixt,
pgmu=c(-0.7,2.0, 0, 1.5),pgmix=c(-0.2,-1, -1 ,-2),gradtol=1E-4)

print(moc6)

obsfit.moc(moc6,along=dumm)

entropy(moc1,moc3,moc6)

## Not run: 
plot(moc6,against=cnorm.dat$ages,main="MOC profiles",xlab="age",ylab="Y")
plot(residuals(moc6))


## End(Not run)

#More extended examples are available in the Examples directory of the package.

moc documentation built on May 1, 2019, 7:32 p.m.

moc index

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

moc
General Nonlinear Multivariate Finite Mixtures

moc: Fit a General Nonlinear Multivariate Mixture Model (MOC)
In moc: General Nonlinear Multivariate Finite Mixtures

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

See Also

Examples

Related to moc in moc...

R Package Documentation

Browse R Packages

We want your feedback!

moc General Nonlinear Multivariate Finite Mixtures

moc: Fit a General Nonlinear Multivariate Mixture Model (MOC) In moc: General Nonlinear Multivariate Finite Mixtures

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

See Also

Examples

Related to moc in moc...

R Package Documentation

Browse R Packages

We want your feedback!

moc
General Nonlinear Multivariate Finite Mixtures

moc: Fit a General Nonlinear Multivariate Mixture Model (MOC)
In moc: General Nonlinear Multivariate Finite Mixtures