Description Usage Arguments Details Value Note Author(s) References See Also Examples
moc
fits user-specified mixture models with one, two
and three parameters distributions to multivariate data that can be of
discrete or continuous type and a mix of both types. The likelihood for the
vector of observations or repeated measurements for
subject i has the form
f( y[i] | z[i], x[i]) = \Sum_k p_k(z[i],x[i]) h_k( y[i] | x[i])
Here, p_k() represent the mixture probability function and h_k() the conditional joint density of the observations y[i] given the covariates x[i] and mixture k. The user supplies either the joint or marginal conditional density(ies) of the components of y[i]. In the latter case, the joint conditional density is constructed by taking the product of the marginal densities (assuming conditional independence of the components).
The update.moc
function allows to update or modify an already
fitted moc
object and to put constraint on its parameters.
1 2 3 4 5 6 7 8 9 10 11 | moc(y, density = NULL, joint = FALSE, groups = 1,
gmu = NULL, gshape = NULL, gextra = NULL,
gmixture = inv.glogit, expected = NULL,
pgmu = NULL, pgshape = NULL, pgextra = NULL, pgmix = NULL,
check.length = TRUE, scale.weight = FALSE, wt = 1, data = NULL,
ndigit = 10, gradtol = 0.0001, steptol = gradtol,
iterlim = 100, print.level = 1, ...)
## S3 method for class 'moc'
update(object, groups = 1:object$groups, parm = object$coef,
what = NULL, evaluate = FALSE, ...)
|
y |
A matrix or data frame giving the vectors of observations (of length nvar) for each subject. |
object |
A |
density |
A function returning the conditional joint or marginal density of the observations and calling the location, shape and extra functions. |
joint |
Specify if the density gives the joint or common marginal density of the vector of observations. When using a joint density remember that this density will receive its parameters as matrices. |
groups |
Number of mixtures. |
gmu, gshape, gextra |
User-specified |
gmixture |
A user-specified function of |
expected |
A |
pgmu, pgshape, pgextra, pgmix |
Vector of initial estimates for the
parameters of the location, shape, extra and mixture
functions. Parameters always assume real values from ( |
wt |
Vector of subjects sampling weights. Currently the program uses standard sample-weighted \log(Likelihood) assuming fixed weights. |
scale.weight |
Logical value specifying if the vector of weights
|
check.length |
Logical value specifying check of rows
length returned by the functions in |
data |
An optional |
ndigit, gradtol, steptol, iterlim, print.level, ... |
Arguments controlling |
parm |
new parameter starting values for update.moc, you can put
constraints and fix values with argument |
what |
vector of integer values telling what to do with the parameters:
|
evaluate |
boolean indicating if evaluation of the updated model
should be performed (TRUE) or simply return a |
The procedure minimizes the resulting -\log(Likelihood) without
constraints, the parameters are all assumed to be real numbers.
Thus the user should supply appropriate link functions and parameterize
the density and parameters functions accordingly (see the examples).
By default missing values in the response variables y are assumed
to be missing at random, that is the likelihood for the subset
of valid observations is just the marginal likelihood for this
subset in each mixture. Specific treatment of missing values in the
response variables can be achieved by handling them explicitly in
the functions density
, gmixture
, gmu
, gshape
and gextra
. The function density
can return NA
and yields the default treatment of missing values in the response.
The functions gmixture
, gmu
, gshape
and
gextra
, cannot return NA
thus missing values in the
covariates should be treated explicitly by these functions.
The lists
of functions gmu
, gshape
,
gextra
returns the location, shape and extra parameters
to the density for each observation and mixture group as a function
of pgmu
, pgshape
and pgextra
and covariates.
Each function should return a vector of length nvar or a
matrix of such vectors (one vector for each subject).
The first function in the list
is for the first group, the
second function for the second group and so on.
The functions in the same list
share the same parameters
but the different lists
have different parameters
(see the examples).
Setting the attributes
parameters for functions
gmu
, gshape
, gextra
and gmixture
will
generate parameter labels in the printout of the object.
The residuals, fitted values and posterior probabilities
are obtained through the use of the methods residuals
, fitted
and post
.
A list of class moc
is returned that contains all of the
relevant information calculated, including error code generated by
nlm
.
The printed output includes -2 * log(Likelihood),
the corresponding df, AIC, BIC, entropy and ICL-BIC
(entropy corrected BIC, see AIC.moc
),
mean mixture probabilities, mean expected and observed values for
each mixture group, the maximum likelihood estimates and standard
errors.
The expected
function is used to compute the fitted values
and response residuals (not deviance). It is especially useful when
the expected value differs from the location parameters as for censored
normal or zero inflated Poisson distributions.
The method of fixed sample-weight provides design-consistent parameters
estimates. However, for the moment the program does not provide any
methods to include sampling variances resulting from weights
estimation. If the user wants to incorporate weights estimation
sampling variances it could be achieved, for example, by including
moc
model estimation in a jackknife loop.
Be aware that degrees of freedom (df) for mixture models may be useless (if not meaningless) and likelihood-ratio of apparently nested models may not converge to a Chi-Square with corresponding df.
Bernard Boulerice <bernard.boulerice.bb@gmail.com>
McLachlan, G. and Peel, D. (2000) Finite mixture models, Wiley-Interscience, New York.
Lindsay, B. G. (1983) The Geometry of Mixture Likelihoods: A General Theory, Annals of Statistics, 11, pp. 86–94.
Biernacki, C., Celeux, G., Govaert, G. (2000) Assessing a Mixture Model with the Integrated Completed Likelihood, IEEE Transaction on Pattern Analysis and Machine Learning, 22, pp. 719–725.
Lindsay, B. G. and Roeder, K. (1992) Residual diagnostics for mixture models, Journal of the American Statistical Association, 87, pp. 785–794.
print.moc
, plot.moc
, residuals.moc
,
plot.residuals.moc
, fitted.moc
,
post.moc
, AIC.moc
,
logLik.moc
, obsfit.moc
, nlm
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 | data(moc.dat)
cnorm.dat<-list() #This is used as a container for functions and data
# Censored Normal (marginal density)
cnorm<-function(x,mu,sig,min,max)
{mi<-(x == min)*1
ma<-(x == max)*1
mi*pnorm((min-mu)/sig)+ma*(1-pnorm((max-mu)/sig))+
(1-mi-ma)*dnorm((x-mu)/sig)/sig}
# For this data set the range of the dependent variables is [0,14]
cnorm.dat$cnorm1<-function(x,mu,sig,...) {cnorm(x,mu,sig,0,14)}
# We have 4 observations
cnorm.dat$gmu1<- list(
Group1 = function(pmu) {t(1)%*%rep(pmu[1],4)},
Group2 = function(pmu) {t(1)%*%rep(pmu[2],4)},
Group3 = function(pmu) {t(1)%*%rep(pmu[3],4)})
attr(cnorm.dat$gmu1,"parameters")<-c(" cons1"," cons2"," cons3")
# Expected value of a general censored normal
cmean<-function(mu,sig,min,max) {
max-(max-mu)*pnorm((max-mu)/sig)+(min-mu)*pnorm((min-mu)/sig)-
sig*(dnorm((max-mu)/sig)-dnorm((min-mu)/sig)) }
# Homogeneous variances
cnorm.dat$gshape1<- list(
Group1 = function(psh) {t(1)%*%rep(exp(psh[1]),4)},
Group2 = function(psh) {t(1)%*%rep(exp(psh[1]),4)},
Group3 = function(psh) {t(1)%*%rep(exp(psh[1]),4)})
attr(cnorm.dat$gshape1,"parameters")<-c(" log(std.dev)")
cnorm.dat$cmean1<- list(
Group1 = function(p) {cmean(cnorm.dat$gmu1[[1]](p[1:3]),cnorm.dat$gshape1[[1]](p[4]),0,14) },
Group2 = function(p) {cmean(cnorm.dat$gmu1[[2]](p[1:3]),cnorm.dat$gshape1[[2]](p[4]),0,14) },
Group3 = function(p) {cmean(cnorm.dat$gmu1[[3]](p[1:3]),cnorm.dat$gshape1[[3]](p[4]),0,14) })
moc1<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1,gshape=gshape1,
expected=cmean1,pgmu=c(0.5, 2, 5),pgshape=c(0.7),pgmix=c(-0.6, -2.0),
data=cnorm.dat,gradtol=1E-4)
print(moc1)
## Not run:
# Heterogeneous variances across mixture groups
cnorm.dat$gshape2<-list(
Group1 = function(psh) {t(1)%*%rep(exp(psh[1]),4)},
Group2 = function(psh) {t(1)%*%rep(exp(psh[2]),4)},
Group3 = function(psh) {t(1)%*%rep(exp(psh[3]),4)})
cnorm.dat$cmean2<-list(
Group1 = function(p) {cmean(cnorm.dat$gmu1[[1]](p[1:3]),cnorm.dat$gshape2[[1]](p[4:6]),0,14) },
Group2 = function(p) {cmean(cnorm.dat$gmu1[[2]](p[1:3]),cnorm.dat$gshape2[[2]](p[4:6]),0,14) },
Group3 = function(p) {cmean(cnorm.dat$gmu1[[3]](p[1:3]),cnorm.dat$gshape2[[3]](p[4:6]),0,14) })
moc2<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1,gshape=gshape2,
expected=cmean2,pgmu=moc1$coef[1:3],pgshape=c(rep(moc1$coef[4],3)),
pgmix=moc1$coef[5:6],data=cnorm.dat,gradtol=1E-4)
## End(Not run)
# Heterogeneous variances across time
cnorm.dat$gshape3<-list(
Group1 = function(psh) {exp(t(1)%*%psh[1:4])},
Group2 = function(psh) {exp(t(1)%*%psh[1:4])},
Group3 = function(psh) {exp(t(1)%*%psh[1:4])})
cnorm.dat$cmean3<-list(
Group1 = function(p) {cmean(cnorm.dat$gmu1[[1]](p[1:3]),cnorm.dat$gshape3[[1]](p[4:7]),0,14)},
Group2 = function(p) {cmean(cnorm.dat$gmu1[[2]](p[1:3]),cnorm.dat$gshape3[[2]](p[4:7]),0,14)},
Group3 = function(p) {cmean(cnorm.dat$gmu1[[3]](p[1:3]),cnorm.dat$gshape3[[3]](p[4:7]),0,14)})
moc3<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1,gshape=gshape3,
expected=cmean3,pgmu=moc1$coef[1:3],pgshape=c(rep(moc1$coef[4],4)),
pgmix=moc1$coef[5:6],data=cnorm.dat,gradtol=1E-4)
print(moc3)
cnorm.dat$ages<-cbind(1.7,3,4.2,5.6)
## Not run:
# Last group is a linear function of time
cnorm.dat$gmu1t<-list(
Group1 = function(pmu) {pmu[1]*cnorm.dat$ages^0},
Group2 = function(pmu) {pmu[2]+pmu[3]*cnorm.dat$ages},
Group3 = function(pmu) {pmu[4]*cnorm.dat$ages^0})
cnorm.dat$cmean1t<-list(
Group1 = function(p) {cmean(cnorm.dat$gmu1t[[1]](p[1:4]),cnorm.dat$gshape1[[1]](p[5]),0,14)},
Group2 = function(p) {cmean(cnorm.dat$gmu1t[[2]](p[1:4]),cnorm.dat$gshape1[[2]](p[5]),0,14)},
Group3 = function(p) {cmean(cnorm.dat$gmu1t[[3]](p[1:4]),cnorm.dat$gshape1[[3]](p[5]),0,14)})
moc4<-
moc(moc.dat[,1:4],density=cnorm1,groups=3,gmu=gmu1t,gshape=gshape1,
expected=cmean1t,pgmu=append(moc1$coef[1:3],0,after=2),
pgshape=c(moc1$coef[4]),pgmix=moc1$coef[5:6],data=cnorm.dat,gradtol=1E-4)
# Zero inflated Poisson log-linear in time for the third group
# Be careful dpois requires integer values
zip<- function(x,la,shape=1,extra)
{ mix<- exp(extra)/(1+exp(extra))
mix*(x == 0)+(1-mix)*dpois(x,la) }
## End(Not run)
gmup<-list(
Group1 = function(pmu) {exp(pmu[1]*cnorm.dat$ages^0)},
Group2 = function(pmu) {exp(pmu[2]+pmu[3]*cnorm.dat$ages)},
Group3 = function(pmu) {exp(pmu[4]*cnorm.dat$ages^0)})
## Not run:
zipfit<-list(
Group1 = function(p) { gmup[[1]](p)/(1+exp(p[5]))},
Group2 = function(p) { gmup[[2]](p)/(1+exp(p[5]))},
Group3 = function(p) { gmup[[3]](p)/(1+exp(p[5]))})
gextrap<-list(
Group1 = function(pxt) {t(1)%*%rep(pxt[1],4)},
Group2 = function(pxt) {t(1)%*%rep(pxt[1],4)},
Group3 = function(pxt) {t(1)%*%rep(pxt[1],4)})
moc5<-
moc(moc.dat[,1:4],density=zip,groups=3,gmu=gmup,gextra=gextrap,
expected = zipfit,pgmu=c(-0.6, 0.64,0, 1.6),pgextra=c(-3),
pgmix=c(-0.7, -2), gradtol=1E-4)
## End(Not run)
# Standard Poisson with mixture depending on time independent
# dichotomous covariate
# Be aware that dpoiss require integer values
dumm<-moc.dat[,5]-1
gmixt<-function(pm){
mix<-cbind(1,dumm)%*%matrix(pm[1:4],2,2)
cbind(1,exp(mix))/(1+apply(exp(mix),1,sum))}
poiss<-function(x,la,...) {dpois(x,la)}
moc6<-
moc(moc.dat[,1:4],density=poiss,groups=3,gmu=gmup,gmixture=gmixt,
pgmu=c(-0.7,2.0, 0, 1.5),pgmix=c(-0.2,-1, -1 ,-2),gradtol=1E-4)
print(moc6)
obsfit.moc(moc6,along=dumm)
entropy(moc1,moc3,moc6)
## Not run:
plot(moc6,against=cnorm.dat$ages,main="MOC profiles",xlab="age",ylab="Y")
plot(residuals(moc6))
## End(Not run)
#More extended examples are available in the Examples directory of the package.
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.