ridgeGGMmixture: Ridge penalized estimation of a mixture of GGMs.

View source: R/ridgeGGMmixtureAndCo.R

ridgeGGMmixture                R Documentation

Ridge penalized estimation of a mixture of GGMs.

Description

Function that estimates a mixture of GGMs (Gaussian graphical models) through a ridge penalized EM (Expectation-Maximization) algorithm as described in Aflakparast et al. (2018).

Usage

ridgeGGMmixture(Y, K, lambda, target,                                    
                iWeights=matrix(sample(seq(0+1/nrow(Y), 
                                1-1/nrow(Y), by=1/(2*nrow(Y))), 
                                nrow(Y)*K, replace=TRUE), 
                                nrow=nrow(Y), ncol=K),
                nInit=100, minSuccDiff=10^(-10),
                minMixProp=0.01)

Arguments

Y

Data matrix with samples as rows and variates as columns.

K

A numeric specifying the number of mixture components.

lambda

A positive numeric representing the ridge penalty parameter.

target

A positive semi-definite target matrix towards which the precision matrix estimates are shrunken.

iWeights

Sample-specific positive component weight matrix. Rows correspond to samples, while columns to components.

nInit

A numeric specifying the maximum number of iterations of the EM algorithm.

minSuccDiff

A numeric: the minimum difference in penalized loglikelihood between two successive estimates, used as the convergence criterion.

minMixProp

Smallest mixing probability tolerated.

Details

The data are assumed to follow a mixture of K Gaussian graphical models:

\mathbf{Y}_i \sim \sum\nolimits_{k=1}^K \theta_k \mathcal{N}(\boldsymbol{\mu}_k, \boldsymbol{\Omega}_k^{-1}),

where \theta_k = P(Z_i = k) is the probability that the i-th sample stems from the k-th component. The model parameters are estimated by ridge penalized likelihood maximization:

\sum\nolimits_{i=1}^n \log [ \sum\nolimits_{k=1}^K \theta_k P(\mathbf{Y}_i \, | \, Z_i = k; \boldsymbol{\mu}_k, \boldsymbol{\Omega}_k) ] - \lambda \sum\nolimits_{k=1}^K \| \boldsymbol{\Omega}_k - \mathbf{T}_k \|_F^2,

where \lambda is the penalty parameter and \mathbf{T}_k is the shrinkage target of the k-th component's precision matrix. This function yields the maximizer of this penalized loglikelihood, which is found by means of a penalized EM algorithm.
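The objective above can be evaluated directly in base R. The following sketch is purely illustrative (it is not the porridge implementation and `penLogLik` is a hypothetical helper); it assumes, for simplicity, a single target matrix shared by all components, and subtracts the ridge penalty from the mixture loglikelihood:

```r
# Illustrative sketch (not the porridge implementation): evaluate the ridge
# penalized loglikelihood of a GGM mixture for given parameter values.
penLogLik <- function(Y, mus, Omegas, thetas, lambda, target) {
  # log-density of a multivariate normal parameterized by its precision matrix
  ldmvnorm <- function(y, mu, Omega) {
    d <- y - mu
    as.numeric(0.5 * (determinant(Omega, logarithm=TRUE)$modulus -
               length(y) * log(2*pi) - drop(t(d) %*% Omega %*% d)))
  }
  K  <- length(thetas)
  # loglikelihood: sum over samples of the log mixture density
  ll <- sum(apply(Y, 1, function(y)
    log(sum(sapply(1:K, function(k)
      thetas[k] * exp(ldmvnorm(y, mus[[k]], Omegas[[k]])))))))
  # ridge penalty: squared Frobenius distance of each precision to the target
  pen <- lambda * sum(sapply(Omegas, function(O) sum((O - target)^2)))
  ll - pen
}
```

The penalized EM algorithm iteratively increases this quantity until the successive difference falls below minSuccDiff.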

Value

The function returns a list-object containing the regularized estimates, with slots:

mu

A matrix with the estimated mean vectors as rows.

P

A matrix with estimated mixture precision matrices stacked on top of each other.

pi

A numeric with the estimated mixing probabilities.

weights

A matrix with the estimated component memberships.

penLL

A numeric with the penalized loglikelihood of the estimated model.
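Because the component precision matrices are returned stacked row-wise in the P slot, an individual matrix must be recovered by row-indexing. A small sketch, assuming the components are stacked in order and p denotes the number of variates (getPk is an illustrative helper, not part of porridge):

```r
# Illustrative helper (not part of porridge): extract the k-th precision
# matrix from the row-wise stacked slot P, given p variates.
getPk <- function(P, p, k) {
  P[((k - 1) * p + 1):(k * p), , drop=FALSE]
}
```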

Note

The elements of iWeights may be larger than one as they are rescaled internally to sum to one.
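The rescaling can be pictured with a toy weight matrix; the sketch below assumes the rescaling is row-wise, so that each sample's component weights form a probability vector:

```r
# Toy iWeights with entries larger than one; the presumed internal
# rescaling divides each row by its sum.
iW         <- matrix(c(2, 6, 1, 3), nrow=2, byrow=TRUE)
iWrescaled <- sweep(iW, 1, rowSums(iW), "/")
```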

Author(s)

W.N. van Wieringen, M. Aflakparast.

References

Aflakparast, M., de Gunst, M.C.M., van Wieringen, W.N. (2018), "Reconstruction of molecular network evolution from cross-sectional omics data", Biometrical Journal, 60(3), 547-563.

See Also

optPenaltyGGMmixture.kCVauto

Examples

# define mixing proportions
pis <- c(0.2, 0.3, 0.5)

# set dimension and sample size
p <- 5
n <- 100

# define population covariance matrices
diags       <- list(rep(1, p), 
                    rep(0.5, p-1), 
                    rep(0.25, p-2), 
                    rep(0.1, p-3))
Omega       <- as.matrix(Matrix::bandSparse(p, 
                                            k=-c(0:3), 
                                            diag=c(diags), 
                                            symm=TRUE))
Sigma1      <- solve(Omega)
Omega       <- matrix(0.3, p, p)
diag(Omega) <- 1
Sigma2      <- solve(Omega)
Sigma3      <- cov(matrix(rnorm(p*n), ncol=p))

# mean vectors
mean1 <- rep(0,p)
mean2 <- rexp(p)
mean3 <- rnorm(p)

# draw data from the GGM mixture
Z <- sort(sample(c(1:3), n, prob=pis, replace=TRUE))
Y <- rbind(mvtnorm::rmvnorm(sum(Z==1), mean=mean1, sigma=Sigma1),
           mvtnorm::rmvnorm(sum(Z==2), mean=mean2, sigma=Sigma2),
           mvtnorm::rmvnorm(sum(Z==3), mean=mean3, sigma=Sigma3))

# find optimal penalty parameter
optLambda <- optPenaltyGGMmixture.kCVauto(Y,  K=3,          
                                          0.00001, 100,     
                                          10, fold=5,       
                                          target=0*Sigma1)  

# ridge penalized estimation of the GGM mixture
ridgeGGMmixFit <- ridgeGGMmixture(Y, K=3, lambda=optLambda,
                                  target=0*Sigma1)

porridge documentation built on Oct. 16, 2023, 1:06 a.m.