The mixedMem package contains tools for fitting and interpreting discrete multivariate mixed membership models following the general framework outlined in Erosheva et al 2004. In a mixed membership model, individuals can belong to multiple groups instead of only a single group (Airoldi et al 2014). This extension allows for a richer description of heterogeneous populations and has been applied in a wide variety of contexts, including text data (Blei et al 2003), genotype sequences (Pritchard et al 2000), ranked data (Gormley and Murphy 2009), and survey data (Erosheva et al 2007, Gross and Manrique-Vallier 2014).
Mixed membership model objects can be created using the mixedMemModel constructor function. This function checks the internal consistency of the data and parameters and returns an object suitable for use by the mmVarFit function. The mmVarFit function is the main function in the package. It uses a variational EM algorithm to fit an approximate posterior distribution for the latent variables and to select pseudo-MLE estimates for the global parameters. A step-by-step guide to using the package is given in the package vignette "Fitting Mixed Membership Models using mixedMem".
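As a rough illustration of what "internal consistency" means for these inputs, the sketch below checks that the argument shapes agree with one another. The helper check_mm_inputs is hypothetical and is not part of mixedMem; the shapes it expects are taken from the example further down this page, and the checks actually performed by mixedMemModel may differ.

```r
# Hypothetical helper (not part of mixedMem): basic shape checks on the
# inputs, following the array layouts used in the example on this page.
check_mm_inputs <- function(Total, J, Rj, Nijr, K, Vj, alpha, theta, dist, obs) {
  stopifnot(
    length(Rj) == J,                 # one replicate count per variable
    length(Vj) == J,                 # one category count per variable
    length(dist) == J,               # one distribution type per variable
    length(alpha) == K, all(alpha > 0),
    identical(dim(Nijr), as.integer(c(Total, J, max(Rj)))),
    identical(dim(theta), as.integer(c(J, K, max(Vj)))),
    identical(dim(obs), as.integer(c(Total, J, max(Rj), max(Nijr))))
  )
  invisible(TRUE)
}

# Small made-up inputs: 5 individuals, 2 multinomial variables, 3 groups
Total <- 5; J <- 2; K <- 3
Rj <- rep(1, J); Vj <- rep(3, J)
Nijr <- array(1, dim = c(Total, J, max(Rj)))
alpha <- rep(0.2, K)
theta <- array(1 / 3, dim = c(J, K, max(Vj)))
dist <- rep("multinomial", J)
obs <- array(0, dim = c(Total, J, max(Rj), max(Nijr)))

ok <- check_mm_inputs(Total, J, Rj, Nijr, K, Vj, alpha, theta, dist, obs)
```

Running a check like this before calling the constructor can make a mismatched array dimension easier to locate.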
The package supports multivariate models (with or without repeated measurements) where each variable can be of a different type. Currently supported data types include: Bernoulli, rank (Plackett-Luce) and multinomial. Given a fixed number of sub-populations K, we assume the following generative model for each mixed membership model:
For each individual i = 1, ..., Total:
  Draw λ_i from a Dirichlet(α). λ_i is a vector of length K whose components indicate the degree of membership of individual i in each of the K sub-populations.
  For each variable j = 1, ..., J:
    For each replicate r = 1, ..., R_j:
      For each ranking level n = 1, ..., N_{i,j,r}:
        Draw Z_{i,j,r,n} from a multinomial(1, λ_i). The latent sub-population indicator Z_{i,j,r,n} determines the sub-population which governs the response for observation X_{i,j,r,n}. This is sometimes referred to as the context vector because it determines the context from which the individual responds.
        Draw X_{i,j,r,n} from the latent sub-population distribution parameterized by θ_{j,Z_{i,j,r,n}}. The parameter θ governs the observations for each sub-population. For example, if variable j is a multinomial or rank distribution with V_j categories/candidates, then θ_{j,k} is a vector of length V_j which parameterizes the responses to variable j for sub-population k. Likewise, if variable j is a Bernoulli random variable, then θ_{j,k} is a value which determines the probability of success.
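The generative steps above can be sketched as a small simulation in base R. This is an illustration under simplifying assumptions, not code from the package: all variables are multinomial, each has one replicate and one ranking level, responses are coded from 0 as in the example below, and rdirichlet_base is a hypothetical helper that draws Dirichlet vectors by normalizing gamma draws.

```r
# Simulate from the generative model with base R only.
# Assumptions: multinomial variables, R_j = 1, N_{i,j,r} = 1.
set.seed(1)
Total <- 50; J <- 4; K <- 3
Vj <- rep(3, J)
alpha <- rep(0.2, K)

# Hypothetical helper: n draws from Dirichlet(a), one per row,
# obtained by normalizing independent gamma draws
rdirichlet_base <- function(n, a) {
  g <- matrix(rgamma(n * length(a), shape = a), nrow = n, byrow = TRUE)
  g / rowSums(g)
}

# theta[j, k, ] holds the response probabilities of variable j
# in sub-population k
theta <- array(0, dim = c(J, K, max(Vj)))
for (j in 1:J) {
  theta[j, , ] <- rdirichlet_base(K, rep(0.8, Vj[j]))
}

# Membership vectors: row i is lambda_i
lambda <- rdirichlet_base(Total, alpha)

Z <- matrix(0L, Total, J)  # latent sub-population indicators
X <- matrix(0L, Total, J)  # observed responses, coded 0, ..., V_j - 1
for (i in 1:Total) {
  for (j in 1:J) {
    # Draw the context Z_{i,j} from multinomial(1, lambda_i)
    Z[i, j] <- sample.int(K, 1, prob = lambda[i, ])
    # Draw the response from the distribution of the chosen sub-population
    X[i, j] <- sample.int(Vj[j], 1, prob = theta[j, Z[i, j], 1:Vj[j]]) - 1L
  }
}
```

With a small alpha such as 0.2, most simulated individuals put nearly all of their membership in a single sub-population; larger values of alpha produce more evenly mixed memberships.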
Y. Samuel Wang <ysamuelwang@gmail.com>, Elena Erosheva <erosheva@uw.edu>
Airoldi, Edoardo M.; Blei, David; Erosheva, Elena A.; Fienberg, Stephen E. 2014. Handbook of Mixed Membership Models and Their Applications. CRC Press. Chicago
Blei, David; Ng, Andrew Y.; Jordan, Michael I. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993-1022.
Erosheva, Elena A.; Fienberg, Stephen E.; Joutard, Cyrille. 2007. Describing Disability Through Individual-level Mixture Models for Multivariate Binary Data. The Annals of Applied Statistics, 1 (2), 502-537. doi:10.1214/07-AOAS126.
Erosheva, Elena A.; Fienberg, Stephen E.; Lafferty, John. 2004. Mixed-membership Models of Scientific Publications. PNAS, 101 (suppl 1), 5220-5227. doi:10.1073/pnas.0307760101.
Gormley, Isobel C.; Murphy, Thomas B. 2009. A Grade of Membership Model for Rank Data. Bayesian Analysis, 4, 265-296. doi:10.1214/09-BA410.
National Election Studies, 1983 Pilot Election Study. Ann Arbor, MI: University of Michigan, Center for Political Studies, 1999.
Pritchard, Jonathan K.; Stephens, Matthew; Donnelly, Peter. 2000. Inference of Population Structure using Multilocus Genotype Data. Genetics, 155 (2), 945-959.
Gross, Justin; Manrique-Vallier, Daniel. 2014. A Mixed-membership Approach to the Assessment of Political Ideology from Survey Responses. In Airoldi, Edoardo M.; Blei, David; Erosheva, Elena A.; Fienberg, Stephen E. Handbook of Mixed Membership Models and Their Applications. CRC Press. Chicago
library(mixedMem)
data(ANES)
# Dimensions of the data set: 279 individuals with 19 responses each
dim(ANES)
# The 19 variables and their categories
# The specific statements for each variable can be found using help(ANES)
# Variables titled EQ are about Equality
# Variables titled IND are about Economic Individualism
# Variables titled ENT are about Free Enterprise
colnames(ANES)
# Distribution of responses
table(unlist(ANES))
# Sample Size
Total <- 279
# Number of variables
J <- 19
# we only have one replicate for each of the variables
Rj <- rep(1, J)
# Nijr indicates the number of ranking levels for each variable.
# Since all our data is multinomial it should be an array of all 1s
Nijr <- array(1, dim = c(Total, J, max(Rj)))
# Number of sub-populations
K <- 3
# There are 3 choices for each of the variables ranging from 0 to 2.
Vj <- rep(3, J)
# we initialize alpha to .2
alpha <- rep(.2, K)
# All variables are multinomial
dist <- rep("multinomial", J)
# obs are the observed responses. it is a 4-d array indexed by i,j,r,n
# note that obs ranges from 0 to 2 for each response
obs <- array(0, dim = c(Total, J, max(Rj), max(Nijr)))
obs[ , ,1,1] <- as.matrix(ANES)
# Initialize theta randomly with Dirichlet distributions
set.seed(123)
theta <- array(0, dim = c(J,K,max(Vj)))
for(j in 1:J)
{
theta[j, , ] <- gtools::rdirichlet(K, rep(.8, Vj[j]))
}
# Create the mixedMemModel
# This object encodes the initialization points for the variational EM algorithm
# and also encodes the observed parameters and responses
initial <- mixedMemModel(Total = Total, J = J, Rj = Rj,
Nijr = Nijr, K = K, Vj = Vj, alpha = alpha,
theta = theta, dist = dist, obs = obs)
## Not run:
# Fit the model
out <- mmVarFit(initial)
summary(out)
## End(Not run)
[1] 279 19
[1] "EQ1" "EQ2" "EQ3" "EQ4" "EQ5" "EQ6" "EQ7" "IND1" "IND2" "IND3"
[11] "IND4" "IND5" "IND6" "ENT1" "ENT2" "ENT3" "ENT4" "ENT5" "ENT6"
0 1 2
3295 99 1907
[1] "Model Check: Ok!"
[1] "<== Beginning Model Fit! ==>"
Iter: 1 Elbo: -16810.2
E-Step: -7327.02
M-Step: -4597.15
Iter: 2 Elbo: -4597.15
E-Step: -3918.33
M-Step: -3683.89
Iter: 3 Elbo: -3683.89
E-Step: -3553.89
M-Step: -3482.22
Iter: 4 Elbo: -3482.22
E-Step: -3431.85
M-Step: -3387.2
Iter: 5 Elbo: -3387.2
E-Step: -3348.1
M-Step: -3315.84
Iter: 6 Elbo: -3315.84
E-Step: -3283.93
M-Step: -3260.79
Iter: 7 Elbo: -3260.79
E-Step: -3245.36
M-Step: -3232.12
Iter: 8 Elbo: -3232.12
E-Step: -3218.89
M-Step: -3207.97
Iter: 9 Elbo: -3207.97
E-Step: -3197.42
M-Step: -3185.78
Iter: 10 Elbo: -3185.78
E-Step: -3176.62
M-Step: -3172.02
Iter: 11 Elbo: -3172.02
E-Step: -3168.75
M-Step: -3166.36
Iter: 12 Elbo: -3166.36
E-Step: -3164.26
M-Step: -3162.2
Iter: 13 Elbo: -3162.2
E-Step: -3159.82
M-Step: -3157.87
Iter: 14 Elbo: -3157.87
E-Step: -3156.44
M-Step: -3155.21
Iter: 15 Elbo: -3155.21
E-Step: -3154.08
M-Step: -3153
Iter: 16 Elbo: -3153
E-Step: -3151.96
M-Step: -3150.97
Iter: 17 Elbo: -3150.97
E-Step: -3150
M-Step: -3149.08
Iter: 18 Elbo: -3149.08
E-Step: -3148.18
M-Step: -3147.33
Iter: 19 Elbo: -3147.33
E-Step: -3146.5
M-Step: -3145.71
Iter: 20 Elbo: -3145.71
E-Step: -3144.94
M-Step: -3144.2
Iter: 21 Elbo: -3144.2
E-Step: -3143.47
M-Step: -3142.76
Iter: 22 Elbo: -3142.76
E-Step: -3142.07
M-Step: -3141.39
Iter: 23 Elbo: -3141.39
E-Step: -3140.73
M-Step: -3140.09
Iter: 24 Elbo: -3140.09
E-Step: -3139.47
M-Step: -3138.87
Iter: 25 Elbo: -3138.87
E-Step: -3138.28
M-Step: -3137.71
Iter: 26 Elbo: -3137.71
E-Step: -3137.16
M-Step: -3136.63
Iter: 27 Elbo: -3136.63
E-Step: -3136.11
M-Step: -3135.62
Iter: 28 Elbo: -3135.62
E-Step: -3135.14
M-Step: -3134.68
Iter: 29 Elbo: -3134.68
E-Step: -3134.23
M-Step: -3133.81
Iter: 30 Elbo: -3133.81
E-Step: -3133.4
M-Step: -3133
Iter: 31 Elbo: -3133
E-Step: -3132.62
M-Step: -3132.26
Iter: 32 Elbo: -3132.26
E-Step: -3131.91
M-Step: -3131.58
Iter: 33 Elbo: -3131.58
E-Step: -3131.25
M-Step: -3130.94
Iter: 34 Elbo: -3130.94
E-Step: -3130.65
M-Step: -3130.36
Iter: 35 Elbo: -3130.36
E-Step: -3130.09
M-Step: -3129.83
Iter: 36 Elbo: -3129.83
E-Step: -3129.58
M-Step: -3129.34
Iter: 37 Elbo: -3129.34
E-Step: -3129.11
M-Step: -3128.9
Iter: 38 Elbo: -3128.9
E-Step: -3128.69
M-Step: -3128.49
Iter: 39 Elbo: -3128.49
E-Step: -3128.3
M-Step: -3128.11
Iter: 40 Elbo: -3128.11
E-Step: -3127.94
M-Step: -3127.77
Iter: 41 Elbo: -3127.77
E-Step: -3127.61
M-Step: -3127.45
Iter: 42 Elbo: -3127.45
E-Step: -3127.3
M-Step: -3127.16
Iter: 43 Elbo: -3127.16
E-Step: -3127.02
M-Step: -3126.89
Iter: 44 Elbo: -3126.89
E-Step: -3126.76
M-Step: -3126.64
Iter: 45 Elbo: -3126.64
E-Step: -3126.52
M-Step: -3126.41
Iter: 46 Elbo: -3126.41
E-Step: -3126.29
M-Step: -3126.18
Iter: 47 Elbo: -3126.18
E-Step: -3126.07
M-Step: -3125.97
Iter: 48 Elbo: -3125.97
E-Step: -3125.86
M-Step: -3125.76
Iter: 49 Elbo: -3125.76
E-Step: -3125.65
M-Step: -3125.55
Iter: 50 Elbo: -3125.55
E-Step: -3125.45
M-Step: -3125.36
Iter: 51 Elbo: -3125.36
E-Step: -3125.26
M-Step: -3125.16
Iter: 52 Elbo: -3125.16
E-Step: -3125.07
M-Step: -3124.98
Iter: 53 Elbo: -3124.98
E-Step: -3124.89
M-Step: -3124.81
Iter: 54 Elbo: -3124.81
E-Step: -3124.73
M-Step: -3124.66
Iter: 55 Elbo: -3124.66
E-Step: -3124.59
M-Step: -3124.52
Iter: 56 Elbo: -3124.52
E-Step: -3124.46
M-Step: -3124.4
Iter: 57 Elbo: -3124.4
E-Step: -3124.34
M-Step: -3124.29
Iter: 58 Elbo: -3124.29
E-Step: -3124.24
M-Step: -3124.19
Iter: 59 Elbo: -3124.19
E-Step: -3124.15
M-Step: -3124.1
Iter: 60 Elbo: -3124.1
E-Step: -3124.06
M-Step: -3124.02
Iter: 61 Elbo: -3124.02
E-Step: -3123.98
M-Step: -3123.94
Iter: 62 Elbo: -3123.94
E-Step: -3123.9
M-Step: -3123.87
Iter: 63 Elbo: -3123.87
E-Step: -3123.83
M-Step: -3123.8
Iter: 64 Elbo: -3123.8
E-Step: -3123.76
M-Step: -3123.73
Iter: 65 Elbo: -3123.73
E-Step: -3123.7
M-Step: -3123.67
Iter: 66 Elbo: -3123.67
E-Step: -3123.64
M-Step: -3123.61
Iter: 67 Elbo: -3123.61
E-Step: -3123.58
M-Step: -3123.56
Iter: 68 Elbo: -3123.56
E-Step: -3123.53
M-Step: -3123.5
Iter: 69 Elbo: -3123.5
E-Step: -3123.48
M-Step: -3123.46
Iter: 70 Elbo: -3123.46
E-Step: -3123.43
M-Step: -3123.41
Iter: 71 Elbo: -3123.41
E-Step: -3123.39
M-Step: -3123.37
Iter: 72 Elbo: -3123.37
E-Step: -3123.35
M-Step: -3123.33
Iter: 73 Elbo: -3123.33
E-Step: -3123.31
M-Step: -3123.29
Iter: 74 Elbo: -3123.29
E-Step: -3123.27
M-Step: -3123.26
Iter: 75 Elbo: -3123.26
E-Step: -3123.24
M-Step: -3123.22
Iter: 76 Elbo: -3123.22
E-Step: -3123.21
M-Step: -3123.19
Iter: 77 Elbo: -3123.19
E-Step: -3123.18
M-Step: -3123.16
Iter: 78 Elbo: -3123.16
E-Step: -3123.15
M-Step: -3123.14
Iter: 79 Elbo: -3123.14
E-Step: -3123.13
M-Step: -3123.12
Iter: 80 Elbo: -3123.12
E-Step: -3123.1
M-Step: -3123.09
Iter: 81 Elbo: -3123.09
E-Step: -3123.08
M-Step: -3123.07
Iter: 82 Elbo: -3123.07
E-Step: -3123.06
M-Step: -3123.06
Iter: 83 Elbo: -3123.06
E-Step: -3123.05
M-Step: -3123.04
Iter: 84 Elbo: -3123.04
E-Step: -3123.03
M-Step: -3123.02
Iter: 85 Elbo: -3123.02
E-Step: -3123.02
M-Step: -3123.01
Iter: 86 Elbo: -3123.01
E-Step: -3123
M-Step: -3123
Iter: 87 Elbo: -3123
E-Step: -3122.99
M-Step: -3122.98
Iter: 88 Elbo: -3122.98
E-Step: -3122.98
M-Step: -3122.97
Iter: 89 Elbo: -3122.97
E-Step: -3122.97
M-Step: -3122.96
Iter: 90 Elbo: -3122.96
E-Step: -3122.96
M-Step: -3122.95
Iter: 91 Elbo: -3122.95
E-Step: -3122.95
M-Step: -3122.95
Iter: 92 Elbo: -3122.95
E-Step: -3122.94
M-Step: -3122.94
Iter: 93 Elbo: -3122.94
E-Step: -3122.93
M-Step: -3122.93
Iter: 94 Elbo: -3122.93
E-Step: -3122.93
M-Step: -3122.92
Iter: 95 Elbo: -3122.92
E-Step: -3122.92
M-Step: -3122.92
Iter: 96 Elbo: -3122.92
E-Step: -3122.92
M-Step: -3122.91
Iter: 97 Elbo: -3122.91
E-Step: -3122.91
M-Step: -3122.91
Iter: 98 Elbo: -3122.91
E-Step: -3122.91
M-Step: -3122.9
Iter: 99 Elbo: -3122.9
E-Step: -3122.9
M-Step: -3122.9
Iter: 100 Elbo: -3122.9
E-Step: -3122.9
M-Step: -3122.9
Iter: 101 Elbo: -3122.9
E-Step: -3122.89
M-Step: -3122.89
Iter: 102 Elbo: -3122.89
E-Step: -3122.89
M-Step: -3122.89
Iter: 103 Elbo: -3122.89
E-Step: -3122.89
M-Step: -3122.89
Fit Complete! Elbo: -3122.89 Iter: 103
== Summary for Mixed Membership Model ==
Total: 279 K: 3 ELBO: -3122.89
Variable Variable Type Replicates Categories
1 multinomial 1 3
2 multinomial 1 3
3 multinomial 1 3
4 multinomial 1 3
5 multinomial 1 3
6 multinomial 1 3
7 multinomial 1 3
8 multinomial 1 3
9 multinomial 1 3
10 multinomial 1 3
11 multinomial 1 3
12 multinomial 1 3
13 multinomial 1 3
14 multinomial 1 3
15 multinomial 1 3
16 multinomial 1 3
17 multinomial 1 3
18 multinomial 1 3
19 multinomial 1 3