# HDP: Create objects of type "HDP". In bbricks: Bayesian Methods and Graphical Model Structures for Statistical Modeling

## Description

Create an object of type "HDP", which represents the Hierarchical-Dirichlet-Process conjugate structure:

G|gamma \sim DP(gamma,U)

pi_j|G,alpha \sim DP(alpha,G), j = 1:J

z|pi_j \sim Categorical(pi_j)

k|z,G \sim Categorical(G), \textrm{ if z is a sample from the base measure G}

theta_k|psi \sim H0(psi)

x|theta_k,k \sim F(theta_k)

where DP(gamma,U) is a Dirichlet Process on positive integers, gamma is the "concentration parameter", U is the "base measure" of this Dirichlet process, U is an uniform distribution on all positive integers. DP(alpha,G) is a Dirichlet Process on integers with concentration parameter alpha and base measure G. The choice of F() and H0() can be described by an arbitrary "BasicBayesian" object such as "GaussianGaussian","GaussianInvWishart","GaussianNIW", "GaussianNIG", "CatDirichlet", and "CatDP". See `?BasicBayesian` for definition of "BasicBayesian" objects, and see for example `?GaussianGaussian` for specific "BasicBayesian" instances. As a summary, An "HDP" object is simply a combination of a "CatHDP" object (see `?CatHDP`) and an object of any "BasicBayesian" type.
In the case of HDP, z and k can only be positive integers.
This object will be used as a place for recording and accumulating information in the related inference/sampling functions such as posterior(), posteriorDiscard(), dPosteriorPredictive(), rPosteriorPredictive() and so on.

## Usage

 ```1 2 3 4 5 6``` ```HDP( objCopy = NULL, ENV = parent.frame(), gamma = list(gamma = 1, j = 2L, alpha = 1, H0aF = "GaussianNIW", parH0 = list(m = 0, k = 1, v = 2, S = 1)) ) ```

## Arguments

 `objCopy` an object of type "HDP". If "objCopy" is not NULL, the function create a new "HDP" object by copying the content from objCopy, otherwise this new object will be created by using "ENV" and "gamma". Default NULL. `ENV` environment, specify where the object will be created. `gamma` list, a named list of parameters, gamma=list(gamma,alpha,j,H0aF,parH0). Where gamma\$gamma is a numeric value specifying the concentration parameter of DP(gamma,U), gamma\$alpha is a numeric value specifying the concentration parameter of DP(alpha,G), gamma\$j is the number of groups J. gamma\$H0aF is the name of the "BasicBayesian" object which specifies the structure of H0() and F(). gamma\$parH0 is the parameters passed to the selected H0aF. For example, if gamma\$H0aF="GaussianNIW", then gamma\$parH0 should be a named list of NIW parameters: gamma\$parH0=list(m,k,v,S), where gamma\$parH0\$m is a numeric "location" parameter; gamma\$parH0\$S is a symmetric positive definite matrix representing the "scale" parameters; gamma\$parH0\$k and gamma\$parH0\$v are numeric values.

## Value

An object of class "HDP".

## References

Teh, Yee W., et al. "Sharing clusters among related groups: Hierarchical Dirichlet processes." Advances in neural information processing systems. 2005.

`BasicBayesian`,`GaussianNIW`,`GaussianNIG`,`CatDirichlet`,`CatHDP`,`posterior.HDP`,`posteriorDiscard.HDP`,`marginalLikelihood.HDP` ...
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45``` ``` ## This is an example of Gibbs sampling on an hierarchical mixture model, ## using Hierarchical Dirichlet Process. ## load some hierarchical mixture data, check ?mmhData for details. data(mmhData) x <- mmhData\$x js <- mmhData\$groupLabel ## Step1: initialize-------------------------------------------------- z <- rep(1L,nrow(x)) k <- rep(1L,nrow(x)) ## create a HDP object to track all the changes obj <- HDP(gamma = list(gamma=1,j=max(js),alpha=1, H0aF="GaussianNIW", parH0=list(m=c(0,0),k=0.001,v=2,S=diag(2)*0.01))) ss <- sufficientStatistics(obj\$H,x=x,foreach = TRUE) #sufficient statistics N <- length(ss) for(i in 1L:N){# initialize k and z tmp <- rPosteriorPredictive(obj = obj,n=1,x=x[i,,drop=FALSE],j=js[i]) z[i] <- tmp["z"] k[i] <- tmp["k"] posterior.HDP(obj = obj,ss = ss[[i]],ss1 = k[i],ss2 = z[i],j = js[i]) } ## Step2: main Gibbs loop--------------------------------------------- maxit <- 20 #iterative for maxit times it <- 0 #iteration tracker pb <- txtProgressBar(min = 0,max = maxit,style = 3) while(TRUE){ for(i in 1L:N){ ## remove the sample from the posterior info posteriorDiscard(obj = obj,ss = ss[[i]],ss1=k[i],ss2=z[i],j=js[i]) ## resample a new partition tmp <- rPosteriorPredictive(obj = obj,n=1,x=x[i,,drop=FALSE],j=js[i]) z[i] <- tmp["z"] k[i] <- tmp["k"] posterior(obj = obj,ss = ss[[i]], ss1=k[i],ss2 = z[i],j=js[i]) } plot(x=x[,1],y=x[,2],col=k) #to visualize the group label dynamics it <- it+1 setTxtProgressBar(pb,it) if(it>=maxit){cat("\n");break} } ```