README.md
In binomialMix: Mixture Models for Binomial and Longitudinal Data

output: github_document

R-package binomialMix

pipeline

Copyright 2019 Faustine Bousquet (faustine.bousquet@tabmo.io or faustine.bousquet@umontpellier.fr) from TabMo and IMAG (Institut Montpelliérain Alexander Grothendieck, University of Montpellier). The binomialMix package is available under the Apache2 license.

The binomialMix package provides a clustering method for longitudinal and non gaussian data. It uses an EM algorithm for GLM.

You can install the binomialMix R package with the following R command:

# install.packages("devtools")
devtools::install_git("https://gitlab.com/tabmo/binomialmix")
devtools::install_gitlab("tabmo/binomialMix")

You can also directly use the git repository :

git clone https://gitlab.com/tabmo/binomialMix

Once you cloned the git repository, you can run to install the binomialMix package:

devtools::install("/path/to/binomialMix/pkg") # edit the path

Import the library :

library(binomialMix)

Load the data :

data(adcampaign)

Of course, you can use your own data. The format you need to have is the following : - a dataframe is needed - a column with factor id representing the objects you want to cluster - a target value * a weighted value variable as we are in case of binomial data - at least, one column as explicative variable

Run the clustering algorithm Here, we want to cluster advertising campaigns. Each campaigns (column "id") is composed of n_c observations from the whole dataset. We have repeated mesure for a same id level. The explicatives variables could be : day, timeSlot or app_or_site. We want to try with K=3 clusters.

model_formula<-"ctr~timeSlot+day"
weighted_variable<-"impressions"
nb_cluster<-3
df_tocluster<-adcampaign
col_id<-"id"
result_K3<-runEM(model_formula,
                  weighted_variable,
                  nb_cluster,
                  df_tocluster,
                  col_id)

Analysis of clustering obtained : The output of the runEM function provides the following values :
loglikelihood for each EM iteration
estimation of β, λ, π parameters
BIC/ICL value
Number of fisher iteration needed for each M-Step

Plotting evolution of Loglikelihood over iteration

# Plotting Loglikelihood :
install.packages("ggplot2")
library(ggplot2)
qplot(seq_along(result_K3[[1]]), result_K3[[1]])

Matrix of beta estimated (values taken for last iteration) :

head(result_K3[[2]][[length(result_K3[[2]])]])

##            [,1]       [,2]       [,3]
## [1,] -3.8126661 -5.2914380 -3.2418550
## [2,] -0.4134079  0.3794783  0.4115441
## [3,] -0.2975236  0.2407683  0.4076950
## [4,] -0.1948168  0.2122175  0.3753815
## [5,] -0.1590104  0.4028323  0.1885215
## [6,] -0.2160946  0.3545593  0.1872363

Vector of proportion in each cluster (values taken for last iteration) :

result_K3[[3]][[length(result_K3[[3]])]]

## [1] 0.1871000 0.7246125 0.0883000

Matrix of proability for each campaign to belong to the different cluster (values taken for last iteration) :

## Too large to print here
result_K3[[4]][[length(result_K3[[4]])]]

BIC value as numeric :

paste0("BIC=",result_K3[[5]][[length(result_K3[[5]])]])

## [1] "BIC=387914.537681485"

ICL value as numeric :

paste0("ICL value=",result_K3[[6]][[length(result_K3[[6]])]])

## [1] "ICL value=387919.96962191"

Total number of EM iteration as numeric value :

paste0("Number of EM iteration :",length(result_K3[[7]]))

## [1] "Number of EM iteration :10"

Matrix of Fisher scoring number of iteration at each M step :

matrix(unlist(result_K3[[7]]),ncol=length(result_K3[[7]])-1)

##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,]    4    3    4    6    3    3    2    1    1
## [2,]    3    2    2    2    2    2    2    1    1
## [3,]    5    4    2    2    3    1    1    1    1

#nrow is equal to the number of cluster
#ncol is equal to the number of iteration

Any scripts or data that you put into this service are public.

binomialMix documentation built on March 23, 2020, 5:09 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

binomialMix
Mixture Models for Binomial and Longitudinal Data

README.md
In binomialMix: Mixture Models for Binomial and Longitudinal Data

R-package binomialMix

Description

Instruction for users

Installation

Example of use

Try the binomialMix package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

binomialMix Mixture Models for Binomial and Longitudinal Data

README.md In binomialMix: Mixture Models for Binomial and Longitudinal Data

R-package binomialMix

Description

Instruction for users

Installation

Example of use

Try the binomialMix package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

binomialMix
Mixture Models for Binomial and Longitudinal Data

README.md
In binomialMix: Mixture Models for Binomial and Longitudinal Data