View source: R/hmmclustering.R
| hmm.clust | R Documentation |
Implementation of the DBHC algorithm, an HMM clustering algorithm that finds a mixture of discrete-output HMMs. The algorithm uses heuristics based on BIC to search for the optimal number of hidden states in each HMM and the optimal number of clusters.
hmm.clust( sequences, id = NULL, smoothing = 1e-04, eps = 0.001, init.size = 2, alphabet = NULL, K.max = NULL, log_space = FALSE, print = FALSE, seed.size = 3 )
sequences |
An |
id |
A vector with ids that identify the sequences in |
smoothing |
Smoothing parameter for absolute discounting in
|
eps |
A threshold epsilon for counting parameters in
|
init.size |
The number of HMM states in an initial HMM. |
alphabet |
The alphabet of output labels, if not provided alphabet is
taken from |
K.max |
Maximum number of clusters, if not provided algorithm searches for the optimal number itself. |
log_space |
Logical, parameter provided to
|
print |
Logical, whether to print intermediate steps or not. |
seed.size |
Seed size, the number of sequences to be selected for a seed |
A list with components:
sequencesAn
stslist object of sequences with discrete observations.
idA vector with ids that identify the sequences in
sequences.
clusterA vector with found cluster memberships for the sequences.
partitionA list object with
the partition, a mixture of HMMs. Each element in the list is an hmm
object.
membershipsA matrix with cluster memberships for each sequence.
n.clustersNumerical, the found number of clusters.
sizesA vector with the number of HMM states for each cluster model.
bicA vector with the BICs for each cluster model.
## Simulated data
library(seqHMM)
output.labels <- c("H", "T")
# HMM 1
states.1 <- c("A", "B", "C")
transitions.1 <- matrix(c(0.8,0.1,0.1,0.1,0.8,0.1,0.1,0.1,0.8), nrow = 3)
rownames(transitions.1) <- states.1
colnames(transitions.1) <- states.1
emissions.1 <- matrix(c(0.5,0.75,0.25,0.5,0.25,0.75), nrow = 3)
rownames(emissions.1) <- states.1
colnames(emissions.1) <- output.labels
initials.1 <- c(1/3,1/3,1/3)
# HMM 2
states.2 <- c("A", "B")
transitions.2 <- matrix(c(0.75,0.25,0.25,0.75), nrow = 2)
rownames(transitions.2) <- states.2
colnames(transitions.2) <- states.2
emissions.2 <- matrix(c(0.8,0.6,0.2,0.4), nrow = 2)
rownames(emissions.2) <- states.2
colnames(emissions.2) <- output.labels
initials.2 <- c(0.5,0.5)
# Simulate
hmm.sim.1 <- simulate_hmm(n_sequences = 100,
initial_probs = initials.1,
transition_probs = transitions.1,
emission_probs = emissions.1,
sequence_length = 25)
hmm.sim.2 <- simulate_hmm(n_sequences = 100,
initial_probs = initials.2,
transition_probs = transitions.2,
emission_probs = emissions.2,
sequence_length = 25)
sequences <- rbind(hmm.sim.1$observations, hmm.sim.2$observations)
n <- nrow(sequences)
# Clustering algorithm
id <- paste0("K-", 1:n)
rownames(sequences) <- id
sequences <- sequences[sample(1:n, n),]
res <- hmm.clust(sequences, id = rownames(sequences))
#############################################################################
## Swiss Household Data
data("biofam", package = "TraMineR")
# Clustering algorithm
new.alphabet <- c("P", "L", "M", "LM", "C", "LC", "LMC", "D")
sequences <- seqdef(biofam[,10:25], alphabet = 0:7, states = new.alphabet)
## Not run:
res <- hmm.clust(sequences)
# Heatmaps
cluster <- 1 # display heatmaps for cluster 1
transition.heatmap(res$partition[[cluster]]$transition_probs,
res$partition[[cluster]]$initial_probs)
emission.heatmap(res$partition[[cluster]]$emission_probs)
## End(Not run)
## A smaller example, which takes less time to run
subset <- sequences[sample(1:nrow(sequences), 20, replace = FALSE),]
# Clustering algorithm, limiting number of clusters to 2
res <- hmm.clust(subset, K.max = 2)
# Number of clusters
print(res$n.clusters)
# Table of cluster memberships
table(res$memberships[,"cluster"])
# BIC for each number of clusters
print(res$bic)
# Heatmaps
cluster <- 1 # display heatmaps for cluster 1
transition.heatmap(res$partition[[cluster]]$transition_probs,
res$partition[[cluster]]$initial_probs)
emission.heatmap(res$partition[[cluster]]$emission_probs)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.