View source: R/hmmclustering.R
hmm.clust | R Documentation |
Implementation of the DBHC algorithm, an HMM clustering algorithm that finds a mixture of discrete-output HMMs. The algorithm uses heuristics based on BIC to search for the optimal number of hidden states in each HMM and the optimal number of clusters.
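The BIC-based comparison mentioned above can be illustrated with a minimal sketch. It assumes the usual definition BIC = -2 * logLik + k * log(n), where lower is better. The helper `bic()` and the log-likelihood values are hypothetical and chosen only for illustration; this is not the package's internal code, and the parameter counts assume fully parameterised HMMs with two output labels.

```r
# Usual BIC definition: lower values indicate a better trade-off
# between fit and model complexity.
bic <- function(loglik, n_params, n_obs) {
  -2 * loglik + n_params * log(n_obs)
}

# Hypothetical candidates: HMMs with 2, 3 and 4 hidden states.
# Free parameters for m states and 2 output labels:
#   (m - 1) initial + m * (m - 1) transition + m * (2 - 1) emission.
# The log-likelihoods are made-up numbers, NOT output of hmm.clust().
candidates <- data.frame(
  n_states = c(2, 3, 4),
  loglik   = c(-1520, -1490, -1485),
  n_params = c(5, 11, 19)
)

# Assumed number of observations: 100 sequences of length 25.
candidates$bic <- bic(candidates$loglik, candidates$n_params, n_obs = 2500)

# Pick the candidate size with the smallest BIC.
best <- candidates$n_states[which.min(candidates$bic)]
print(candidates)
print(best)
```

In this toy comparison the gain in likelihood from a fourth state does not outweigh the penalty for its extra parameters, so the three-state candidate is selected.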
hmm.clust(
  sequences,
  id = NULL,
  smoothing = 1e-04,
  eps = 0.001,
  init.size = 2,
  alphabet = NULL,
  K.max = NULL,
  log_space = FALSE,
  print = FALSE,
  seed.size = 3
)
sequences |
An stslist object of sequences with discrete observations. |
id |
A vector with ids that identify the sequences in sequences. |
smoothing |
Smoothing parameter for absolute discounting. |
eps |
A threshold epsilon for counting parameters. |
init.size |
The number of HMM states in an initial HMM. |
alphabet |
The alphabet of output labels; if not provided, the alphabet is taken from sequences. |
K.max |
Maximum number of clusters; if not provided, the algorithm searches for the optimal number itself. |
log_space |
Logical, parameter passed on to the underlying model-fitting function. |
print |
Logical, whether to print intermediate steps or not. |
seed.size |
Seed size: the number of sequences to be selected for a seed. |
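To make the roles of smoothing and eps more concrete, here is a minimal sketch of absolute discounting on a single probability vector, followed by a threshold-based parameter count. Both helpers (`absolute_discount()` and `count_above()`) are hypothetical names written for this illustration, and the exact rules used inside the package may differ. The sketch assumes the discounted mass is taken evenly from the non-zero entries and that each of them is larger than its share.

```r
# Absolute discounting (sketch): give every zero entry a small mass
# `smoothing`, taken evenly from the non-zero entries, so the vector
# still sums to one.
absolute_discount <- function(p, smoothing = 1e-04) {
  zero <- p == 0
  if (!any(zero) || all(zero)) return(p)  # nothing to smooth
  p[!zero] <- p[!zero] - smoothing * sum(zero) / sum(!zero)
  p[zero] <- smoothing
  p
}

# Threshold-based count (sketch): only probabilities above `eps`
# are counted as parameters.
count_above <- function(p, eps = 0.001) {
  sum(p > eps)
}

emission_row <- c(H = 0.7, T = 0.3, X = 0)  # made-up probabilities
smoothed <- absolute_discount(emission_row)

print(smoothed)
print(sum(smoothed))
print(count_above(smoothed))
```

With the default values from the usage above (smoothing = 1e-04, eps = 0.001), a smoothed zero stays below the threshold, so under these assumptions it would not be counted as an extra parameter.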
A list with components:
sequences
An stslist object of sequences with discrete observations.
id
A vector with ids that identify the sequences in sequences.
cluster
A vector with found cluster memberships for the sequences.
partition
A list object with the partition, a mixture of HMMs. Each element in the list is an hmm object.
memberships
A matrix with cluster memberships for each sequence.
n.clusters
Numerical, the found number of clusters.
sizes
A vector with the number of HMM states for each cluster model.
bic
A vector with the BICs for each cluster model.
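As a rough illustration of how the returned components relate to each other, the sketch below builds a per-cluster summary table. It uses a mock list with a subset of the documented component names; all values are made up, and the assumption that cluster holds integer labels 1, ..., n.clusters is mine, not something stated above.

```r
# Mock result using documented component names (values are invented;
# a real object would come from hmm.clust()).
res_mock <- list(
  id         = paste0("K-", 1:6),
  cluster    = c(1, 1, 2, 1, 2, 2),  # assumed: integer cluster labels
  n.clusters = 2,
  sizes      = c(3, 2),              # hidden states per cluster model
  bic        = c(410.2, 388.7)       # BIC per cluster model
)

cluster_levels <- seq_len(res_mock$n.clusters)

# One row per cluster: how many sequences it holds, how many hidden
# states its HMM has, and the BIC of that cluster model.
summary_table <- data.frame(
  cluster     = cluster_levels,
  n.sequences = as.vector(table(factor(res_mock$cluster,
                                       levels = cluster_levels))),
  n.states    = res_mock$sizes,
  bic         = res_mock$bic
)
print(summary_table)
```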
## Simulated data
library(seqHMM)
output.labels <- c("H", "T")

# HMM 1
states.1 <- c("A", "B", "C")
transitions.1 <- matrix(c(0.8,0.1,0.1,0.1,0.8,0.1,0.1,0.1,0.8), nrow = 3)
rownames(transitions.1) <- states.1
colnames(transitions.1) <- states.1
emissions.1 <- matrix(c(0.5,0.75,0.25,0.5,0.25,0.75), nrow = 3)
rownames(emissions.1) <- states.1
colnames(emissions.1) <- output.labels
initials.1 <- c(1/3,1/3,1/3)

# HMM 2
states.2 <- c("A", "B")
transitions.2 <- matrix(c(0.75,0.25,0.25,0.75), nrow = 2)
rownames(transitions.2) <- states.2
colnames(transitions.2) <- states.2
emissions.2 <- matrix(c(0.8,0.6,0.2,0.4), nrow = 2)
rownames(emissions.2) <- states.2
colnames(emissions.2) <- output.labels
initials.2 <- c(0.5,0.5)

# Simulate
hmm.sim.1 <- simulate_hmm(n_sequences = 100,
                          initial_probs = initials.1,
                          transition_probs = transitions.1,
                          emission_probs = emissions.1,
                          sequence_length = 25)
hmm.sim.2 <- simulate_hmm(n_sequences = 100,
                          initial_probs = initials.2,
                          transition_probs = transitions.2,
                          emission_probs = emissions.2,
                          sequence_length = 25)
sequences <- rbind(hmm.sim.1$observations, hmm.sim.2$observations)
n <- nrow(sequences)

# Clustering algorithm
id <- paste0("K-", 1:n)
rownames(sequences) <- id
sequences <- sequences[sample(1:n, n),]
res <- hmm.clust(sequences, id = rownames(sequences))

#############################################################################

## Swiss Household Data
data("biofam", package = "TraMineR")

# Clustering algorithm
new.alphabet <- c("P", "L", "M", "LM", "C", "LC", "LMC", "D")
sequences <- seqdef(biofam[,10:25], alphabet = 0:7, states = new.alphabet)

## Not run: 
res <- hmm.clust(sequences)

# Heatmaps
cluster <- 1 # display heatmaps for cluster 1
transition.heatmap(res$partition[[cluster]]$transition_probs,
                   res$partition[[cluster]]$initial_probs)
emission.heatmap(res$partition[[cluster]]$emission_probs)

## End(Not run)

## A smaller example, which takes less time to run
subset <- sequences[sample(1:nrow(sequences), 20, replace = FALSE),]

# Clustering algorithm, limiting number of clusters to 2
res <- hmm.clust(subset, K.max = 2)

# Number of clusters
print(res$n.clusters)

# Table of cluster memberships
table(res$memberships[,"cluster"])

# BIC for each number of clusters
print(res$bic)

# Heatmaps
cluster <- 1 # display heatmaps for cluster 1
transition.heatmap(res$partition[[cluster]]$transition_probs,
                   res$partition[[cluster]]$initial_probs)
emission.heatmap(res$partition[[cluster]]$emission_probs)
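The simulated-data example builds its probability matrices with matrix(), which fills column by column. A quick sanity check, sketched below in base R, confirms that each row of such a matrix is a valid probability distribution before it is used for simulation. The helper `is_row_stochastic()` is a hypothetical name introduced here and is not part of the package.

```r
# Check that a matrix has non-negative entries and rows summing to one.
is_row_stochastic <- function(m, tol = 1e-8) {
  all(m >= 0) && all(abs(rowSums(m) - 1) < tol)
}

states.1 <- c("A", "B", "C")
output.labels <- c("H", "T")

# Same values as emissions.1 in the example; matrix() fills column-wise,
# so the rows are (0.5, 0.5), (0.75, 0.25) and (0.25, 0.75).
emissions.1 <- matrix(c(0.5, 0.75, 0.25, 0.5, 0.25, 0.75), nrow = 3,
                      dimnames = list(states.1, output.labels))

# Filling the same vector row-wise gives rows that do not sum to one.
emissions.byrow <- matrix(c(0.5, 0.75, 0.25, 0.5, 0.25, 0.75), nrow = 3,
                          byrow = TRUE,
                          dimnames = list(states.1, output.labels))

print(is_row_stochastic(emissions.1))
print(is_row_stochastic(emissions.byrow))
print(rowSums(emissions.byrow))
```

The check is cheap and catches the common mistake of mixing up row-wise and column-wise filling when typing matrices by hand.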