initial_cluster: initial clustering of the data set

View source: R/initial-cluster.R

initial_cluster    R Documentation

Description

Provides an initial clustering for a data set of class "hhsmmdata". The clustering determines the initial states and, if necessary, the mixture components used for initial parameter and model estimation.

Usage

initial_cluster(
  train,
  nstate,
  nmix,
  ltr = FALSE,
  equispace = FALSE,
  final.absorb = FALSE,
  verbose = FALSE,
  regress = FALSE,
  resp.ind = 1
)

Arguments

train

the training data set of class "hhsmmdata", which may contain missing values (NA or NaN)

nstate

number of states

nmix

number of mixture components, given in one of the following forms:

  • a vector of positive (non-zero) integers of length nstate

  • a positive (non-zero) integer

  • the text "auto": the number of mixture components is determined automatically, based on the within-cluster sum of squares

  • NULL if no mixture distribution is considered as the emission. This option is useful for the nonparametric emission distribution (nonpar_mstep and dnonpar)
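
To illustrate the idea behind the "auto" option, the sketch below picks a number of clusters from the within-cluster sum of squares using base R's kmeans. It is only a rough illustration under stated assumptions: the helper choose_k_wss, its arguments k_max and explained, and the 90% rule are hypothetical and are not taken from hhsmm, whose actual selection rule may differ.

```r
# Hypothetical helper (not part of hhsmm): choose the smallest number of
# clusters k whose within-cluster sum of squares (WSS) removes at least
# a fraction `explained` of the total sum of squares.
choose_k_wss <- function(x, k_max = 5, explained = 0.9) {
  x <- as.matrix(x)
  # total within-cluster sum of squares for k = 1, ..., k_max
  wss <- sapply(seq_len(k_max), function(k) {
    kmeans(x, centers = k, nstart = 10)$tot.withinss
  })
  # wss[1] is the total sum of squares around the overall mean
  ok <- which(1 - wss / wss[1] >= explained)
  if (length(ok) == 0) k_max else ok[1]
}

# Toy data: two well-separated univariate groups
set.seed(1)
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 10))
k <- choose_k_wss(x)
k
```

A threshold on the explained share of the sum of squares is only one possible stopping rule; an elbow criterion or a penalized criterion could be used instead.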

ltr

logical. if TRUE a left to right hidden hybrid Markov/semi-Markov model is assumed

equispace

logical. if TRUE the left to right clustering will be performed simply by splitting each sequence into equally spaced time segments. This option is suitable for speech recognition applications
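
As a rough picture of what clustering with equal time spaces means, the following base R sketch assigns the time points of one sequence to nstate contiguous blocks of nearly equal length. The helper equispace_states is hypothetical and is not a function of hhsmm; the package's internal implementation may treat block boundaries differently.

```r
# Hypothetical helper (not part of hhsmm): label time points 1..n with
# states 1..nstate using contiguous blocks of (almost) equal length.
equispace_states <- function(n, nstate) {
  ceiling(seq_len(n) * nstate / n)
}

# A sequence of length 10 split into 3 left-to-right states
states <- equispace_states(10, 3)
states
table(states)
```

Because the labels never decrease along the sequence, the resulting state path is left to right by construction.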

final.absorb

logical. if TRUE the final state of the sequence is assumed to be the absorbing state

verbose

logical. if TRUE the outputs will be printed

regress

logical. if TRUE the linear regression clustering will be performed

resp.ind

the column indices of the response variables for the linear regression clustering approach. The default is 1, which means that the first column is the univariate response variable
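
The role of resp.ind can be pictured with a small base R sketch that splits a data matrix into response columns and the remaining columns. The toy matrix and the variable names are made up for illustration, and treating the remaining columns as covariates is an assumption of this sketch rather than something stated on this page.

```r
# Toy data matrix: three columns, the first one is the response
x <- cbind(y = c(1.2, 0.7, 3.1), x1 = c(0, 1, 0), x2 = c(5, 4, 6))
resp.ind <- 1

# drop = FALSE keeps the matrix shape even when a single column is selected
response   <- x[, resp.ind, drop = FALSE]
# Assumption: all columns not listed in resp.ind act as covariates
covariates <- x[, -resp.ind, drop = FALSE]

dim(response)
dim(covariates)
```

For a multivariate response, resp.ind would be a vector of column indices, for example c(1, 2).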

Details

In reliability applications, hhsmm models are often left-to-right, and the modeling aims to predict future states. In such cases, ltr = TRUE and final.absorb = TRUE should be set.

Value

a list containing the following items:

  • clust.X a list of clustered observations for each sequence and state

  • mix.clus a list of the clusters for the mixtures for each state

  • state.clus the exact state clusters of each observation (available if ltr=FALSE)

  • nmix the number of mixture components (a vector of positive (non-zero) integers of length nstate)

  • ltr logical. if TRUE a left to right hidden hybrid Markov/semi-Markov model is assumed

  • final.absorb logical. if TRUE the final state of the sequence is assumed to be the absorbing state

  • miss logical. if TRUE the train$x matrix contains missing data (NA or NaN)

Author(s)

Morteza Amini, morteza.amini@ut.ac.ir, Afarin Bayat, aftbayat@gmail.com

Examples

library(hhsmm)

# Three states; only the second state has a semi-Markov (gamma) sojourn
J <- 3
initial <- c(1, 0, 0)
semi <- c(FALSE, TRUE, FALSE)
P <- matrix(c(0.8, 0.1, 0.1, 0.5, 0, 0.5, 0.1, 0.2, 0.7), nrow = J,
  byrow = TRUE)
# Mixture-of-normals emission parameters for each state
par <- list(mu = list(list(7, 8), list(10, 9, 11), list(12, 14)),
  sigma = list(list(3.8, 4.9), list(4.3, 4.2, 5.4), list(4.5, 6.1)),
  mix.p = list(c(0.3, 0.7), c(0.2, 0.3, 0.5), c(0.5, 0.5)))
sojourn <- list(shape = c(0, 3, 0), scale = c(0, 10, 0), type = "gamma")
model <- hhsmmspec(init = initial, transition = P, parms.emis = par,
  dens.emis = dmixmvnorm, sojourn = sojourn, semi = semi)
# Simulate four training sequences from the model
train <- simulate(model, nsim = c(10, 8, 8, 18), seed = 1234,
  remission = rmixmvnorm)
# Initial clustering with two mixture components per state
clus <- initial_cluster(train, nstate = 3, nmix = c(2, 2, 2), ltr = FALSE,
  final.absorb = FALSE, verbose = TRUE)


hhsmm documentation built on Sept. 11, 2024, 7:34 p.m.