Observation: R6 class for HMM observation model

ObservationR Documentation

R6 class for HMM observation model

Description

Contains the data, distributions, parameters, and formulas for the observation model from a hidden Markov model.

Methods

Public methods


Method new()

Create new Observation object

Usage
Observation$new(
  data = NULL,
  dists,
  formulas = NULL,
  n_states = NULL,
  par,
  fixpar = NULL
)
Arguments
data

Data frame containing response variables (named in dists and par) and covariates (named in formulas)

dists

Named list of distribution names for each data stream, with the following options: beta, binom, cat, dir, exp, foldednorm, gamma, gamma2, lnorm, mvnorm, nbinom, norm, pois, t, truncnorm, tweedie, vm, weibull, wrpcauchy, zibinom, zigamma, zigamma2, zinbinom, zipois, zoibeta, ztnbinom, ztpois. See vignette about list of distributions for more details, e.g., list of parameters for each distribution.

formulas

List of formulas for observation parameters. This should be a nested list, where the outer list has one element for each observed variable, and the inner lists have one element for each parameter. Any parameter that is not included is assumed to have the formula ~1. By default, all parameters have the formula ~1 (i.e., no covariate effects).

n_states

Number of states (optional). If not provided, the number of states is derived from the length of entries of par.

par

List of initial observation parameters. This should be a nested list, where the outer list has one element for each observed variable, and the inner lists have one element for each parameter. The choice of good initial values can be important, especially for complex models; the package vignettes discuss approaches to selecting them (e.g., see Observation$suggest_initial()).

fixpar

List with optional elements "obs" (fixed coefficients for observation parameters), and "lambda_obs" (fixed smoothness parameters), Each element is a named vector of coefficients that should either be fixed (if the corresponding element is set to NA) or estimated to a common value (using integers or factor levels).

Returns

A new Observation object

Examples
# Load data set from MSwM package
data(energy, package = "MSwM")

# Initial observation parameters
par0 <- list(Price = list(mean = c(3, 6), sd = c(2, 2)))

# Model "energy" with normal distributions
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0)
                       
# Model "energy" with gamma distributions
obs <- Observation$new(data = energy, 
                       dists = list(Price = "gamma2"),
                       par = par0)
                       
# Model with non-linear effect of EurDol on mean price
f <- list(Price = list(mean = ~ s(EurDol, k = 5, bs = "cs")))
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0, 
                       formula = f)

Method data()

Data frame

Usage
Observation$data()

Method dists()

List of distributions

Usage
Observation$dists()

Method nstates()

Number of states

Usage
Observation$nstates()

Method par()

Parameters on natural scale

Usage
Observation$par(t = 1, full_names = TRUE, linpred = NULL, as_list = FALSE)
Arguments
t

Time index or vector of time indices; default t = 1. If t = "all", then return observation parameters for all time points.

full_names

Logical. If TRUE, the rows of the output are named in the format "variable.parameter" (default). If FALSE, the rows are names in the format "parameter". The latter is used in various internal functions, when the parameters need to be passed on to an R function.

linpred

Optional custom linear predictor.

as_list

Logical. If TRUE, the output is a nested list with three levels: (1) time step, (2) observed variable, (3) observation parameter. If FALSE (default), the output is an array with one row for each observation parameter, one column for each state, and one slice for each time step.

Returns

Array of parameters with one row for each observation parameter, one column for each state, and one slice for each time step. (See as_list argument for alternative output format.)

Examples
# Load data set from MSwM package
data(energy, package = "MSwM")

# Initial observation parameters
par0 <- list(Price = list(mean = c(3, 6), sd = c(2, 2)))

# Model with linear effect of EurDol on mean price
f <- list(Price = list(mean = ~ EurDol))
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0,
                       n_states = 2, 
                       formula = f)

# Set slope coefficients
obs$update_coeff_fe(coeff_fe = c(3, 2, 6, -2, log(2), log(2)))

# Observation parameter values for given data rows
obs$par(t = c(1, 10, 20))

Method par_alt()

Alternative parameter output

This function is only useful for the categorical and multivariate normal distributions, and it formats the parameters in a slightly nicer way.

Usage
Observation$par_alt(var = NULL, t = 1)
Arguments
var

Name of observation variable for which parameters are required. By default, the first variable in 'dists' is used.

t

Time index for covariate values. Only one value should be provided.

Returns

List of distribution parameters, with one element for each state


Method inipar()

Return initial parameter values supplied

Usage
Observation$inipar()

Method coeff_fe()

Fixed effect parameters on working scale

Usage
Observation$coeff_fe()

Method coeff_re()

Random effect parameters

Usage
Observation$coeff_re()

Method X_fe()

Fixed effect design matrix

Usage
Observation$X_fe()

Method X_re()

Random effect design matrix

Usage
Observation$X_re()

Method lambda()

Smoothness parameters

Usage
Observation$lambda()

Method sd_re()

Standard deviation of smooth terms

This function transforms the smoothness parameter of each smooth term into a standard deviation, given by SD = 1/sqrt(lambda). It is particularly helpful to get the standard deviations of independent normal random effects.

Usage
Observation$sd_re()

Method formulas()

List of model formulas for observation model

Usage
Observation$formulas(raw = FALSE)
Arguments
raw

Logical. If FALSE, returns the nested list created by make_formulas (default). If TRUE, returns formulas passed as input.


Method terms()

Terms of model formulas

Usage
Observation$terms()

Method obs_var()

Data frame of response variables

Usage
Observation$obs_var(expand = FALSE)
Arguments
expand

If TRUE, then multivariate variables in observations are expanded to be univariate, creating extra columns.

Returns

Data frame of observation variables


Method known_states()

Vector of known states

Usage
Observation$known_states(mat = TRUE)
Arguments
mat

Logical.


Method fixpar()

Fixed parameters

Usage
Observation$fixpar(all = FALSE)
Arguments
all

Logical. If FALSE, only user-specified fixed parameters are returned, but not parameters that are fixed for some other reason (e.g., size of binomial distribution)


Method empty()

Empty model? (for simulation only)

Usage
Observation$empty()

Method update_par()

Update parameters

Updates the 'par' attribute to the list passed as input, and updates the intercept elements of 'coeff_fe' using the list passed as input

Usage
Observation$update_par(par)
Arguments
par

New list of parameters


Method update_coeff_fe()

Update coefficients for fixed effect parameters

Usage
Observation$update_coeff_fe(coeff_fe)
Arguments
coeff_fe

New vector of coefficients for fixed effect parameters


Method update_coeff_re()

Update random effect parameters

Usage
Observation$update_coeff_re(coeff_re)
Arguments
coeff_re

New vector of coefficients for random effect parameters


Method update_X_fe()

Update fixed effect design matrix

Usage
Observation$update_X_fe(X_fe)
Arguments
X_fe

New fixed effect design matrix


Method update_X_re()

Update random effect design matrix

Usage
Observation$update_X_re(X_re)
Arguments
X_re

New random effect design matrix


Method update_lambda()

Update smoothness parameters

Usage
Observation$update_lambda(lambda)
Arguments
lambda

New smoothness parameter vector


Method update_data()

Update data

Usage
Observation$update_data(data)
Arguments
data

New data frame


Method update_fixpar()

Update information about fixed parameters

Usage
Observation$update_fixpar(fixpar)
Arguments
fixpar

New list of fixed parameters, in the same format expected by Observation$new()


Method make_mat()

Make model matrices

Usage
Observation$make_mat(new_data = NULL)
Arguments
new_data

Optional new data set, including covariates for which the design matrices should be created. If this argument is not specified, the design matrices are based on the original data frame.

Returns

A list with elements:

X_fe

Design matrix for fixed effects

X_re

Design matrix for random effects

S

Smoothness matrix for random effects

ncol_fe

Number of columns of X_fe for each parameter

ncol_re

Number of columns of X_re and S for each random effect


Method make_newdata_grid()

Design matrices for grid of covariates

Usage
Observation$make_newdata_grid(var, covs = NULL, n_grid = 1000)
Arguments
var

Name of variable

covs

Optional named list for values of covariates (other than 'var') that should be used in the plot (or dataframe with single row). If this is not specified, the mean value is used for numeric variables, and the first level for factor variables.

n_grid

Grid size (number of points). Default: 1000.

Returns

A list with the same elements as the output of make_mat, plus a data frame of covariates values.


Method n2w()

Natural to working parameter transformation

This function applies the link functions of the distribution parameters, to transform parameters from their natural scale to the working scale (i.e., linear predictor scale)

Usage
Observation$n2w(par)
Arguments
par

List of parameters on natural scale

Returns

Vector of parameters on working scale


Method w2n()

Working to natural parameter transformation

This function applies the inverse link functions of the distribution parameters, to transform parameters from the working scale (i.e., linear predictor scale) to their natural scale.

Usage
Observation$w2n(wpar)
Arguments
wpar

Vector of parameters on working scale

Returns

List of parameters on natural scale


Method linpred()

Compute linear predictor

Usage
Observation$linpred()

Method obs_probs()

Observation likelihoods

Usage
Observation$obs_probs(data = NULL)
Arguments
data

Optional dataframe to include in form of obs_var() output

Returns

Matrix of likelihoods of observations, with one row for each time step, and one column for each state.


Method cdf()

Cumulative probabilities of observations

Usage
Observation$cdf()
Returns

List of cumulative probabilities, with one element for each observed variable. Matrix rows correspond to time steps, and columns correspond to states.


Method suggest_initial()

Suggest initial observation parameters

The K-means algorithm is used to define clusters of observations (supposed to approximate the HMM states). Then, for each cluster, the parapprox function of the relevant Dist object is used to obtain parameter values.

Usage
Observation$suggest_initial()
Returns

List of initial parameters for each observation variable

Examples
# Load data set from MSwM package
data(energy, package = "MSwM")

# Initial observation parameters
par0 <- list(Price = list(mean = c(3, 6), sd = c(2, 2)))

# Model "energy" with normal distributions
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0,
                       n_states = 2)

# Print observation parameters
obs$par()

# Suggest initial parameters
par0_new <- obs$suggest_initial()
par0_new

# Update model parameters to suggested
obs$update_par(par = par0_new)
obs$par()

Method plot_dist()

Plot histogram of data and pdfs

Plot histogram of observations for the variable specified by the argument name, overlaid with the pdf of the specified distribution for that data stream. Helpful to select initial parameter values for model fitting, or to visualise fitted state-dependent distributions.

Usage
Observation$plot_dist(var = NULL, weights = NULL, t = 1)
Arguments
var

Name of response variable for which the histogram and pdfs should be plotted.

weights

Optional vector of length the number of pdfs that are plotted. Useful to visualise a mixture of distributions weighted by the proportion of time spent in the different states.

t

Index of time step to use for covariates (default: 1).

Returns

A ggplot object


Method formulation()

Print model formulation

Usage
Observation$formulation()

Method print()

Print Observation object Check constructor arguments

Usage
Observation$print()

Method clone()

The objects of this class are cloneable with this method.

Usage
Observation$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples


## ------------------------------------------------
## Method `Observation$new`
## ------------------------------------------------

# Load data set from MSwM package
data(energy, package = "MSwM")

# Initial observation parameters
par0 <- list(Price = list(mean = c(3, 6), sd = c(2, 2)))

# Model "energy" with normal distributions
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0)
                       
# Model "energy" with gamma distributions
obs <- Observation$new(data = energy, 
                       dists = list(Price = "gamma2"),
                       par = par0)
                       
# Model with non-linear effect of EurDol on mean price
f <- list(Price = list(mean = ~ s(EurDol, k = 5, bs = "cs")))
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0, 
                       formula = f)

## ------------------------------------------------
## Method `Observation$par`
## ------------------------------------------------

# Load data set from MSwM package
data(energy, package = "MSwM")

# Initial observation parameters
par0 <- list(Price = list(mean = c(3, 6), sd = c(2, 2)))

# Model with linear effect of EurDol on mean price
f <- list(Price = list(mean = ~ EurDol))
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0,
                       n_states = 2, 
                       formula = f)

# Set slope coefficients
obs$update_coeff_fe(coeff_fe = c(3, 2, 6, -2, log(2), log(2)))

# Observation parameter values for given data rows
obs$par(t = c(1, 10, 20))

## ------------------------------------------------
## Method `Observation$suggest_initial`
## ------------------------------------------------

# Load data set from MSwM package
data(energy, package = "MSwM")

# Initial observation parameters
par0 <- list(Price = list(mean = c(3, 6), sd = c(2, 2)))

# Model "energy" with normal distributions
obs <- Observation$new(data = energy, 
                       dists = list(Price = "norm"),
                       par = par0,
                       n_states = 2)

# Print observation parameters
obs$par()

# Suggest initial parameters
par0_new <- obs$suggest_initial()
par0_new

# Update model parameters to suggested
obs$update_par(par = par0_new)
obs$par()

TheoMichelot/hmmTMB documentation built on Dec. 13, 2024, 11:52 a.m.