prep_ptMCMC_inputs: Prepare the inputs for the ptMCMC algorithm estimation of...

View source: R/ptMCMC.R

prep_ptMCMC_inputs    R Documentation

Prepare the inputs for the ptMCMC algorithm estimation of change points

Description

Package the static inputs (controls and data structures) used by the ptMCMC algorithm in the context of estimating change points.

This function was designed to work within TS and specifically est_changepoints. It is still hardcoded to do so, but has the capacity to be generalized to work with any estimation via ptMCMC with additional coding work.

Usage

prep_ptMCMC_inputs(
  data,
  formula,
  nchangepoints,
  timename,
  weights = NULL,
  control = list()
)

Arguments

data

Class data.frame object including [1] the time variable (indicated in timename), [2] the predictor variables (required by formula), and [3] the multinomial response variable (indicated in formula).

formula

formula describing the continuous change. Any predictor variable included must also be a column in the data. Any (multinomial) response variable must also be a set of columns in data.

nchangepoints

Integer corresponding to the number of change points to include in the model. 0 is a valid input (corresponding to no change points, so a single time series model), and the current implementation can reasonably include up to 6 change points. The number of change points dictates the segmentation of the data for each continuous model and each LDA model.
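As a rough illustration of how a set of change points partitions a time series into segments, the base R sketch below assigns each time step to a segment. The time values, the change point locations, and the convention that a change point closes the segment it falls in are assumptions made for illustration only; they are not taken from the LDATS source.

```r
# Hypothetical time steps and change point locations (illustration only).
time <- 1:10
changepoints <- c(3, 7)

# Assumed convention: a time step equal to a change point belongs to the
# earlier segment, so the segments are (-Inf, 3], (3, 7], and (7, Inf).
segment <- findInterval(time, changepoints, left.open = TRUE) + 1

# A model with k change points has k + 1 segments.
nsegments <- length(changepoints) + 1
table(segment)
```

With no change points (an empty changepoints vector), every time step falls into segment 1, which matches the single time series model described above.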

timename

Character element indicating the time variable used in the time series. No default is shown in the Usage signature, so a value must be supplied. The variable must be integer-conformable or a Date. If the named variable is a Date, it is converted to an integer, which makes the time step 1 day; this is often not the desired behavior.
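The effect of converting a Date to an integer can be seen with base R alone. The sketch below uses hypothetical monthly sampling dates and shows one possible way to build a month-based index beforehand; the variable names and the month-index approach are illustrative assumptions, not part of LDATS.

```r
# Hypothetical monthly sampling dates (illustration only).
dates <- as.Date(c("2020-01-15", "2020-02-15", "2020-03-15"))

# Direct integer conversion counts days since 1970-01-01, so consecutive
# monthly samples end up roughly 30 time steps apart.
day_index <- as.integer(dates)
diff(day_index)

# One alternative: build a month index so each sample is one step apart.
lt <- as.POSIXlt(dates)
month_index <- (lt$year + 1900) * 12 + lt$mon
month_index <- as.integer(month_index - min(month_index) + 1)
month_index
```

Supplying an integer column such as month_index as the time variable avoids the implicit 1-day time step.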

weights

Optional class numeric vector of weights for each document. Defaults to NULL, translating to an equal weight for each document. When using multinom_TS in a standard LDATS analysis, it is advisable to weight the documents by their total size, as the result of LDA is a matrix of proportions, which does not account for size differences among documents. For most models, a scaling of the weights (so that the average is 1) is most appropriate, and this is accomplished using document_weights.
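The scaling described above (weights proportional to document size, with an average of 1) can be sketched in base R as follows. The small count matrix is made up for illustration, and this is only a sketch of the idea; document_weights may differ in details such as rounding.

```r
# Hypothetical document-term counts: 3 documents (rows), 2 terms (columns).
document_term_table <- matrix(c(10, 20, 30,
                                10, 20, 30), nrow = 3)

# Total size of each document.
sizes <- rowSums(document_term_table)

# Scale the sizes so that the average weight is 1.
weights <- sizes / mean(sizes)
weights
```

Larger documents receive weights above 1 and smaller documents receive weights below 1, while the overall scale of the likelihood is left unchanged.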

control

A list of parameters to control the fitting of the Time Series model including the parallel tempering Markov Chain Monte Carlo (ptMCMC) controls. Values not input assume defaults set by TS_control.
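The general pattern of "values not input assume defaults" can be sketched with base R's modifyList. The setting names and default values below are hypothetical placeholders and are not the actual TS_control defaults; LDATS may fill in its defaults by a different mechanism.

```r
# Hypothetical defaults (NOT the actual TS_control values).
default_control <- list(nit = 1000, ntemps = 6, quiet = FALSE)

# A user supplies only the settings they want to change.
user_control <- list(nit = 100)

# modifyList() overrides matching names and keeps the remaining defaults.
control <- modifyList(default_control, user_control)
str(control)
```

Passing an empty list (the default for control in the Usage above) would, under this pattern, leave every default in place.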

Value

Class ptMCMC_inputs list, containing the static inputs for use within the ptMCMC algorithm for estimating change points.

Examples


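  library(LDATS)

  # Load the example rodents data and pull out its two tables.
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  # Fit a 2-topic LDA and keep the first model in the returned set.
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  # Attach the topic proportions (gamma) as the multinomial response.
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  # Weight documents by size and order the data by the time variable.
  weights <- document_weights(document_term_table)
  data <- data[order(data[,"newmoon"]), ]
  saves <- prep_saves(1, TS_control())
  # Package the static inputs for a model with one change point.
  inputs <- prep_ptMCMC_inputs(data, gamma ~ 1, 1, "newmoon", weights,
                               TS_control())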


LDATS documentation built on Sept. 19, 2023, 5:08 p.m.