est_regressors: Estimate the distribution of regressors, unconditional on the...
In weecology/LDATS: Latent Dirichlet Allocation Coupled with Time Series Analyses

This function uses the marginal posterior distributions of the change point locations (estimated by est_changepoints) in combination with the conditional (on the change point locations) posterior distributions of the regressors (estimated by multinom_TS) to estimate the marginal posterior distribution of the regressors, unconditional on the change point locations.
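The marginalization described above can be written out as follows. This is a sketch in notation assumed here rather than taken from the package documentation: $\eta$ denotes the regressors, $\rho$ a change point configuration, $Y$ the data, $N$ the number of saved ptMCMC draws, and $n_k$ the number of those draws equal to the $k$-th unique configuration $\rho_k$.

```latex
\begin{align}
p(\eta \mid Y) &= \sum_{\rho} p(\eta \mid \rho, Y)\, p(\rho \mid Y) \\
               &\approx \sum_{k=1}^{K} \frac{n_k}{N}\, p(\eta \mid \rho_k, Y),
\qquad \sum_{k=1}^{K} n_k = N
\end{align}
```

Under this reading, each unique change point configuration contributes draws of the regressors in proportion to its posterior frequency.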

est_regressors(rho_dist, data, formula, timename, weights, control = list())

`rho_dist`	List of saved data objects from the ptMCMC estimation of change point locations (unless `nchangepoints` is 0, then `NULL`) returned from `est_changepoints`.
`data`	`data.frame` including [1] the time variable (indicated in `timename`), [2] the predictor variables (required by `formula`), and [3] the multinomial response variable (indicated in `formula`), as verified by `check_timename` and `check_formula`. Note that the response variables should be formatted as a `data.frame` object named as indicated by the `response` entry in the `control` list, such as `gamma` for a standard TS analysis on LDA output.
`formula`	`formula` defining the regression relationship between the change points. Any predictor variable included must also be a column in `data` and any (multinomial) response variable must be a set of columns in `data`, as verified by `check_formula`.
`timename`	`character` element indicating the time variable used in the time series.
`weights`	Optional class `numeric` vector of weights for each document. Defaults to `NULL`, translating to an equal weight for each document. When using `multinom_TS` in a standard LDATS analysis, it is advisable to weight the documents by their total size, as the result of `LDA` is a matrix of proportions, which does not account for size differences among documents. For most models, a scaling of the weights (so that the average is 1) is most appropriate, and this is accomplished using `document_weights`.
`control`	A `list` of parameters to control the fitting of the Time Series model including the parallel tempering Markov Chain Monte Carlo (ptMCMC) controls. Values not input assume defaults set by `TS_control`.
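The weight scaling mentioned for `weights` (document sizes rescaled so that their average is 1) can be illustrated with a small base R sketch. The table `toy_dtt` and the helper `scale_weights` are hypothetical names used only for illustration; the exact computation and rounding inside `document_weights` may differ.

```r
# Hypothetical document-term table: 3 documents (rows) by 2 terms (columns)
toy_dtt <- matrix(c(10, 20, 30,
                     5, 15, 40),
                  nrow = 3, ncol = 2,
                  dimnames = list(NULL, c("term_a", "term_b")))

# Hypothetical helper: total size of each document divided by the mean size,
# so that the resulting weights average to 1
scale_weights <- function(document_term_table) {
  sizes <- rowSums(document_term_table)
  sizes / mean(sizes)
}

toy_weights <- scale_weights(toy_dtt)
mean(toy_weights)
```

Larger documents receive weights above 1 and smaller documents weights below 1, while the overall scale of the likelihood stays comparable to the unweighted case.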

The general approach follows that of Western and Kleykamp (2004), although we note some important differences. Our regression models are fit independently for each chunk (segment of time), and therefore the variance-covariance matrix for the full model has 0 entries for covariances between regressors in different chunks of the time series. Further, because the regression model here is a standard (non-hierarchical) softmax (Ripley 1996, Venables and Ripley 2002, Bishop 2006), there is no error term in the regression (as there is in the normal model used by Western and Kleykamp 2004), and so the posterior distribution used here is a multivariate normal, as opposed to a multivariate t, as used by Western and Kleykamp (2004).
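The approach described above can be sketched on toy data in base R. This is an illustration under stated assumptions, not the package's implementation: the names `toy_time`, `toy_y`, `rho_draws`, `draw_mvn`, and `fit_chunk` are hypothetical, an intercept-only logistic `glm` stands in for the softmax regression, and the change point draws are simulated rather than taken from ptMCMC output.

```r
set.seed(1)

# Toy binary series whose success probability shifts at time 50
toy_time <- 1:100
toy_y <- rbinom(100, 1, ifelse(toy_time <= 50, 0.2, 0.7))

# Stand-in for posterior draws of a single change point location
rho_draws <- sample(c(48, 50, 52), size = 200, replace = TRUE,
                    prob = c(0.2, 0.6, 0.2))

# Draw n samples from a multivariate normal via the Cholesky factor of Sigma
draw_mvn <- function(n, mu, Sigma) {
  p <- length(mu)
  z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  sweep(z %*% chol(Sigma), 2, mu, "+")
}

# Fit one chunk independently; return coefficient estimates and their vcov
fit_chunk <- function(idx) {
  mod <- glm(toy_y[idx] ~ 1, family = binomial())
  list(mu = coef(mod), Sigma = vcov(mod))
}

# For each unique change point location, draw regressors in proportion to
# that location's posterior frequency
rho_table <- table(rho_draws)
eta_draws <- do.call(rbind, lapply(names(rho_table), function(r) {
  cp <- as.numeric(r)
  n <- as.integer(rho_table[r])
  chunks <- list(toy_time <= cp, toy_time > cp)
  # Chunks are independent, so their draws are bound side by side as columns
  do.call(cbind, lapply(chunks, function(idx) {
    fit <- fit_chunk(idx)
    draw_mvn(n, fit$mu, fit$Sigma)
  }))
}))

dim(eta_draws)
```

Because each chunk is drawn separately, the implied joint covariance across chunks is block diagonal, which matches the zero cross-chunk covariances noted above.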

matrix of draws (rows) from the marginal posteriors of the coefficients across the segments (columns).

Bishop, C. M. 2006. Pattern Recognition and Machine Learning. Springer, New York, NY, USA.

Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, UK.

Venables, W. N. and B. D. Ripley. 2002. Modern and Applied Statistics with S. Fourth Edition. Springer, New York, NY, USA.

Western, B. and M. Kleykamp. 2004. A Bayesian change point model for historical time series analysis. Political Analysis 12:354-374.

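  library(LDATS)
  data(rodents)
  document_term_table <- rodents$document_term_table
  document_covariate_table <- rodents$document_covariate_table
  # Fit a two-topic LDA and keep the first model in the returned set
  LDA_models <- LDA_set(document_term_table, topics = 2)[[1]]
  # Attach the topic proportions (gamma) as the multinomial response
  data <- document_covariate_table
  data$gamma <- LDA_models@gamma
  # Weight each document by its relative total size
  weights <- document_weights(document_term_table)
  # Intercept-only model with a single change point
  formula <- gamma ~ 1
  nchangepoints <- 1
  control <- TS_control()
  # Order the observations by the time variable
  data <- data[order(data[,"newmoon"]), ]
  # Estimate the change point locations, then the regressors marginal to them
  rho_dist <- est_changepoints(data, formula, nchangepoints, "newmoon", 
                               weights, control)
  eta_dist <- est_regressors(rho_dist, data, formula, "newmoon", weights, 
                             control)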