ccas: Infer Communication Content and Structure


View source: R/ccas.R

Description

Performs inference on the content-conditional structure of a text-valued communication network.

Usage

ccas(formula, interaction_patterns = 4, topics = 40, alpha = 1,
  beta = 0.01, iterations = 1000, metropolis_hastings_iterations = 500,
  final_metropolis_hastings_burnin = 50000,
  final_metropolis_hastings_iterations = 1e+05, thin = 1/100,
  target_accept_rate = 0.25, tolerance = 0.05,
  LSM_proposal_variance = 0.5, LSM_prior_variance = 1, LSM_prior_mean = 0,
  iterations_before_t_i_p_updates = 5, update_t_i_p_every_x_iterations = 5,
  adaptive_metropolis = TRUE, adaptive_metropolis_update_size = 0.05,
  seed = 12345, adaptive_metropolis_every_x_iterations = 1000,
  stop_adaptive_metropolis_after_x_updates = 50,
  slice_sample_alpha_m = FALSE, slice_sample_step_size = 1,
  parallel = FALSE, cores = 2, output_directory = NULL,
  output_name_stem = NULL, generate_plots = TRUE, verbose = TRUE)

Arguments

formula

A formula object of the form 'ComNet ~ euclidean(d = 2)', where d is the number of dimensions in the latent space that the user would like to include, and ComNet is an object of class 'ComNet' generated by the prepare_data() function. This object contains all of the relevant information about the corpus, so multiple models may be specified without preprocessing the data again. The formula may also include the optional terms 'sender("covariate_name")', 'receiver("covariate_name")', 'nodemix("covariate_name", base = value)' and 'netcov("network_covariate")', which are defined analogously to the arguments in the latentnet package.
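As a rough illustration of the formula syntax described above, the sketch below builds two formulas in base R without fitting anything. R stores formula terms unevaluated, so this runs without the CCAS package loaded. The names ComNet_data and "Gender" are borrowed from the Examples section; whether they suit a particular corpus is an assumption.

```r
# Formulas are stored unevaluated, so these lines run without CCAS loaded.
# ComNet_data and "Gender" are borrowed from the Examples section.
f_basic <- ComNet_data ~ euclidean(d = 2)
f_covariates <- ComNet_data ~ euclidean(d = 2) +
  sender("Gender") +
  nodemix("Gender", base = "Male")

# Inspect the right-hand-side terms that would be passed to ccas().
rhs_terms <- attr(terms(f_covariates), "term.labels")
print(rhs_terms)
```

Either formula could then be passed as the first argument to ccas(), provided the ComNet object actually carries the named covariate.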

interaction_patterns

The number of different interaction patterns governing message sending and receiving under the model. Defaults to 4.

topics

The number of topics to be used in the model. Defaults to 40.

alpha

The hyperparameter governing document-topic distributions. Lower values encourage more peaked distributions. Defaults to 1.

beta

The hyperparameter governing the Dirichlet prior on the topic-word distributions. Lower values encourage more peaked distributions. Defaults to 0.01.

iterations

The number of iterations of Metropolis-within-Gibbs sampling to be used in model estimation. Defaults to 1,000.

metropolis_hastings_iterations

The number of Metropolis Hastings iterations to be run during each iteration of Metropolis-within-Gibbs sampling to update interaction pattern parameters. Defaults to 500.

final_metropolis_hastings_burnin

The number of Metropolis Hastings iterations to discard as burn-in, before any samples are kept, in the final run that follows the main iterations of Gibbs sampling. Defaults to 50,000.

final_metropolis_hastings_iterations

The number of iterations to run Metropolis Hastings after completing all main iterations of Gibbs sampling. Defaults to 100,000. This additional number of iterations is required to ensure that the Markov chain of interaction pattern parameters has mixed appropriately and converged to the target distribution. This can occasionally take on the order of 1-10 million iterations, but is often much faster in practice.

thin

The proportion of network samples to keep from the final run of Metropolis Hastings to convergence. Defaults to 1/100, meaning that every 100th network sample will be returned.
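For a sense of scale, the arithmetic below works out the thinning interval and a sample count under the default settings. It assumes that thinning is applied to a run of final_metropolis_hastings_iterations draws; how burn-in interacts with this count is not stated here, so treat the result as an approximation rather than the exact number of samples ccas() returns.

```r
# Defaults taken from the Usage section.
thin <- 1 / 100
final_metropolis_hastings_iterations <- 1e5

# Keep one sample out of every `thinning_interval` draws.
thinning_interval <- round(1 / thin)

# Approximate number of retained samples, assuming thinning is applied
# to all final iterations (an assumption, not documented behaviour).
samples_kept <- round(final_metropolis_hastings_iterations * thin)

print(thinning_interval)
print(samples_kept)
```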

target_accept_rate

The target acceptance rate for the Metropolis Hastings algorithm. Defaults to 0.25, which is standard in the literature.

tolerance

The tolerance (plus or minus) for differences between the observed and target Metropolis Hastings acceptance rates. Defaults to 0.05.

LSM_proposal_variance

The Metropolis Hastings proposal variance for all interaction pattern parameters. Defaults to 0.5.

LSM_prior_variance

The variance of the multivariate normal prior on all interaction pattern parameters. Defaults to 1.

LSM_prior_mean

The mean of the multivariate normal prior on all interaction pattern parameters. Defaults to 0.

iterations_before_t_i_p_updates

The number of iterations to wait before beginning updates to topic interaction pattern assignments. Defaults to 5. If the user does not wish to update these assignments, this can be set to a value greater than 'iterations'.

update_t_i_p_every_x_iterations

The number of iterations between updates to topic interaction pattern assignments. Defaults to 5.

adaptive_metropolis

Logical indicating whether adaptive Metropolis should be used (whether the proposal variance should be optimized). Defaults to TRUE.

adaptive_metropolis_update_size

The amount by which the Metropolis Hastings proposal distribution variance is changed (up or down) during adaptive Metropolis. Defaults to 0.05.
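To make the interplay between target_accept_rate, tolerance and adaptive_metropolis_update_size more concrete, here is a minimal sketch of one adaptation step. The helper adapt_proposal_variance() is hypothetical and not part of CCAS, and the additive update is an assumption; the package may adjust the variance differently.

```r
# Hypothetical helper, not part of CCAS: one adaptive Metropolis adjustment.
# Assumes an additive step; the package may scale the variance differently.
adapt_proposal_variance <- function(proposal_variance,
                                    observed_accept_rate,
                                    target_accept_rate = 0.25,
                                    tolerance = 0.05,
                                    update_size = 0.05) {
  if (observed_accept_rate > target_accept_rate + tolerance) {
    # Accepting too often: proposals are too timid, so widen them.
    proposal_variance + update_size
  } else if (observed_accept_rate < target_accept_rate - tolerance) {
    # Accepting too rarely: proposals overshoot, so narrow them,
    # keeping the variance strictly positive.
    max(proposal_variance - update_size, update_size)
  } else {
    # Within tolerance of the target: leave the variance alone.
    proposal_variance
  }
}

widened   <- adapt_proposal_variance(0.5, observed_accept_rate = 0.45)
narrowed  <- adapt_proposal_variance(0.5, observed_accept_rate = 0.10)
unchanged <- adapt_proposal_variance(0.5, observed_accept_rate = 0.27)
```

The direction of the adjustment follows the usual reasoning for random-walk proposals: a high acceptance rate suggests steps that are too small, and a low one suggests steps that are too large.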

seed

The seed to be used (for replicability across runs). Defaults to 12345.

adaptive_metropolis_every_x_iterations

The number of iterations between proposal variance updates during the final run of Metropolis Hastings to convergence. Defaults to 1000.

stop_adaptive_metropolis_after_x_updates

The number of Metropolis Hastings proposal variance updates to complete during the final run of Metropolis Hastings to convergence before fixing its value. Defaults to 50. Choose this number so that the proposal variance is fixed by the time burn-in ends.
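A quick way to sanity-check this constraint before a long run is sketched below. It assumes that proposal variance updates occur at fixed intervals counted from the start of the final Metropolis Hastings run, which is an interpretation of the arguments rather than documented behaviour.

```r
# Defaults taken from the Usage section.
final_metropolis_hastings_burnin <- 50000
adaptive_metropolis_every_x_iterations <- 1000
stop_adaptive_metropolis_after_x_updates <- 50

# Iteration at which the last proposal variance update would occur,
# assuming updates happen at fixed intervals from the start of the run.
last_adaptive_update <- adaptive_metropolis_every_x_iterations *
  stop_adaptive_metropolis_after_x_updates

# TRUE when adaptation finishes no later than the end of burn-in.
variance_fixed_by_end_of_burnin <-
  last_adaptive_update <= final_metropolis_hastings_burnin

print(last_adaptive_update)
print(variance_fixed_by_end_of_burnin)
```

Under this reading, shortening final_metropolis_hastings_burnin without also lowering one of the two adaptive Metropolis arguments would leave the proposal variance still changing while samples are being kept.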

slice_sample_alpha_m

Logical indicating whether hyperparameter optimization should be used to determine the optimal value of alpha. Defaults to FALSE. If TRUE, then alpha_m will be slice sampled. This can improve model fit.

slice_sample_step_size

The initial size of the slice to use when slice sampling alpha (hyperparameter optimization). Defaults to 1.

parallel

Logical indicating whether the token topic distributions should be generated in parallel. Defaults to FALSE. Can significantly reduce runtime when training a model with a large number of topics.

cores

The number of cores to be used if the parallel option is set to TRUE. Defaults to 2. Should not exceed the number of cores available on the machine, and will not show performance gains if cores > topics.
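One way to pick a value that respects both limits is sketched below. The helper choose_cores() is hypothetical and not part of CCAS; it only uses parallel::detectCores() from base R, which can return NA on some platforms.

```r
# Hypothetical helper, not part of CCAS: cap the core count at both the
# number of cores detected on the machine and the number of topics.
choose_cores <- function(topics, available = parallel::detectCores()) {
  # detectCores() may return NA; fall back to a single core in that case.
  if (is.na(available)) available <- 1L
  max(1L, min(as.integer(available), as.integer(topics)))
}

cores_to_use <- choose_cores(topics = 40)
print(cores_to_use)
```

The result could then be supplied as the cores argument alongside parallel = TRUE.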

output_directory

The directory where the user would like to store output from the model. Defaults to NULL. If NULL, then the current working directory will be used to store output if an output_name_stem is provided.

output_name_stem

Defaults to NULL. If not NULL, then output will be saved to disk using the output_name_stem to differentiate it from output from other model runs.

generate_plots

Logical indicating whether diagnostic and summary plots should be generated. Defaults to TRUE.

verbose

Logical indicating whether the inference code should print output to the screen. Defaults to TRUE. If FALSE, no output is printed.

Value

An object of class CCAS containing estimation results.

Examples

## Not run: 
set.seed(12345)
# read in data prepared by the prepare_data() function.
data(ComNet_data)
# specify a formula that we will use for testing.
formula <- ComNet_data ~ euclidean(d = 2) +
           nodemix("Gender", base = "Male")
CCAS_Object <- ccas(formula,
                    interaction_patterns = 4,
                    topics = 40,
                    alpha = 1,
                    beta = 0.01,
                    iterations = 20,
                    metropolis_hastings_iterations = 500,
                    final_metropolis_hastings_iterations = 10000,
                    final_metropolis_hastings_burnin = 5000,
                    thin = 1/10,
                    target_accept_rate = 0.25,
                    tolerance = 0.05,
                    adaptive_metropolis_update_size = 0.05,
                    LSM_proposal_variance = .5,
                    LSM_prior_variance = 1,
                    LSM_prior_mean = 0,
                    slice_sample_alpha_m = TRUE,
                    slice_sample_step_size = 1,
                    generate_plots = TRUE,
                    output_directory = NULL,
                    output_name_stem = NULL)

## End(Not run)

matthewjdenny/CCAS documentation built on May 21, 2019, 1:01 p.m.