Description Usage Arguments Value
A function to run the ContentStructure model to convergence for one dataset.
Run_Full_Model(Auth_Attr, Doc_Edge_Matrix, Doc_Word_Matrix, Vocab,
  main_iterations = 4000, sample_step_burnin = 2e+06,
  sample_step_iterations = 8e+06, sample_step_sample_every = 2000,
  topics = 10, clusters = 2, latent_space_dimensions = 2,
  run_MH_only = F, mixing_variable = NULL, Seed = 123456,
  save_results_to_file = FALSE, output_directory = NULL,
  output_filename = NULL, Main_Estimation_Results = NULL)
Auth_Attr |
A data frame with one row for each unique sender/receiver. It must contain at least one column with the ID of each sender/receiver. Any additional variables are ignored unless one is named in mixing_variable as the binary attribute for which mixing parameter estimates should be calculated. |
Doc_Edge_Matrix |
A matrix with one row for each email and one column which records the index of the sender of the email (indexed from 1) followed by one column for each unique sender/receiver in the dataset. |
Doc_Word_Matrix |
A matrix with one row for each email and one column for each unique word in the vocabulary that records the number of times each word was used in each document. |
Vocab |
A vector containing every unique term in the vocabulary, equal in length to the number of columns in Doc_Word_Matrix. |
main_iterations |
The number of Gibbs sampling iterations for the LDA part of the model. We have found that 4,000 works well. |
sample_step_burnin |
The number of burn-in iterations to complete before sampling the latent space parameters when running Metropolis-Hastings (MH) for the latent space model (LSM) to convergence. |
sample_step_iterations |
The total number of MH iterations to run for the LSM (before thinning). |
sample_step_sample_every |
How many iterations to skip between retained samples when thinning the MH chain for the LSM in the sample step. |
topics |
The number of topics to use. |
clusters |
The number of topic clusters to use. |
latent_space_dimensions |
The number of dimensions to include in the latent space model. Note that plotting is currently supported only for two dimensions. |
run_MH_only |
If TRUE, only the MH step for the LSM is rerun to convergence. |
mixing_variable |
If not NULL, specifies the name of the binary variable in the author attributes data frame (Auth_Attr) that will be used to estimate mixing parameter effects. |
Seed |
Sets the seed in R and C++ for replicability. |
save_results_to_file |
A logical value indicating whether intermediate results should be saved to file or returned to the R session. |
output_directory |
The directory where all output will be saved. Defaults to NULL if save_results_to_file == FALSE. |
output_filename |
The name of the .Rdata file in which to save model output. Defaults to NULL if save_results_to_file == FALSE. |
Main_Estimation_Results |
A list object returned by a previous model estimation, to be supplied when the user sets run_MH_only == TRUE. Useful if the user would like to specify a greater number of iterations for the final step of LSM estimation. |
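The three sample_step_* arguments together determine how many thinned draws of the latent space parameters are kept. The arithmetic below is only a rough sketch using the default values. The documentation does not say whether the burn-in is counted inside sample_step_iterations or run in addition to it, so both readings are shown and both are assumptions.

```r
# Default sampler settings taken from the usage signature.
sample_step_burnin       <- 2e6
sample_step_iterations   <- 8e6
sample_step_sample_every <- 2000

# Reading 1 (assumption): burn-in runs in addition to sample_step_iterations.
draws_if_separate <- sample_step_iterations / sample_step_sample_every

# Reading 2 (assumption): burn-in is counted within sample_step_iterations.
draws_if_included <-
  (sample_step_iterations - sample_step_burnin) / sample_step_sample_every

print(c(separate = draws_if_separate, included = draws_if_included))
```

Either way, raising sample_step_sample_every without raising sample_step_iterations reduces the number of retained draws proportionally.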
Does not return anything; all results are saved to the data_directory folder.
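As a rough illustration of the input shapes described under Arguments, the sketch below builds a tiny made-up dataset in base R and checks that its dimensions agree. All values, the column names ID and Gender, and the use of 1 to mark a recipient in Doc_Edge_Matrix are assumptions for illustration only. The block does not call Run_Full_Model.

```r
# Toy data: 3 actors, 4 emails, 5 vocabulary terms (all values are made up).
Auth_Attr <- data.frame(
  ID = c("a1", "a2", "a3"),
  Gender = c(0, 1, 1),  # hypothetical binary attribute for mixing_variable
  stringsAsFactors = FALSE
)
n_actors <- nrow(Auth_Attr)

Vocab <- c("budget", "meeting", "report", "lunch", "deadline")

# Column 1: sender index (from 1). Columns 2..(n_actors + 1): one per actor.
# Assumption: a 1 marks that actor as a recipient of the email.
Doc_Edge_Matrix <- matrix(
  c(1, 0, 1, 1,
    2, 1, 0, 0,
    3, 1, 1, 0,
    1, 0, 0, 1),
  nrow = 4, byrow = TRUE
)

# One row per email, one column per vocabulary term, entries are word counts.
Doc_Word_Matrix <- matrix(
  c(2, 0, 1, 0, 0,
    0, 3, 0, 1, 0,
    1, 1, 0, 0, 2,
    0, 0, 4, 0, 1),
  nrow = 4, byrow = TRUE
)

# Consistency checks implied by the argument descriptions.
stopifnot(
  ncol(Doc_Edge_Matrix) == n_actors + 1,
  all(Doc_Edge_Matrix[, 1] %in% seq_len(n_actors)),
  nrow(Doc_Edge_Matrix) == nrow(Doc_Word_Matrix),
  ncol(Doc_Word_Matrix) == length(Vocab),
  all(Auth_Attr$Gender %in% c(0, 1))
)
```

Running checks like these before a long estimation run can catch a mismatched vocabulary or an off-by-one sender index early.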