Diagnostics to Assess the Effects of Text Preprocessing Decisions

calculate_prediction_errors      Calculate mean prediction error for preprocessing decisions.
dfm_scaling_test                 Comparison of dfms using N-dimensional scaling, with a test...
document_position_plots          Document Position Plots
factorial_preprocessing          A function to perform factorial preprocessing of a corpus of...
mantel_comparison                Ensemble Mantel Tests
mantel_comparison_to_base        Ensemble Mantel Tests
optimal_k_comparison             Optimal Topic Model k Comparison
preprocessing_choice_regression  Preprocessing Choice Regressions
preText                          preText: Diagnostics to Assess The Effects of Text...
preText_score_plot               preText specification plot
preText_test                     preText Test
regression_coefficient_plot      Regression Coefficient Plot
remove_infrequent_terms          Remove infrequently occurring terms from quanteda dfm.
scaling_comparison               Scaling Comparison.
topic_key_term_plot              Plot Prevalence of Topic Key Terms
topic_novelty_score              Topic Top-Terms Novelty Score
UK_Manifestos                    Full text of 69 UK party manifestos from 1918-2001.
wordfish_comparison              Wordfish Comparison.
wordfish_rank_plot               Plot of Wordfish rankings of documents
