Functions to prepare gender and science reports for analysis with STM, and to run that analysis

british_to_american_spellings: Return character "dictionary" for converting British to...
create_stm_input: Create objects for input to STM functions
data.table: data.table package
extract_text_from_pdfs: Extract text from PDF files
get_ngrams: Extract and save ngrams from text
make_sure_dir: Create directory if it doesn't already exist
remove_boilerplate: Remove repetitive "boilerplate" text from documents
remove_named_entities: Remove named entities using the Python module nltk
remove_state_and_country_names: Remove state and country names
run_stm: Run STM analysis
substitute_documents: Substitute alternate versions of some documents
