knitr::opts_chunk$set(echo = TRUE)
my_directory <- "~/synthesisr/inst/extdata/"
search_results <- synthesisr::import_results(directory = my_directory, filename = NULL, save_dataset = FALSE, verbose = TRUE)
first_dedupe <- synthesisr::deduplicate(df = search_results, field = "title", method = "quick", language = "English")
second_dedupe <- synthesisr::deduplicate(df = first_dedupe, field = "title", method = "similarity", language = "English", cutoff_distance = 2)
Before accepting our deduplicated dataset, we want to check for chimera titles (titles that combine text in more than one language), because they can limit how well the deduplicate function matches records. If chimeras are detected, they should be split into single-language titles and the dataset passed back to the deduplicate function. An automated function to trim chimera text is pending development for the next version of synthesisr.
any_chimeras <- synthesisr::chimera_detect(second_dedupe$title, overlap = .6)
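If chimeras are flagged, one way to split them by hand is sketched here in base R. This is only an illustration: the example titles are made up, and the " / " separator between languages is an assumption, since real chimera titles can be laid out in many ways (brackets, line breaks, no separator at all).

```r
# Hypothetical titles for illustration; the " / " separator between
# languages is an assumption, not something synthesisr guarantees.
titles <- c(
  "Effects of fire on grassland birds / Efectos del fuego en aves de pastizal",
  "Prescribed burning and songbird abundance"
)

# Split each title on the assumed separator. Titles without the
# separator come back as a single, unchanged segment.
parts <- strsplit(titles, " / ", fixed = TRUE)

# Keep the first segment of each title (assumed here to be the English one)
# and trim any stray whitespace.
english_titles <- vapply(parts, function(p) trimws(p[1]), character(1))
```

The cleaned titles could then replace the title column of the dataset before it is passed back to the deduplicate function.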
my_dfm <- synthesisr::create_dfm(second_dedupe$abstract, language="English")