remove_similar: Removes similar documents based on text similarity
In elizagrames/synthesisr: Import, Assemble, and Deduplicate Systematic Review Search Results

Removes documents from a data frame that are highly similar to other documents in the same data frame.

remove_similar(data, distance_data, id_column, distance_column, cutoff)

`data`	the data frame containing all documents
`distance_data`	a data frame with document identification and distance information
`id_column`	the name or index of the column in the distance dataset that contains document IDs
`distance_column`	the name or index of the column in the distance dataset that contains distance scores
`cutoff`	the maximum distance at which documents should be considered duplicates

Returns the documents data frame (`data`) with duplicate documents removed.
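The filtering described by the arguments above can be sketched in base R. This is only an illustration of the idea, not the synthesisr implementation: `drop_near_duplicates` is a hypothetical helper, and the sketch assumes that `data` has an `id` column whose values match the IDs stored in the distance data, and that a document is dropped when its distance is at or below `cutoff`. The page does not confirm either assumption.

```r
# Hypothetical helper for illustration only; NOT the synthesisr source code.
# Assumption: `data` has an `id` column matching the IDs in `distance_data`.
drop_near_duplicates <- function(data, distance_data, id_column, distance_column, cutoff) {
  # IDs of documents whose distance score is at or below the cutoff
  is_close <- distance_data[[distance_column]] <= cutoff
  flagged <- distance_data[[id_column]][is_close]
  # Keep only the rows whose ID was not flagged as a near duplicate
  data[!(data$id %in% flagged), , drop = FALSE]
}

# Made-up example documents
docs <- data.frame(
  id = 1:4,
  title = c("Bird song in cities", "Bird song in cities.",
            "Urban noise and frogs", "Bat roosts"),
  stringsAsFactors = FALSE
)

# Made-up distance data: one row per document compared against an earlier one
distances <- data.frame(doc_id = c(2, 3), distance = c(0.02, 0.65))

deduplicated <- drop_near_duplicates(docs, distances, "doc_id", "distance", cutoff = 0.1)
print(deduplicated$id)
```

Here document 2 is dropped because its distance (0.02) falls under the cutoff, while document 3 (0.65) is kept. Because the columns are accessed with `[[`, `id_column` and `distance_column` can be given either as names or as numeric indices, mirroring the argument descriptions above.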

elizagrames/synthesisr documentation built on May 26, 2019, 10:34 a.m.
