Sections: Description, Details, Running the User Interface, Backend, Wrapper Functions, Dissimilarity Functions, Matrix Summary Functions
This package provides an interactive way to visualize potentially duplicated records across tabular data sets by calculating dissimilarity scores on user-specified columns in the data.
Find matching patient records across tabular datasets
The user interface can be invoked with the function launch. This will launch the app in your browser.
The backend to the user interface is a modular set of functions that can calculate dissimilarity scores on any column(s) of the data. Once dissimilarity scores are calculated, they are weighted by importance, summed, and scaled from zero to one. The resulting matrix is traversed, and the indices of record pairs whose scores fall below the given threshold are returned.
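The weight, sum, scale, and threshold steps can be sketched in a few lines. This is a conceptual illustration in plain Python, not the package's R implementation: the function name combine_and_match is hypothetical, dividing by the sum of the weights is only one assumed way to keep scores between zero and one, and the strict "less than" comparison against the threshold is also an assumption.

```python
def combine_and_match(matrices, weights, threshold):
    """Weight, sum, and rescale dissimilarity matrices, then list close pairs.

    Hypothetical helper for illustration only. Each matrix is a list of rows
    with entries in [0, 1], where 0 means a perfect match.
    """
    n_rows = len(matrices[0])
    n_cols = len(matrices[0][0])
    total_weight = sum(weights)

    # Accumulate the weighted sum of all matrices, cell by cell.
    combined = [[0.0] * n_cols for _ in range(n_rows)]
    for matrix, weight in zip(matrices, weights):
        for i in range(n_rows):
            for j in range(n_cols):
                combined[i][j] += weight * matrix[i][j]

    # Assumed scaling: dividing by the total weight keeps scores in [0, 1].
    combined = [[value / total_weight for value in row] for row in combined]

    # Traverse the matrix and keep index pairs scoring below the threshold.
    matches = [
        (i, j)
        for i in range(n_rows)
        for j in range(n_cols)
        if combined[i][j] < threshold
    ]
    return combined, matches


# Made-up scores for two records in each of two data sets.
name_d = [[0.0, 0.9], [0.8, 0.1]]
age_d = [[0.1, 1.0], [0.7, 0.0]]

# Names count twice as much as ages in this example.
combined, matches = combine_and_match([name_d, age_d], weights=[2, 1], threshold=0.2)
print(matches)
```

With these made-up scores, only the pairs on the diagonal stay below the threshold, so each record in the first data set is matched to the record at the same position in the second.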
The wrapper functions provide a way to programmatically execute the distance functions on the data. They return a list of matrices and a list of matching indices, respectively.
processFunctionList
matchEpiData
Each dissimilarity function returns a distance matrix scaled from 0 to 1 where 0 indicates a perfect match and 1 indicates no match. The following distances are available:
ageDists
dateDists
genderDists
genericDists
locationDists
nameDists
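To make the "0 is a perfect match, 1 is no match" convention concrete, here is a minimal sketch of a scaled distance matrix for a numeric column such as age. It is written in plain Python for illustration and does not reproduce ageDists or any other package function: the name scaled_abs_dists is hypothetical, and dividing by the largest observed difference is an assumed scaling rule.

```python
def scaled_abs_dists(values_a, values_b):
    """Pairwise absolute differences, rescaled so the largest equals 1.

    Hypothetical helper for illustration only. Rows correspond to values_a
    and columns to values_b.
    """
    raw = [[abs(a - b) for b in values_b] for a in values_a]
    largest = max(max(row) for row in raw)

    # If every pair is identical, all distances are already 0.
    if largest == 0:
        return [[0.0 for _ in row] for row in raw]

    # Assumed scaling: divide by the largest difference to land in [0, 1].
    return [[d / largest for d in row] for row in raw]


ages_a = [30, 45]
ages_b = [30, 50, 70]
dists = scaled_abs_dists(ages_a, ages_b)
print(dists)
```

The first row compares age 30 against 30, 50, and 70, so it runs from a perfect match (0) to the most distant pair in the data (1).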
Once the matrices are computed and stored in a list, weights are applied and the matrices are summed. When summing, missing values are given a user-defined weight (default 0.5). The following functions work with the matrices:
collapseDistMatrices
returnMatches
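One way to read the treatment of missing values is that a missing score is replaced by a neutral value before the weighted sum is taken. The sketch below illustrates that reading in plain Python. It is not the code of collapseDistMatrices: the function name collapse_with_missing and the parameter na_weight are hypothetical, None stands in for a missing value, and substituting the value directly into the sum is an assumption about what "given a weight" means.

```python
def collapse_with_missing(matrices, weights, na_weight=0.5):
    """Weighted average of dissimilarity matrices with missing entries.

    Hypothetical helper for illustration only. Missing scores (None) are
    replaced by na_weight before weighting, so a missing comparison counts
    as neither a match nor a mismatch when na_weight is 0.5.
    """
    n_rows = len(matrices[0])
    n_cols = len(matrices[0][0])
    total_weight = sum(weights)

    collapsed = [[0.0] * n_cols for _ in range(n_rows)]
    for matrix, weight in zip(matrices, weights):
        for i in range(n_rows):
            for j in range(n_cols):
                value = matrix[i][j]
                if value is None:
                    value = na_weight  # assumed substitution for missing data
                collapsed[i][j] += weight * value / total_weight
    return collapsed


# Made-up scores; None marks a comparison that could not be computed.
name_d = [[0.0, None], [1.0, 0.2]]
date_d = [[None, 0.4], [0.0, 0.0]]

collapsed = collapse_with_missing([name_d, date_d], weights=[1, 1])
print(collapsed)
```

In this example the pair at row 0, column 0 has a perfect name match but no date, so its combined score sits halfway between 0 and the substituted 0.5.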