View source: R/confirm_matches.R
confirm_matches | R Documentation |
confirm_matches takes the image matches produced by identify_matches and
displays them in an interactive Shiny app for visual inspection and
confirmation. Image matches with extremely low Hamming distances can be
optionally excluded, and pairwise duplicates can be detected and excluded as
well.
confirm_matches(
result,
remove_duplicates = TRUE,
batch_size = 100L,
thresholds = c(Identical = 80L, Match = 100L, `Likely match` = 120L, `Possible match`
= 150L),
previous = TRUE,
quiet = FALSE
)
result |
A data frame produced by identify_matches. |
remove_duplicates |
A logical scalar. Should x-y pairs which are identical to other x-y pairs be reduced to a single x-y pair? This step can be computationally expensive for large datasets, but can dramatically reduce the number of matches to be verified. |
batch_size |
An integer scalar. The number of images to display at a time in the Shiny app (default 100). |
thresholds |
A named integer vector. Which Hamming distances establish
thresholds for an "Identical" match (default 80L), a "Match" (default 100L), a
"Likely match" (default 120L), a "Possible match" (default 150L), and "No
match" (remaining values)? Image pairs with a distance equal to or less than
the "Identical" threshold will be considered exact duplicates and will not be
shown for verification in the comparison app. (Set "Identical" to -1L to
force manual verification of all image pairs). Remaining image pairs will be
grouped in the comparison app by these thresholds. Image pairs with distances
equal to or under the "Likely match" value will be given a default value of
"match" in the comparison app, while others will be given a default value of
"no match". If |
previous |
A logical scalar. Should the results of previous runs of
|
quiet |
A logical scalar. Should the function execute quietly, or should it return status updates throughout the function (default)? |
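The grouping described for the thresholds argument can be illustrated with a minimal base R sketch. It is not the package's internal implementation: the distance values are made up, and the use of cut() is an assumption chosen only to mirror the "equal to or less than" rule with the default thresholds from the usage above.

```r
# Hypothetical Hamming distances for five image pairs (illustrative values only)
distance <- c(12L, 95L, 110L, 140L, 300L)

# Default thresholds as given in the usage section
thresholds <- c(Identical = 80L, Match = 100L, `Likely match` = 120L,
                `Possible match` = 150L)

# Assign each distance to the first threshold it is equal to or less than;
# anything above the last threshold falls into "No match"
group <- cut(distance,
             breaks = c(-Inf, thresholds, Inf),
             labels = c(names(thresholds), "No match"),
             right = TRUE)

# Pairs at or under the "Identical" threshold would not be shown for verification
shown_in_app <- distance > thresholds[["Identical"]]

# Pairs at or under the "Likely match" threshold would default to "match"
default_status <- ifelse(distance <= thresholds[["Likely match"]],
                         "match", "no match")

print(data.frame(distance, group, shown_in_app, default_status))
```

Under these assumptions, setting Identical to -1L would make shown_in_app TRUE for every pair, which corresponds to forcing manual verification of all image pairs.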
The interface presents pairs of images alongside a best guess as to the match status ("Match" or "No match"). For matches which are correctly identified, no further action is necessary, while incorrect identifications can be corrected by clicking "Match" or "No match" next to the image pair. Images are presented in batches, and at any point the user can click the "Save and exit" button to close the comparison app and retrieve the results up through the last batch which was viewed. This means that even extremely large sets of potential matches can be manually verified over the course of several sessions.
Through the "Enable highlighting" button, specific matches can be highlighted for further follow-up after image comparison is finished.
The Shiny app will only launch in an interactive R session; if confirm_matches
is called in a non-interactive context, it will identify identical matches
according to the thresholds argument and return only those results.
A data frame with the following fields: index from the original result data
frame; a logical vector new_match_status, which is TRUE for confirmed matches,
FALSE for confirmed non-matches, and NA for matches which were not confirmed;
and a logical vector new_highlight, which is TRUE for any matches which were
highlighted using the in-app interface, FALSE for matches which were not
highlighted, and NA for matches which were not confirmed. Confirmation is
determined by how many pages into the Shiny app the user proceeded, and thus
how many pairings were viewed. If all pages are viewed, then the output will
have no NA values.
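As a rough illustration of how the returned data frame might be inspected, the sketch below builds a mock table with the three documented fields and summarises it. The values are invented for the example, and the helper variable names (confirmed, n_unconfirmed, matched_index, follow_up_index) are hypothetical rather than part of the package.

```r
# Mock output with the three documented fields (values are illustrative only)
change_table <- data.frame(
  index = 1:6,
  new_match_status = c(TRUE, TRUE, FALSE, TRUE, NA, NA),
  new_highlight = c(FALSE, TRUE, FALSE, FALSE, NA, NA)
)

# Rows with NA were on pages the user never reached
n_unconfirmed <- sum(is.na(change_table$new_match_status))

# Keep only the pairs that were actually confirmed
confirmed <- change_table[!is.na(change_table$new_match_status), ]

# Indices of confirmed matches, and of pairs highlighted for follow-up
matched_index <- confirmed$index[confirmed$new_match_status]
follow_up_index <- confirmed$index[confirmed$new_highlight]

print(n_unconfirmed)
print(matched_index)
print(follow_up_index)
```

A non-zero n_unconfirmed suggests that the app was closed before the final batch, so the remaining pairs still need to be reviewed.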
## Not run:
# Setup
sigs <- create_signature(test_urls)
matches <- match_signatures(sigs)
result <- identify_matches(matches)
# Assign the output of confirm_matches to retrieve results
change_table <- confirm_matches(result)
## End(Not run)