match_signatures: Match images based on colour signatures

match_signatures    R Documentation

Match images based on colour signatures

Description

match_signatures takes one or two vectors of image signatures and produces a Hamming distance matrix to identify matches.

Usage

match_signatures(
  x,
  y = NULL,
  distance = ~nearest * bilinear,
  compare_ar = TRUE,
  stretch = 1.2,
  mem_scale = 0.2,
  mem_override = FALSE,
  quiet = FALSE
)

Arguments

x, y

Vectors of class matchr_signature to be matched. If y is NULL (the default), each object in x will be matched against every other object in x. If y is supplied, each object in x will be matched against each object in y.

distance

A one-sided formula (or character string which can be coerced to a formula) with one or both of the terms nearest and bilinear, expressing how the Hamming distance between image signature vectors should be calculated. The default (~nearest * bilinear) takes the Hamming distances of each of the two image signature components and multiplies them together. Any arithmetical combination of these distances is a valid argument to distance, e.g. ~ nearest + log(bilinear).

compare_ar

A logical scalar. Should signatures only be compared for images with similar aspect ratios (default)? If TRUE, k-means clustering is used to identify breakpoints between aspect ratios that maximize between-group distance and minimize the total number of calculations that the function needs to execute. (Values of k between 3 and 8 are evaluated.) Image signatures from x are split into lists between these break points. This argument is forced to FALSE if either x or y has fewer than 10 non-NA elements.

stretch

A numeric scalar (default 1.2). When compare_ar is TRUE, image signatures from y are grouped using widened ranges so that matches falling across a break point are still caught: each range runs from the lower break point divided by stretch to the upper break point multiplied by stretch. Increasing this value may catch matches between extremely distorted images, but at the cost of potentially more false positives and substantially longer processing time.

mem_scale

A numeric scalar between 0 and 1. What portion of total system memory should be made available for a single distance matrix calculation (default 0.2)? Increasing this value might speed up function execution, but at the risk of significantly increased system instability.

mem_override

A logical scalar. Should the function attempt to run even if it detects insufficient system memory (default FALSE)? If so, the usual error for insufficient memory will be downgraded to a warning.

quiet

A logical scalar. Should the function execute quietly, or should it print status updates as it runs (default)?
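
To illustrate what an arithmetical combination in the distance argument means, here is a minimal base R sketch. It is not the package's internal code: the matrices nearest and bilinear hold made-up values, and combine_distances is a hypothetical helper that evaluates the right-hand side of a one-sided formula against the two component matrices.

```r
# Made-up stand-ins for the two per-component Hamming distance matrices
# (2 images in x, 3 images in y); these are NOT produced by matchr here.
nearest  <- matrix(c(2, 10, 7, 1, 12, 9), nrow = 2)
bilinear <- matrix(c(3, 8, 6, 2, 11, 10), nrow = 2)

# A character string can be coerced to a one-sided formula
distance <- stats::as.formula("~ nearest * bilinear")

# Hypothetical helper: evaluate the right-hand side of a one-sided formula,
# looking up the terms `nearest` and `bilinear` in a named list
combine_distances <- function(f, components) {
  stopifnot(inherits(f, "formula"), length(f) == 2L)  # `~` plus one RHS
  eval(f[[2]], envir = components, enclos = baseenv())
}

components <- list(nearest = nearest, bilinear = bilinear)

# Default-style combination: element-wise product of the two matrices
combined <- combine_distances(distance, components)

# Another arithmetical combination, as in the argument description
combined_alt <- combine_distances(~ nearest + log(bilinear), components)
```

Because the formula is evaluated element-wise, the combined matrix keeps the same dimensions as its components.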

Details

match_signatures identifies matching images. It takes one or two matchr_signature vectors and compares their signatures using a calculation based on the Hamming distance to find matches.
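
As a reminder of what a Hamming distance matrix contains, the following base R sketch counts differing positions between short binary vectors. The logical vectors are made-up stand-ins for image signatures, and hamming_matrix is a hypothetical helper, not a matchr function; the real signatures and their internal representation may differ.

```r
# Hypothetical helper: the Hamming distance between two equal-length binary
# vectors is the number of positions at which they differ
hamming_matrix <- function(x, y = x) {
  out <- matrix(0L, nrow = length(x), ncol = length(y))
  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      out[i, j] <- sum(x[[i]] != y[[j]])
    }
  }
  out
}

# Made-up 4-bit "signatures"
sig_x <- list(c(TRUE, FALSE, TRUE, TRUE),
              c(FALSE, FALSE, TRUE, FALSE))
sig_y <- list(c(TRUE, FALSE, TRUE, FALSE),
              c(FALSE, TRUE, FALSE, FALSE),
              c(TRUE, FALSE, TRUE, TRUE))

# Two vectors: length(sig_x) rows and length(sig_y) columns
Q <- hamming_matrix(sig_x, sig_y)

# One vector: a square, symmetric matrix with zeros on the diagonal
Q_self <- hamming_matrix(sig_x)
```

A distance of 0 (here between sig_x[[1]] and sig_y[[3]]) means the two vectors are identical.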

The function can optionally filter images by aspect ratio, so only images with very similar aspect ratios will be compared. This can remove potential false positives and possibly speed up function execution, if images are relatively evenly split between aspect ratios.
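
The aspect-ratio filtering can be pictured with the rough sketch below. It rests on several assumptions: the aspect ratios are made up, k is fixed at 3 rather than chosen from 3 to 8, and break points are taken as midpoints between sorted cluster centres, which may not be how the package derives them. Only the widening of the y ranges by stretch follows the argument description directly.

```r
set.seed(1)  # kmeans() uses random starting centres

# Made-up aspect ratios (width / height) for the images in x and y
ar_x <- c(0.66, 0.67, 0.75, 0.74, 1.00, 1.01, 1.33, 1.34, 1.50, 1.78)
ar_y <- c(0.70, 0.99, 1.30, 1.52, 1.80)

# Cluster the x aspect ratios (k fixed at 3 for this illustration)
km <- stats::kmeans(ar_x, centers = 3, nstart = 10)

# Assumption: break points are midpoints between adjacent cluster centres
centres <- sort(as.numeric(km$centers))
breaks <- (centres[-length(centres)] + centres[-1]) / 2

# Each group covers the range [lower, upper)
lower <- c(0, breaks)
upper <- c(breaks, Inf)
stretch <- 1.2

# x images fall into exactly one group
x_groups <- lapply(seq_along(lower), function(g) {
  which(ar_x >= lower[g] & ar_x < upper[g])
})

# y images are selected with bounds widened by `stretch`, so an image near
# a break point can appear in more than one group
y_groups <- lapply(seq_along(lower), function(g) {
  which(ar_y >= lower[g] / stretch & ar_y < upper[g] * stretch)
})
```

Each pair x_groups[[g]], y_groups[[g]] would then be compared separately, which is why the result holds one matrix per aspect ratio range.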

Value

A vector of class matchr_matrix, each element of which is a matrix of Hamming distances for the x and y signatures falling in a given aspect ratio range. If x and y are both present, each matrix will have length(x) rows and length(y) columns, and for the matrix Q the cell Q[i, j] will be the Hamming distance between images x[i] and y[j]. If y is not present, each matrix will be square, and the cell Q[i, j] will be the Hamming distance between images x[i] and x[j]. The formula supplied to the distance argument will be present as an additional attribute of the return vector, named formula.
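
To show how the Q[i, j] indexing can be read, here is a small sketch on a plain base R matrix. The values and the threshold are made up for illustration, and the sketch does not use or depend on the matchr_matrix class itself.

```r
# Made-up 3 x 4 distance matrix: rows are images in x, columns images in y
Q <- matrix(c( 0, 40, 55,
              38,  2, 61,
              52, 47,  3,
              60,  5, 44), nrow = 3)

# Hypothetical cut-off; a suitable value depends on the data and on the
# formula passed to `distance`
threshold <- 10

# Index pairs (i, j) whose distance is at or below the cut-off
candidates <- which(Q <= threshold, arr.ind = TRUE)
candidates <- candidates[order(candidates[, "row"], candidates[, "col"]), ,
                         drop = FALSE]
```

Each row of candidates gives the position i in x and j in y of a low-distance pair.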

Examples

## Not run: 
# Setup
sigs <- create_signature(test_urls)

# Find matches within a single matchr_signature vector
match_signatures(sigs)

# Find matches between two matchr_signature vectors
match_signatures(sigs[1:8], sigs[9:15])

# To find matches between images with different aspect ratios, set `compare_ar = FALSE`
match_signatures(sigs, compare_ar = FALSE)

## End(Not run)

UPGo-McGill/matchr documentation built on July 19, 2023, 1:02 p.m.