multiMatch: Simple Multi-to-Multi Matching of (Concatenated) Terms

View source: R/multiMatch.R

multiMatch    R Documentation

Simple Multi-to-Multi Matching of (Concatenated) Terms

Description

This function provides convenient matching of multi-to-multi relationships between two objects/vectors. It was designed for finding common elements when entries on both sides may hold several concatenated terms (e.g. when comparing c("aa; bb", "cc") to c("bb; ab","dd"), i.e. to find 'bb' as the match between both objects).
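The idea described above can be sketched in base R. This is only an illustration built on strsplit and intersect, not the code used inside multiMatch; the object names xTags, yTags, commonTags and yHits are made up for this sketch.

```r
# Base-R illustration of multi-to-multi matching (not the multiMatch() implementation)
x <- c("aa; bb", "cc")
y <- c("bb; ab", "dd")

# Split each concatenated entry into its individual tags
xTags <- strsplit(x, "; ", fixed = TRUE)
yTags <- strsplit(y, "; ", fixed = TRUE)

# Tags occurring in both objects
commonTags <- intersect(unlist(xTags), unlist(yTags))

# For each entry of x: indexes of the y entries sharing at least one tag
yHits <- lapply(xTags, function(tx) {
  which(vapply(yTags, function(ty) any(tx %in% ty), logical(1)))
})
names(yHits) <- x

commonTags
yHits
```

Here only 'bb' is shared, so the first entry of x points to the first entry of y and the second entry of x has no match.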

Usage

multiMatch(
  x,
  y,
  sep = "; ",
  sep2 = NULL,
  method = "byX",
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)

Arguments

x

(vector or list) first object to compare; if vector, the (partially) concatenated identifiers (will be split using separator sep), or list of items to be matched (i.e. already split)

y

(vector or list) second object to compare; if vector, the (partially) concatenated identifiers (will be split using separator sep), or list of items to be matched (i.e. already split)

sep

(character, length=1) separator used to split concatenated identifiers (if x or y is a vector)

sep2

(character, length=1) optional separator used when method="matched" to concatenate all indexes of y for column y.allInd

method

(character) mode of operation: 'asIndex' to return the index of y (those entries that have matches) with names of x (the corresponding matching entries of x); see Details for the other options

silent

(logical) suppress messages

debug

(logical) display additional messages for debugging

callFrom

(character) allow easier tracking of message(s) produced

Details

method='byX' .. returns a data.frame with a view oriented towards the entries of x: character column x for the entire content of x; integer column x.Ind for the index of x; character column TagBest for the most frequent matching isolated tag/ID; integer column y.IndBest for the index of the most frequent matching y; character column y.IndAll for the indexes of all y matching any of the tags; character column y.Match for the entire content of the best matching y; character column y.Adj for y adjusted to the best matching y for easier subsequent perfect matching.

method=c("byX","filter") .. combined arguments, to keep only lines with any matches

method='byTag' .. returns a matrix (of integers) from the view of isolated tags from x (a separate line for each tag from x matching to y).

method=c("byTag","filter") .. if combined as arguments, this will return a data.frame for all unique tags with any matches between x and y, with additional columns x.AllInd for all matching x-indexes; y.IndBest for the best matching y index; x.n for the number of different x containing this tag; y.AllInd for all matching y-indexes.
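A tag-oriented view of this kind can be approximated in base R as follows. This is a rough sketch and not the multiMatch code: the helper toLong and the column names tag, xInd, xN and yInd are hypothetical and are not the column names returned by the function.

```r
# Base-R sketch of a per-tag table (hypothetical helper and column names)
aa <- c("m", "k", "j; aa", "m; aa; bb; o; ee", "n; dd; cc", "aa", "cc")
bb <- c("dd; r", "aa", "ee; bb; q; cc", "p; cc")

# Long format: one row per (entry index, isolated tag) pair
toLong <- function(v, sep = "; ") {
  tags <- strsplit(v, sep, fixed = TRUE)
  data.frame(ind = rep(seq_along(v), lengths(tags)),
             tag = unlist(tags), stringsAsFactors = FALSE)
}
xLong <- toLong(aa)
yLong <- toLong(bb)

# Unique tags found in both objects, in order of first appearance in aa
sharedTags <- intersect(xLong$tag, yLong$tag)

# Collapse all indexes carrying a given tag into one string
allInd <- function(long, tg) paste(long$ind[long$tag == tg], collapse = ",")

tagTable <- data.frame(
  tag  = sharedTags,
  xInd = vapply(sharedTags, function(tg) allInd(xLong, tg), character(1), USE.NAMES = FALSE),
  xN   = vapply(sharedTags, function(tg) sum(xLong$tag == tg), integer(1), USE.NAMES = FALSE),
  yInd = vapply(sharedTags, function(tg) allInd(yLong, tg), character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
tagTable
```

Each row holds one shared tag together with every position in aa and bb where it occurs, which is the kind of information the filtered 'byTag' output is described as providing.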

method='adjustXtoY' .. returns a vector with x adjusted to y, i.e. those elements of x that match are replaced by the exact corresponding term of y.
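The adjustment of x to y can be pictured with a small base-R sketch. It assumes that the "corresponding" y is the entry sharing the largest number of tags, with ties resolved in favour of the first y; that rule is an assumption made for this illustration and may differ from what multiMatch does. The names nShared, best, hasMatch and xAdj are made up for the sketch.

```r
# Base-R sketch of adjusting x to y (assumed rule: most shared tags, first y on ties)
x <- c("aa; bb", "cc", "dd; ee")
y <- c("bb; ab", "ee; dd; zz")

xTags <- strsplit(x, "; ", fixed = TRUE)
yTags <- strsplit(y, "; ", fixed = TRUE)

# Number of shared tags for every pair (rows = entries of x, columns = entries of y)
nShared <- matrix(0L, nrow = length(x), ncol = length(y))
for (i in seq_along(x)) {
  for (j in seq_along(y)) {
    nShared[i, j] <- sum(xTags[[i]] %in% yTags[[j]])
  }
}

# Best matching y per x, and a flag for entries of x without any match
best <- max.col(nShared, ties.method = "first")
hasMatch <- rowSums(nShared) > 0

# Replace matching entries of x by the full text of their best y; keep the rest unchanged
xAdj <- ifelse(hasMatch, y[best], x)
xAdj
```

After such an adjustment, matching entries of x are spelled exactly like their counterpart in y, so a plain match(xAdj, y) finds them directly.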

method=NULL .. If no term matching the options shown above is given, another version of 'asIndex' is returned, but with indexes to y _after_ splitting by sep. Again, this method can be filtered by using method="filter" to focus on the best matches to x.

Value

matrix, data.frame or list with matching results depending on method chosen

See Also

match; strsplit

Examples

aa <- c("m","k", "j; aa", "m; aa; bb; o; ee", "n; dd; cc", "aa", "cc")
bb <- c("dd; r", "aa", "ee; bb; q; cc", "p; cc")
(match1 <- multiMatch(aa, bb, method=NULL))      # index-type result, y indexed after splitting by sep
(match2 <- multiMatch(aa, bb, method="byX"))     # data.frame oriented towards the entries of aa
(match3 <- multiMatch(aa, bb, method="byTag"))   # matrix with one line per matching tag from aa
(match4 <- multiMatch(aa, bb, method=c("byTag","filter")))   # data.frame of unique tags with any matches


wrMisc documentation built on Sept. 11, 2024, 6:10 p.m.