match returns a vector of the positions of (first) matches of
its first argument in its second.
%in% is a more intuitive interface as a binary operator,
which returns a logical vector indicating if there is a match or not
for its left operand.
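As a minimal sketch of the difference between the two interfaces (the vectors x and tbl are made-up illustration data, not taken from the documentation):

```r
# Hypothetical example data
x   <- c("b", "z", "a", "b")
tbl <- c("a", "b", "c", "b")

pos <- match(x, tbl)  # position in tbl of the first match, NA if none
hit <- x %in% tbl     # TRUE/FALSE for each element of x

print(pos)
print(hit)
```

Both results have one entry per element of x; match answers "where?", while %in% answers "is it there at all?".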
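nomatch: the value to be returned in the case when no match is
found. Note that it is coerced to integer.
incomparables: a vector of values that cannot be matched. Any value in
x matching a value in this vector is assigned the nomatch value.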
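The effect of the nomatch and incomparables arguments can be sketched as follows. The input vectors are hypothetical illustration data; the calls use only the documented arguments of match.

```r
# Hypothetical example data
x   <- c(1, 2, 5, NA)
tbl <- c(5, 1, NA)

default_res <- match(x, tbl)                      # unmatched elements give NA
zero_res    <- match(x, tbl, nomatch = 0)         # unmatched elements give 0
skip_na     <- match(x, tbl, incomparables = NA)  # NA in x is never matched

print(default_res)
print(zero_res)
print(is.integer(zero_res))  # nomatch = 0 (a double) was coerced to integer
print(skip_na)
```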
%in% is currently defined as
"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0
Factors, raw vectors and lists are converted to character vectors, and
then x and table are coerced to a common type (the later
of the two types in R's ordering, logical < integer < numeric <
complex < character) before matching. If incomparables has
positive length it is coerced to the common type.
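A short sketch of how this coercion plays out in practice, using made-up inputs of mixed types:

```r
# A factor is matched by its labels, not by its integer codes
f <- factor(c("low", "high", "low"))
by_label <- match(f, c("high", "low"))

# integer vs character: both sides are compared as character
int_chr <- match(1:3, c("2", "3"))

# logical vs numeric: TRUE is compared as 1
lgl_num <- match(TRUE, c(0, 1))

print(by_label)
print(int_chr)
print(lgl_num)
```

Because the comparison happens after coercion, 2L matches "2" here even though the two values have different types.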
Matching for lists is potentially very slow and best avoided except in simple cases.
Exactly what matches what is to some extent a matter of definition.
For all types, NA matches NA and no other value.
For real and complex values, NaN values are regarded
as matching any other NaN value, but not matching NA,
where for complex x, real and imaginary parts must match both
(unless containing at least one NA).
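The treatment of missing values in numeric matching can be illustrated with a small sketch; the inputs are chosen purely for demonstration.

```r
# NA matches NA, NaN matches NaN, but the two do not match each other
pos <- match(c(NA, NaN), c(NaN, NA))
nan_in_na <- NaN %in% NA
na_in_nan <- NA %in% NaN

# For contrast, == does not treat NA as equal to NA
eq <- NA == NA

print(pos)
print(nan_in_na)
print(na_in_nan)
print(eq)
```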
Character strings will be compared as byte sequences if any input is
marked as "bytes", and otherwise are regarded as equal if they are
in different encodings but would agree when translated to UTF-8 (see
Encoding).
That %in% never returns NA makes it particularly
useful in if conditions.
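Since %in% yields only TRUE or FALSE, it can be used safely in an if condition even when the value may be missing. A minimal sketch, with a hypothetical variable named value:

```r
value <- NA_character_  # hypothetical input that happens to be missing

# value == "a" evaluates to NA, which if() rejects with an error
eq_result <- tryCatch(if (value == "a") "known" else "unknown",
                      error = function(e) "error")

# value %in% ... evaluates to FALSE, so if() works
in_result <- if (value %in% c("a", "b")) "known" else "unknown"

print(eq_result)
print(in_result)
```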
A vector of the same length as x.
match: An integer vector giving the position in table of
the first match if there is a match, otherwise nomatch.
If x[i] is found to equal table[j] then the value
returned in the i-th position of the return value is j,
for the smallest possible j. If no match is found, the value
is nomatch.
%in%: A logical vector, indicating if a match was located for
each element of x: thus the values are TRUE or
FALSE and never NA.
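The "smallest possible j" rule matters when the table contains repeated values. The sketch below uses made-up data; the which() loop is one possible way to recover all positions and is not part of match itself.

```r
# Hypothetical example data with repeated table entries
tbl <- c("a", "b", "a", "c", "b")
x   <- c("b", "a", "d")

first_pos <- match(x, tbl)  # only the first position of each value

# One way to list every matching position instead
all_pos <- lapply(x, function(v) which(tbl == v))

print(first_pos)
print(all_pos)
```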
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
pmatch and charmatch for (partial) string matching,
match.arg, etc for function argument matching.
findInterval similarly returns a vector of positions, but
finds numbers within intervals, rather than exact matches.
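The contrast between exact matching and interval lookup can be sketched with the same inputs passed to both functions; the break points are hypothetical.

```r
# Hypothetical break points and values
breaks <- c(0, 10, 20)
x      <- c(10, 15, 25)

exact    <- match(x, breaks)         # only 10 is exactly present
interval <- findInterval(x, breaks)  # index of the interval each value falls in

print(exact)
print(interval)
```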
is.element for an S-compatible equivalent of %in%.
unique (and duplicated) are using the same
definitions of “match” or “equality” as match(),
and these are less strict than
==, e.g., for NA and
NaN in numeric or complex vectors,
or for strings with different encodings, see also above.
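A brief sketch of that difference for numeric vectors, using a made-up vector v:

```r
v <- c(NA, 1, NaN, NA, NaN)  # hypothetical data

dup  <- duplicated(v)  # repeated NA and NaN count as duplicates
uniq <- unique(v)      # one NA and one NaN are kept
eq   <- v == v         # == gives NA wherever NA or NaN is involved

print(dup)
print(uniq)
print(eq)
```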
## The intersection of two sets can be defined via match():
## Simple version:
## intersect <- function(x, y) y[match(x, y, nomatch = 0)]
intersect # the R function in base is slightly more careful
intersect(1:10, 7:20)

1:10 %in% c(1,3,5,9)
sstr <- c("c","ab","B","bba","c",NA,"@","bla","a","Ba","%")
sstr[sstr %in% c(letters, LETTERS)]

"%w/o%" <- function(x, y) x[!x %in% y] #-- x without y
(1:10) %w/o% c(3,7,12)
## Note that setdiff() is very similar and typically makes more sense:
c(1:6,7:2) %w/o% c(3,7,12) # -> keeps duplicates
setdiff(c(1:6,7:2), c(3,7,12)) # -> unique values

## Illuminating example about NA matching
r <- c(1, NA, NaN)
zN <- c(complex(real = NA , imaginary = r ),
        complex(real = r , imaginary = NA ),
        complex(real = r , imaginary = NaN),
        complex(real = NaN, imaginary = r ))
zM <- cbind(Re=Re(zN), Im=Im(zN), match = match(zN, zN))
rownames(zM) <- format(zN)
zM ##--> many "NA's" (= 1) and the four non-NA's (3 different ones, at 7,9,10)

length(zN) # 12
unique(zN) # the "NA" and the 3 different non-NA NaN's
stopifnot(identical(unique(zN), zN[c(1, 7,9,10)]))

## very strict equality would have 4 duplicates (of 12):
symnum(outer(zN, zN, Vectorize(identical,c("x","y")),
             FALSE,FALSE,FALSE,FALSE))
## removing "(very strictly) duplicates",
i <- c(5,8,11,12) # we get 8 pairwise non-identicals :
Ixy <- outer(zN[-i], zN[-i], Vectorize(identical,c("x","y")),
             FALSE,FALSE,FALSE,FALSE)
stopifnot(identical(Ixy, diag(8) == 1))