grpDuplicated: Grouping by duplicated elements


View source: R/duplicated.matrix.R

Description

grpDuplicated is a generic function that returns an integer vector in which two input elements share the same output value if and only if they are identical to each other; in other words, duplicated elements are grouped together under a common integer group number. For matrices, this can be used to reconstruct the original matrix from the result of the unique function.
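
The grouping idea can be illustrated for a plain vector with base R alone. This is only a sketch of the semantics described above, not the package's implementation, and it assumes the default fromLast=FALSE, where groups are numbered in order of first appearance:

```r
# Base-R sketch of the grouping semantics (not the package's implementation).
x <- c("b", "a", "b", "c", "a")

u   <- unique(x)     # distinct values, in order of first appearance
grp <- match(x, u)   # group number of each element
grp                  # 1 2 1 3 2

# Indexing the unique values by group number recovers the input.
identical(u[grp], x) # TRUE
```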

Usage

grpDuplicated(x, incomparables = FALSE, factor=FALSE, ...)
## Default S3 method:
grpDuplicated(x, incomparables = FALSE, factor=FALSE, 
	fromLast=FALSE, signif=Inf, ...)
## S3 method for class 'matrix'
grpDuplicated(x, incomparables = FALSE, factor=FALSE, MARGIN = 1, 
	fromLast = FALSE, signif=Inf, ...)

Arguments

x

a vector or matrix of atomic mode "numeric", "integer", "logical", "complex", "character" or "raw". Non-atomic vectors and matrices are not currently supported.

incomparables

a vector of values that cannot be compared, as in base::unique.matrix. Only FALSE is supported.

factor

a logical scalar, indicating if the result should be given as an integer vector (default) or a factor.

fromLast

a logical scalar indicating if duplication should be considered from the last, as in base::unique.matrix.

...

arguments for particular methods.

MARGIN

a numeric scalar, the matrix margin to be held fixed, as in apply. Only MARGIN=0, MARGIN=1 and MARGIN=2 are allowed values.

signif

a numeric scalar, only applicable to numeric or complex x. If signif=NULL, then x is first passed to the signif function with the number of significant digits set to the C constant DBL_DIG, as explained in as.character. If signif=Inf (the default), then x is left untouched before duplicates are found. If signif is any other number, it specifies the number of significant digits passed to the signif function.
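
The effect of rounding before grouping can be sketched with base R. The example below calls base signif by hand and then groups with match; it is an illustration of the idea under that assumption, not a call into the package:

```r
# Base-R sketch: rounding to a few significant digits merges nearby values.
x <- c(1.2341, 1.2342, 2.5)

# Without rounding, all three values are distinct.
grp_exact <- match(x, unique(x))       # 1 2 3

# Rounding to 3 significant digits maps the first two values to 1.23.
xr <- signif(x, 3)
grp_rounded <- match(xr, unique(xr))   # 1 1 2
```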

Details

For atomic x, the implementation is based on std::map from the C++98 standard template library on systems where R CMD config CXX1X is empty, and on std::unordered_map otherwise.

The grpDuplicated function returns a vector of integers that agrees (up to signif digits) with the corresponding result of a call to unique when MARGIN=1 or MARGIN=2, as long as the same fromLast argument is used for both grpDuplicated and unique. Specifically, each of the following expressions recovers the original x values (attributes being ignored) when x is a matrix:

unique(x, MARGIN=1L, fromLast=tf)[grpDuplicated(x, MARGIN=1L, fromLast=tf),,drop=TRUE]
unique(x, MARGIN=2L, fromLast=tf)[, grpDuplicated(x, MARGIN=2L, fromLast=tf), drop=TRUE]

where tf above is either TRUE or FALSE.
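
The row-wise identity above can be checked without the package by building row group numbers by hand. The sketch below is a base-R stand-in that assumes fromLast=FALSE and a small integer matrix, so that pasted row keys compare exactly; the name row_key is illustrative only:

```r
# Base-R sketch of the MARGIN=1 reconstruction (not the package's code).
x <- matrix(c(1L, 2L, 1L, 3L,
              4L, 5L, 4L, 6L), ncol = 2)
# Rows of x are (1,4), (2,5), (1,4), (3,6).

# One string key per row; identical rows get identical keys.
row_key <- apply(x, 1L, paste, collapse = "\r")
grp <- match(row_key, unique(row_key))   # 1 2 1 3

# Indexing the unique rows by group number rebuilds the original matrix.
u <- unique(x, MARGIN = 1L)
identical(u[grp, , drop = FALSE], x)     # TRUE
```

The same construction applies to columns by forming the keys with apply(x, 2L, ...) and indexing the second dimension of unique(x, MARGIN = 2L).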

Value

If factor = FALSE, the result is an integer vector with all elements ranging from 1 to k, where k is the number of unique elements. For vector x or a matrix x with MARGIN=0, the output has the same length as the input; for matrix x, the output has length NROW(x) if MARGIN=1 and length NCOL(x) if MARGIN=2.

If factor = TRUE, the result is a factor, with levels being 1 through k.

In either case, the nlevels attribute of the result will be set to k.
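
The two return shapes can be pictured with a base-R sketch. It only mimics the layout described above (integer group numbers, or a factor with levels 1 through k) and does not reproduce the nlevels attribute set by the package:

```r
# Base-R sketch of the two result shapes (not the package's output).
x <- c(10, 20, 10, 30)

grp <- match(x, unique(x))             # integer group numbers: 1 2 1 3
k   <- length(unique(x))               # number of unique elements: 3

# Factor form, with levels 1 through k.
f <- factor(grp, levels = seq_len(k))
levels(f)                              # "1" "2" "3"
```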

Author(s)

Long Qu

See Also

duplicated.matrix

Examples

## load the package that provides grpDuplicated
library(uniqueAtomMat)

## prepare example data
set.seed(9992722L, kind="Mersenne-Twister")
trt.original=gl(5,8)[sample(40)]

## equivalent recoding: 
(trt.equivalent=grpDuplicated(trt.original, factor=TRUE))
## check equivalence: should be a permutation matrix:
(table(trt.original, trt.equivalent)!=0)*1

## equivalent recoding based on a design matrix
x.double=model.matrix(~trt.original)
(trt.equivalent=grpDuplicated(x.double, factor=TRUE, MARGIN=1))

## check equivalence: should be a permutation matrix:
(table(trt.original, trt.equivalent)!=0)*1

## check equivalence: recovering the original matrix from unique: 
x.uniq.row=unique(x.double, MARGIN=1L)
all.equal(x.double, x.uniq.row[trt.equivalent,], check.attributes=FALSE)

x.uniq.row=unique(x.double, MARGIN=1L, fromLast=TRUE)
all.equal(x.double, x.uniq.row[grpDuplicated(x.double, fromLast=TRUE),], check.attributes=FALSE)

uniqueAtomMat documentation built on July 9, 2017, 1:02 a.m.