duplicated.ff: Duplicated for ff and ffdf objects

Description

Determines duplicated elements of ff vectors and duplicated rows of ffdf objects, similar to duplicated in the base package.
Note that this function differs slightly from the duplicated method in the base package: it first orders the ffdf or ff_vector object and then applies duplicated. To get exactly the same result as the base package, you therefore need to order the ffdf or ff_vector beforehand. See the Examples.
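
The effect of ordering can be illustrated with base R alone. The following is a minimal sketch, not the ffbase implementation: it only shows that applying duplicated to sorted data flags different positions than applying it to the unsorted data, while the number of flagged elements stays the same. The variable names are illustrative.

```r
# Base R only; no ff objects are involved in this sketch.
x <- c("b", "a", "b", "a")

# Flags in the original order: the second "b" and second "a" are marked.
dup_original <- duplicated(x)

# Sort first, then flag: the marked positions now refer to the sorted data.
ord <- order(x)
dup_sorted <- duplicated(x[ord])

# The positions differ, but the number of duplicates is identical.
identical(dup_original, dup_sorted)
sum(dup_original) == sum(dup_sorted)
```

This is why a table comparing the base result with the ff result can show matching totals even when the individual flags do not line up on unsorted data.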

Usage

## S3 method for class 'ff'
duplicated(x, incomparables = FALSE, fromLast = FALSE, trace = FALSE, ...)

## S3 method for class 'ffdf'
duplicated(x, incomparables = FALSE, fromLast = FALSE, trace = FALSE, ...)

Arguments

x

ff object or ffdf object

incomparables

a vector of values that cannot be compared. FALSE is a special value, meaning that all values can be compared, and may be the only value accepted for methods other than the default. It will be coerced internally to the same type as x.

fromLast

logical indicating whether duplication should be considered from the last element, i.e., the last (or rightmost) of identical elements is the one that is not marked as duplicated

trace

logical indicating whether to show which chunk the function is currently processing

...

other parameters passed on to chunk
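
The meaning of fromLast can be sketched with the base duplicated function, assuming the ff and ffdf methods follow the same convention as described above. The example uses a plain numeric vector rather than an ff object, and the variable names are illustrative.

```r
# Base R only; shown to illustrate the fromLast convention.
x <- c(1, 2, 1, 3, 2)

# Default: the first occurrence of each value is not marked.
dup_first <- duplicated(x)

# fromLast = TRUE: the last occurrence of each value is not marked.
dup_last <- duplicated(x, fromLast = TRUE)

# Keeping the unmarked elements retains the last occurrence of each value.
kept_last <- x[!dup_last]
```

With fromLast = TRUE the retained elements are the rightmost copies, so their relative order can differ from the default.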

Value

A logical ff vector of length nrow(x) (for ffdf objects) or length(x) (for ff vectors) indicating whether each row or element is duplicated.

See Also

duplicated, ffdforder, fforder

Examples

## duplicated.ffdf - note that you need to order the records first
## in order to get the same results as the base duplicated method
library(ffbase)
data(iris)
irisdouble <- rbind(iris, iris)
irisdouble <- irisdouble[ sample(x=1:nrow(irisdouble), size=nrow(irisdouble)
                        , replace = FALSE), ]
ffiris <- as.ffdf(irisdouble)
duplicated(ffiris, by=10, trace=TRUE)
duplicated(ffiris$Sepal.Length, by=10, trace=TRUE)
table(duplicated(irisdouble), duplicated(ffiris, by=10)[])
irisdouble <- irisdouble[order(apply( irisdouble
                                    , FUN=function(x) paste(x, collapse=".")
                                    , MARGIN=1
                                    )), ]
ffiris <- as.ffdf(irisdouble)
table(duplicated(irisdouble), duplicated(ffiris, by=10)[])
table(duplicated(ffiris$Sepal.Width, by=10)[], duplicated(ffiris$Sepal.Width[]))

measures <- c("Sepal.Width","Species")
irisdouble <- irisdouble[order(apply( irisdouble[, measures]
                                    , FUN=function(x) paste(x, collapse=".")
                                    , MARGIN=1)), ]
ffiris <- as.ffdf(irisdouble)
table(duplicated(irisdouble[, measures]), duplicated(ffiris[measures], by=10)[])
table(duplicated(ffiris$Sepal.Width, by=10)[], duplicated(ffiris$Sepal.Width[]))

ffbase documentation built on Feb. 27, 2021, 5:06 p.m.