# funique: Fast duplicated and unique

In kit: Data Manipulation Functions Implemented in C

## Description

`fduplicated` and `funique` are counterparts of the base R functions `duplicated` and `unique`. They are slightly faster for vectors and much faster for `data.frame` inputs. Function `uniqLen` is equivalent to base R `length(unique(x))` or `data.table::uniqueN(x)`.
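The equivalences described above can be spot-checked on a small vector. The sketch below computes the base R reference results first and compares them with the `kit` functions only if the package is installed. The sample vector `x` and the names `dup_ref`, `uniq_ref` and `len_ref` are illustrative and not part of the package.

```r
# Illustrative input (an assumption, not taken from the package examples)
x <- c(3L, 1L, 3L, NA, 1L, NA)

# Base R reference results
dup_ref  <- duplicated(x)       # FALSE FALSE TRUE FALSE TRUE TRUE
uniq_ref <- unique(x)           # 3 1 NA
len_ref  <- length(unique(x))   # 3

# Compare against kit only when it is available
if (requireNamespace("kit", quietly = TRUE)) {
  print(identical(kit::fduplicated(x), dup_ref))
  print(identical(kit::funique(x), uniq_ref))
  print(kit::uniqLen(x) == len_ref)
}
```

Note that `NA` counts as a value of its own in the base R results, so the three distinct values here are `3`, `1` and `NA`.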

## Usage

```
fduplicated(x, fromLast = FALSE)
funique(x, fromLast = FALSE)
uniqLen(x)
```

## Arguments

| Argument | Description |
|---|---|
| `x` | A vector, data.frame or matrix. |
| `fromLast` | A logical value indicating whether the search should start from the end (`TRUE`) or from the beginning (`FALSE`). Default is `FALSE`. |
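To make the effect of `fromLast` concrete, here is a small sketch using the base R functions `duplicated` and `unique`, which take an argument of the same name. It assumes that `fduplicated` and `funique` follow the same convention, which this page implies but does not state explicitly. The vector `x` and the result names are illustrative.

```r
x <- c("a", "b", "a", "c", "b")

# Searching from the beginning: later repeats are flagged
dup_first <- duplicated(x)                    # FALSE FALSE TRUE FALSE TRUE

# Searching from the end: earlier repeats are flagged instead
dup_last <- duplicated(x, fromLast = TRUE)    # TRUE TRUE FALSE FALSE FALSE

# The retained occurrences differ, so the order of the result can differ too
uniq_first <- unique(x)                       # "a" "b" "c"
uniq_last  <- unique(x, fromLast = TRUE)      # "a" "c" "b"
```

The set of distinct values is the same in both cases. Only the choice of which occurrence is kept changes.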

## Value

Function `fduplicated` returns a logical vector, and `funique` returns a vector of the same type as `x` with duplicated values removed. Function `uniqLen` returns an integer.
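For `data.frame` input, the base R counterparts work row by row. The sketch below shows the shape of those base R results on a small, made-up data frame. It is an assumption, not something this page confirms, that the `kit` functions return analogous row-wise results. The data frame `df` and the result names are hypothetical.

```r
# Hypothetical data frame: rows 1 and 3 are identical
df <- data.frame(id  = c(1L, 2L, 1L, 2L),
                 grp = c("a", "b", "a", "c"))

row_dup   <- duplicated(df)    # one logical per row: FALSE FALSE TRUE FALSE
df_unique <- unique(df)        # data.frame holding rows 1, 2 and 4
n_unique  <- nrow(df_unique)   # 3 distinct rows
```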

## Author(s)

Morgan Jacob

## Examples

```
# Example 1: fduplicated
fduplicated(iris$Species)

# Example 2: funique
funique(iris$Species)

# Example 3: uniqLen
uniqLen(iris$Species)

# Benchmarks
# ----------
# x = sample(c(1:10,NA_integer_),1e8,TRUE) # 382 Mb
# microbenchmark::microbenchmark(
#   duplicated(x),
#   fduplicated(x),
#   times = 5L
# )
# Unit: seconds
#           expr  min   lq mean median   uq  max neval
#  duplicated(x) 2.21 2.21 2.48   2.21 2.22 3.55     5
# fduplicated(x) 0.38 0.39 0.45   0.48 0.49 0.50     5
#
# vs data.table
# -------------
# df = iris[,5:1]
# for (i in 1:16) df = rbind(df, df) # 338 Mb
# dt = data.table::as.data.table(df)
# microbenchmark::microbenchmark(
#   kit = funique(df),
#   data.table = unique(dt),
#   times = 5L
# )
# Unit: seconds
#       expr  min   lq mean median   uq  max neval
#        kit 1.22 1.27 1.33   1.27 1.36 1.55     5
# data.table 6.20 6.24 6.43   6.33 6.46 6.93     5 # (setDTthreads(1L))
# data.table 4.20 4.25 4.47   4.26 4.32 5.33     5 # (setDTthreads(2L))
#
# microbenchmark::microbenchmark(
#   kit=uniqLen(x),
#   data.table=uniqueN(x),
#   times = 5L, unit = "s"
# )
# Unit: seconds
#       expr  min   lq mean median   uq  max neval
#        kit 0.17 0.17 0.17   0.17 0.17 0.17     5
# data.table 1.66 1.68 1.70   1.71 1.71 1.72     5 # (setDTthreads(1L))
# data.table 1.13 1.15 1.16   1.16 1.18 1.18     5 # (setDTthreads(2L))
```

kit documentation built on March 9, 2021, 5:12 p.m.