all.equal.data.table: Equality Test Between Two Data Tables

View source: R/setops.R

all.equalR Documentation

Equality Test Between Two Data Tables

Description

Convenient test of data equality between data.table objects. Performs some factor level stripping.

Usage

  ## S3 method for class 'data.table'
all.equal(target, current, trim.levels=TRUE, check.attributes=TRUE,
    ignore.col.order=FALSE, ignore.row.order=FALSE, tolerance=sqrt(.Machine$double.eps),
    ...)

Arguments

target, current

data.tables to compare. If current is not a data.table, but check.attributes is FALSE, it will be coerced to one via as.data.table.

trim.levels

A logical indicating whether or not to remove all unused levels in columns that are factors before running equality check. It effect only when check.attributes is TRUE and ignore.row.order is FALSE.

check.attributes

A logical indicating whether or not to check attributes, will apply not only to data.table but also attributes of the columns. It will skip c("row.names",".internal.selfref") data.table attributes.

ignore.col.order

A logical indicating whether or not to ignore columns order in data.table.

ignore.row.order

A logical indicating whether or not to ignore rows order in data.table. This option requires datasets to use data types on which join can be made, so no support for list, complex, raw, but still supports integer64.

tolerance

A numeric value used when comparing numeric columns, by default sqrt(.Machine$double.eps). Unless non-default value provided it will be forced to 0 if used together with ignore.row.order and duplicate rows detected or factor columns present.

...

Passed down to internal call of all.equal.

Details

For efficiency data.table method will exit on detected non-equality issues, unlike most all.equal methods which process equality checks further. Besides that fact it also handles the most time consuming case of ignore.row.order = TRUE very efficiently.

Value

Either TRUE or a vector of mode "character" describing the differences between target and current.

See Also

all.equal

Examples

dt1 <- data.table(A = letters[1:10], X = 1:10, key = "A")
dt2 <- data.table(A = letters[5:14], Y = 1:10, key = "A")
isTRUE(all.equal(dt1, dt1))
is.character(all.equal(dt1, dt2))

# ignore.col.order
x <- copy(dt1)
y <- dt1[, .(X, A)]
all.equal(x, y)
all.equal(x, y, ignore.col.order = TRUE)

# ignore.row.order
x <- setkeyv(copy(dt1), NULL)
y <- dt1[sample(nrow(dt1))]
all.equal(x, y)
all.equal(x, y, ignore.row.order = TRUE)

# check.attributes
x = copy(dt1)
y = setkeyv(copy(dt1), NULL)
all.equal(x, y)
all.equal(x, y, check.attributes = FALSE)
x = data.table(1L)
y = 1L
all.equal(x, y)
all.equal(x, y, check.attributes = FALSE)

# trim.levels
x <- data.table(A = factor(letters[1:10])[1:4]) # 10 levels
y <- data.table(A = factor(letters[1:5])[1:4]) # 5 levels
all.equal(x, y, trim.levels = FALSE)
all.equal(x, y, trim.levels = FALSE, check.attributes = FALSE)
all.equal(x, y)

Rdatatable/data.table documentation built on Dec. 21, 2024, 3:08 a.m.