diffdf: diffdf

Description

Compares two dataframes and reports any differences between them.

Usage

diffdf(
  base,
  compare,
  keys = NULL,
  suppress_warnings = FALSE,
  strict_numeric = TRUE,
  strict_factor = TRUE,
  file = NULL,
  tolerance = sqrt(.Machine$double.eps),
  scale = NULL
)

Arguments

base

input dataframe

compare

comparison dataframe

keys

vector of variables (as strings) that defines a unique row in the base and compare dataframes
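
Before passing keys, it can help to confirm that they identify each row exactly once. The following is a minimal base R sketch; keys_are_unique is a hypothetical helper written for illustration and is not part of diffdf.

```r
# Hypothetical helper (not part of diffdf): TRUE when the key columns
# identify every row of `df` exactly once.
keys_are_unique <- function(df, keys) {
    stopifnot(all(keys %in% names(df)))
    # anyDuplicated() returns 0 when no row of the key columns repeats
    anyDuplicated(df[, keys, drop = FALSE]) == 0
}

DF <- data.frame(
    id = c(1, 2, 3, 3),
    visit = c(1, 1, 1, 2),
    value = c(10, 20, 30, 40)
)

keys_are_unique(DF, "id")              # FALSE: id 3 appears twice
keys_are_unique(DF, c("id", "visit"))  # TRUE: id and visit together are unique
```

Running the same check on both the base and the compare dataframe is a reasonable precaution, since the keys are meant to be unique in each.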

suppress_warnings

Whether to suppress warnings (logical, default = FALSE).

strict_numeric

Flag for strict numeric to numeric comparisons (default = TRUE). If FALSE, diffdf will cast integer to double where required for comparisons. Note that variables specified in the keys will never be cast.

strict_factor

Flag for strict factor to character comparisons (default = TRUE). If FALSE, diffdf will cast factors to characters where required for comparisons. Note that variables specified in the keys will never be cast.
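
To illustrate what the two strict flags guard against, the base R sketch below shows columns whose values match but whose classes differ. It only mimics the casts described above and does not call diffdf; the variable names are illustrative.

```r
# Values agree, classes do not: integer versus double
int_col <- c(1L, 2L, 3L)
dbl_col <- c(1, 2, 3)
class(int_col)                              # "integer"
class(dbl_col)                              # "numeric"
identical(int_col, dbl_col)                 # FALSE, the types differ
identical(as.double(int_col), dbl_col)      # TRUE once the integer is cast to double

# Values agree, classes do not: factor versus character
fct_col <- factor(c("a", "b", "c"))
chr_col <- c("a", "b", "c")
identical(fct_col, chr_col)                 # FALSE
identical(as.character(fct_col), chr_col)   # TRUE once the factor is cast to character
```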

file

Location and name of a text file to output the results to. Setting to NULL will cause no file to be produced.
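
As a rough sketch of how the file argument might be used, the snippet below writes the comparison report to a temporary file and reads it back. The dataframes and the report_path name are made up for illustration, and the call is guarded so that it is skipped when diffdf is not installed.

```r
# Guarded so the sketch is skipped when diffdf is not installed
if (requireNamespace("diffdf", quietly = TRUE)) {
    base_df    <- data.frame(id = c(1, 2, 3), value = c(10, 20, 30))
    compare_df <- data.frame(id = c(1, 2, 3), value = c(10, 25, 30))

    # tempfile() is used only to avoid writing into the working directory
    report_path <- tempfile(fileext = ".txt")

    diffdf::diffdf(
        base_df,
        compare_df,
        keys = "id",
        suppress_warnings = TRUE,
        file = report_path
    )

    # Show the written report on the console
    writeLines(readLines(report_path))
}
```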

tolerance

Set tolerance for numeric comparisons. Note that comparisons fail if abs(x - y)/scale > tolerance.

scale

Set scale for numeric comparisons. Note that comparisons fail if abs(x - y)/scale > tolerance. Setting scale to NULL is a slightly more efficient equivalent of scale = 1.
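
The interaction between tolerance and scale can be sketched in base R as follows. exceeds_tolerance is a hypothetical helper, not a diffdf function, and it assumes the difference is taken in absolute value and that scale = NULL behaves like scale = 1.

```r
# Hypothetical helper (not part of diffdf): applies the rule
# abs(x - y) / scale > tolerance, treating scale = NULL as scale = 1.
exceeds_tolerance <- function(x, y,
                              tolerance = sqrt(.Machine$double.eps),
                              scale = NULL) {
    delta <- abs(x - y)
    if (!is.null(scale)) delta <- delta / scale
    delta > tolerance
}

exceeds_tolerance(1, 1.1)                               # TRUE
exceeds_tolerance(1, 1.1, tolerance = 0.2)              # FALSE
exceeds_tolerance(1, 1.1, scale = 10, tolerance = 0.2)  # FALSE
exceeds_tolerance(1, 4, scale = 10, tolerance = 0.2)    # TRUE: 3 / 10 = 0.3 > 0.2
```

A larger scale shrinks every difference before it is compared with the tolerance, so raising scale and raising tolerance both make the comparison more lenient.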

Examples

x <- subset( iris, select = -Species)
x[1,2] <- 5
COMPARE <- diffdf( iris, x)
print( COMPARE )
print( COMPARE , "Sepal.Length" )

#### Sample data frames

DF1 <- data.frame(
    id = c(1,2,3,4,5,6),
    v1 = letters[1:6],
    v2 = c(NA , NA , 1 , 2 , 3 , NA)
)

DF2 <- data.frame(
    id = c(1,2,3,4,5,7),
    v1 = letters[1:6],
    v2 = c(NA , NA , 1 , 2 , NA , NA),
    v3 = c(NA , NA , 1 , 2 , NA , 4)
)

diffdf(DF1 , DF2 , keys = "id")

# We can control matching with scale/tolerance, for example:

DF1 <- data.frame(
    id = c(1,2,3,4,5,6),
    v1 = letters[1:6],
    v2 = c(1,2,3,4,5,6)
)
DF2 <- data.frame(
    id = c(1,2,3,4,5,6),
    v1 = letters[1:6],
    v2 = c(1.1,2,3,4,5,6)
)

diffdf(DF1 , DF2 , keys = "id")
diffdf(DF1 , DF2 , keys = "id", tolerance = 0.2)
diffdf(DF1 , DF2 , keys = "id", scale = 10, tolerance = 0.2)
 
# We can use strict_factor to compare factors with characters, for example:

DF1 <- data.frame(
    id = c(1,2,3,4,5,6),
    v1 = letters[1:6],
    v2 = c(NA , NA , 1 , 2 , 3 , NA), 
    stringsAsFactors = FALSE
)

DF2 <- data.frame(
    id = c(1,2,3,4,5,6),
    v1 = letters[1:6],
    v2 = c(NA , NA , 1 , 2 , 3 , NA),
    stringsAsFactors = TRUE  # explicit, since the default is FALSE from R 4.0.0
)

diffdf(DF1 , DF2 , keys = "id", strict_factor = TRUE)
diffdf(DF1 , DF2 , keys = "id", strict_factor = FALSE)
 

diffdf documentation built on March 26, 2020, 6:30 p.m.