Compare two data frames (or objects coercible to data frames) and produce a dataCompareR object containing
details of the matching and mismatching elements of the data. See vignette("dataCompareR")
for more details.
dfA
data frame. The first data object. dataCompareR will attempt to coerce all data objects to data frames.
dfB
data frame. The second data object. dataCompareR will attempt to coerce all data objects to data frames.
keys
String. Name(s) of the identifier column(s) used to match rows of dfA and dfB. Use NA if there is no identifier (rows are then compared in row order), a single string for one column name, or a character vector of column names to match on multiple columns.
roundDigits
Integer. If NA, numerics are not rounded before comparison. If specified, numerics are rounded to the specified number of decimal places using round.
mismatches
Integer. The maximum number of mismatches to assess, after which dataCompareR will stop (without producing a dataCompareR object). Designed to improve performance for large data sets.
trimChars
Boolean. If TRUE, strings and factors have whitespace trimmed before comparison.
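To make the effect of roundDigits and trimChars concrete, the following base R sketch mimics the normalisation those arguments describe. It is an illustration only and not dataCompareR's internal code; the vectors numA, numB, strA and strB are made up for the example.

```r
# Illustration only: base R equivalents of the documented normalisation,
# not dataCompareR's internal implementation.

# Hypothetical numeric columns
numA <- c(10.001, 20.5)
numB <- c(10.004, 20.9)

rawEqual     <- numA == numB                       # compared as-is (like roundDigits = NA)
roundedEqual <- round(numA, 2) == round(numB, 2)   # compared after rounding to 2 places

# Hypothetical character columns that differ only in surrounding whitespace
strA <- c("setosa ", " virginica")
strB <- c("setosa", "virginica")

untrimmedEqual <- strA == strB                     # compared as-is
trimmedEqual   <- trimws(strA) == trimws(strB)     # compared after trimming (like trimChars = TRUE)

print(rawEqual)
print(roundedEqual)
print(untrimmedEqual)
print(trimmedEqual)
```

Rounding removes the difference in the first pair of numbers but not the second, and trimming removes both string differences, which is the kind of mismatch these two arguments are meant to suppress.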
A dataCompareR object: an S3 object containing details of the comparison between the two data objects. It can be used with summary, print, saveReport and generateMismatchData.
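A minimal sketch of passing the returned object to the functions named above. It assumes the dataCompareR package is installed (the guard skips the example otherwise) and that generateMismatchData takes the comparison object followed by the two original data frames; check the package help if that call fails. saveReport is left out because it writes a file.

```r
# Assumption: dataCompareR is installed; otherwise the block is skipped.
if (requireNamespace("dataCompareR", quietly = TRUE)) {
  library(dataCompareR)

  # Same toy data as the Examples: a shortened copy of iris with one changed value
  iris2 <- iris[1:130, ]
  iris2[1, 1] <- 5.2

  # No key column, so rows are compared in row order
  comp <- rCompare(iris, iris2, keys = NA)

  print(comp)            # short overview of the comparison
  print(summary(comp))   # full summary report

  # Assumed argument order: comparison object, then the two original data frames
  mismatchRows <- generateMismatchData(comp, iris, iris2)
  str(mismatchRows, max.level = 1)
}
```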
Other dataCompareR.functions: generateMismatchData(), print.dataCompareRobject(), saveReport(), summary.dataCompareRobject()
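# Build a modified copy of iris: keep the first 130 rows and change two values
iris2 <- iris
iris2 <- iris2[1:130,]
iris2[1,1] <- 5.2
iris2[2,1] <- 5.2
# Compare by row order (no key column)
rCompare(iris, iris2, keys = NA)
compDetails <- rCompare(iris, iris2, keys = NA, trimChars = TRUE)
print(compDetails)
summary(compDetails)
# Compare using 'temperature' as the key column.
# Note: pressure2 is compared with itself here, so no mismatches are reported.
pressure2 <- pressure
pressure2[2,2] <- pressure2[2,2] + 0.01
rCompare(pressure2, pressure2, keys = 'temperature')
rCompare(pressure2, pressure2, keys = 'temperature', mismatches = 10)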
Running rCompare...
All columns were compared, 20 row(s) were dropped from comparison
There are 1 mismatched variables:
First and last 5 observations for the 1 mismatched variables
rowNo valueA valueB variable typeA typeB diffAB
1 1 5.1 5.2 SEPAL.LENGTH double double -0.1
2 2 4.9 5.2 SEPAL.LENGTH double double -0.3
Warning messages:
1: `select_()` is deprecated as of dplyr 0.7.0.
Please use `select()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.
2: `funs()` is deprecated as of dplyr 0.8.0.
Please use a list of either functions or lambdas:
# Simple named list:
list(mean = mean, median = median)
# Auto named with `tibble::lst()`:
tibble::lst(mean, median)
# Using lambdas
list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.
Running rCompare...
All columns were compared, 20 row(s) were dropped from comparison
There are 1 mismatched variables:
First and last 5 observations for the 1 mismatched variables
rowNo valueA valueB variable typeA typeB diffAB
1 1 5.1 5.2 SEPAL.LENGTH double double -0.1
2 2 4.9 5.2 SEPAL.LENGTH double double -0.3
dataCompareR is generating the summary...
Data Comparison
===============
Date comparison run: 2021-03-29 12:53:42
Comparison run on R version 4.0.3 (2020-10-10)
With dataCompareR version 0.1.3
Meta Summary
============
|Dataset Name |Number of Rows |Number of Columns |
|:------------|:--------------|:-----------------|
|iris |150 |5 |
|iris2 |130 |5 |
Variable Summary
================
Number of columns in common: 5
Number of columns only in iris: 0
Number of columns only in iris2: 0
Number of columns with a type mismatch: 0
No match key used, comparison is by row
Row Summary
===========
Total number of rows read from iris: 150
Total number of rows read from iris2: 130
Number of rows in common: 130
Number of rows dropped from iris: 20
Number of rows dropped from iris2: 0
Data Values Comparison Summary
==============================
Number of columns compared with ALL rows equal: 4
Number of columns compared with SOME rows unequal: 1
Number of columns with missing value differences: 0
Columns with all rows equal : PETAL.LENGTH, PETAL.WIDTH, SEPAL.WIDTH, SPECIES
Summary of columns with some rows unequal:
|Column |Type (in iris) |Type (in iris2) | # differences|Max difference | # NAs|
|:------------|:--------------|:---------------|-------------:|:--------------|-----:|
|SEPAL.LENGTH |double |double | 2|0.3 | 0|
Unequal column details
======================
#### Column - SEPAL.LENGTH
| | SEPAL.LENGTH (iris)| SEPAL.LENGTH (iris2)|Type (iris) |Type (iris2) | Difference|
|:--|-------------------:|--------------------:|:-----------|:------------|----------:|
|1 | 5.1| 5.2|double |double | -0.1|
|2 | 4.9| 5.2|double |double | -0.3|
Running rCompare...
All columns were compared, all rows were compared
All compared variables match
Number of rows compared: 19
Number of columns compared: 2
Warning message:
`arrange_()` is deprecated as of dplyr 0.7.0.
Please use `arrange()` instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.
Running rCompare...
All columns were compared, all rows were compared
All compared variables match
Number of rows compared: 19
Number of columns compared: 2
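The difference between row-order comparison (keys = NA) and key-based comparison can be sketched in base R. This is an illustration of the idea using merge, not a description of how dataCompareR matches rows internally; dfA, dfB and the id column are hypothetical.

```r
# Illustration only: row-order versus key-based matching in base R.

# Two hypothetical data frames holding the same rows in a different order
dfA <- data.frame(id = c(1, 2, 3), value = c(10, 20, 30))
dfB <- dfA[c(3, 1, 2), ]
rownames(dfB) <- NULL

# Row-order comparison: row i of dfA is compared with row i of dfB
rowOrderEqual <- dfA$value == dfB$value

# Key-based comparison: rows are first aligned on the id column
merged <- merge(dfA, dfB, by = "id", suffixes = c(".A", ".B"))
keyedEqual <- merged$value.A == merged$value.B

print(rowOrderEqual)
print(keyedEqual)
```

With shuffled rows, the positional comparison reports every value as different, while aligning on the key shows that the data are identical. This is why supplying keys matters whenever row order is not guaranteed.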