dframeCompare: Compare two dataframes for their differences


Description

Produces a minimal summary of where two datasets differ, for easy perusal. Useful for deciding whether two datasets are "meaningfully" different, which is a subjective judgement, by giving the user only the information needed to evaluate it.

Usage

dframeCompare(df1, df2, ids, comp_cols = NULL, ...)

Arguments

df1

(data.frame) The first dataset

df2

(data.frame) The second dataset

ids

(character vector) Columns that uniquely identify observations in the datasets to be compared

comp_cols

Defaults to NULL.

Details

Useful for confirming that an analysis is reproducible, or for checking updated data received from another party.

Assumptions made:

* The ID keys are unique
* Shared columns share the same name
* Factors should be compared as characters, not numerics
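The first and third assumptions can be checked or enforced before calling the function. The following is a minimal base R sketch, not part of the package: the helper names `ids_are_unique` and `factors_to_character` and the example data are hypothetical, and dframeCompare may already handle factors internally.

```r
# Hypothetical example data; not taken from the package.
df1 <- data.frame(
  id  = c(1, 2, 3),
  grp = factor(c("a", "b", "c")),
  stringsAsFactors = FALSE
)
ids <- "id"

# TRUE when no combination of the ID columns is repeated.
ids_are_unique <- function(df, ids) {
  !any(duplicated(df[, ids, drop = FALSE]))
}

# Convert factor columns to character so values are compared by label
# rather than by underlying integer code.
factors_to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  for (nm in names(df)[is_fac]) {
    df[[nm]] <- as.character(df[[nm]])
  }
  df
}

unique_ok <- ids_are_unique(df1, ids)
df1_chr   <- factors_to_character(df1)
```

Running a check like this on both inputs first makes it easier to tell a genuine data discrepancy from one caused by duplicated keys or mismatched factor levels.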

Value

(list) Containing three named objects:

* "differences_df" - (data.frame) Contains the ID columns, the columns from df1 and df2 (marked by a suffix, e.g. "_df1") in which any discrepancy was found, and only the rows in which a discrepancy was found. Only cells with identified discrepancies are populated.
* "id_overlap" - (list) A named list containing character vectors describing the intersection and set differences of ID values:
    * "ids_in_both"
    * "ids_in_df1"
    * "ids_in_df2"
* "column_overlap" - (list) A named list containing character vectors describing the intersection and set differences of column names:
    * "columns_in_both"
    * "columns_in_df1"
    * "columns_in_df2"
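To make the shape of the return value concrete, the following base R sketch builds similar objects by hand for a single ID column. It is an illustration under stated assumptions, not the package's implementation: the example data are invented, "ids_in_df1" and "columns_in_df1" are assumed to mean "present in df1 but not in df2" (and likewise for df2), and NA handling is left out.

```r
# Hypothetical example data; not taken from the package.
df1 <- data.frame(id = c("a", "b", "c"), x = 1:3,
                  y = c(TRUE, FALSE, TRUE), stringsAsFactors = FALSE)
df2 <- data.frame(id = c("b", "c", "d"), x = c(2L, 30L, 4L),
                  z = c(0.1, 0.2, 0.3), stringsAsFactors = FALSE)

# ID values shared by both datasets, and those found in only one of them.
id_overlap <- list(
  ids_in_both = intersect(df1$id, df2$id),
  ids_in_df1  = setdiff(df1$id, df2$id),
  ids_in_df2  = setdiff(df2$id, df1$id)
)

# The same set operations applied to column names.
column_overlap <- list(
  columns_in_both = intersect(names(df1), names(df2)),
  columns_in_df1  = setdiff(names(df1), names(df2)),
  columns_in_df2  = setdiff(names(df2), names(df1))
)

# Join on the ID so that the shared column x becomes x_df1 and x_df2,
# then keep only the rows where the two versions disagree.
merged  <- merge(df1, df2, by = "id", suffixes = c("_df1", "_df2"))
differs <- merged$x_df1 != merged$x_df2
differences_df <- merged[differs, c("id", "x_df1", "x_df2")]
```

Here only the row with id "c" differs in x (3 in df1, 30 in df2), so it is the only row kept in `differences_df`.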


ElizabethAB/rutils documentation built on May 6, 2019, 3:24 p.m.