compareDiff: Get differences between two data.frames

View source: R/compareTables.R

compareDiff    R Documentation

Get differences between two data.frames

Description

Get differences between two data.frames

Usage

compareDiff(
  newData,
  oldData,
  referenceVars = intersect(colnames(newData), colnames(oldData)),
  changeableVars = NULL
)

Arguments

newData

data.frame object representing the new data

oldData

data.frame object representing the old data

referenceVars

character vector of the columns in the data that are used as reference for the comparison.
If not specified, all columns present both in newData and oldData are considered.

changeableVars

character vector of the columns in the data for which you want to assess the change, e.g. variables that might have changed from the old to the new data.
If not specified, only 'Addition' and 'Removal' are detected.
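
A minimal usage sketch, assuming the clinUtils package is installed. The two data frames and their column names (USUBJID, AETERM, AESEV) are hypothetical and only illustrate how the arguments fit together; they are not taken from the package.

```r
# Hypothetical old and new versions of an adverse-event listing
oldData <- data.frame(
  USUBJID = c("01", "02", "03"),
  AETERM = c("Headache", "Nausea", "Rash"),
  AESEV = c("MILD", "MILD", "SEVERE"),
  stringsAsFactors = FALSE
)
newData <- data.frame(
  USUBJID = c("01", "02", "04"),
  AETERM = c("Headache", "Nausea", "Fatigue"),
  AESEV = c("MILD", "MODERATE", "MILD"),
  stringsAsFactors = FALSE
)

# Run the comparison only if clinUtils is available
if (requireNamespace("clinUtils", quietly = TRUE)) {
  diffData <- clinUtils::compareDiff(
    newData = newData,
    oldData = oldData,
    # columns that identify a record
    referenceVars = c("USUBJID", "AETERM"),
    # columns whose values may differ between versions
    changeableVars = "AESEV"
  )
  print(diffData)
}
```

Leaving out changeableVars in this call would, per the argument description above, restrict the result to 'Addition' and 'Removal' records.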

Value

Object of class 'diff.data', i.e. a data.frame with columns:

  • 'Comparison type': type of difference between the old and new data, either:

    • 'Change': records present both in new and old data, based on the reference variables, but with difference(s) in the changeable variables

    • 'Addition': records with reference variables present in new but not in old data

    • 'Removal': records with reference variables present in old but not in new data

  • 'Version': 'Previous' or 'Current', depending on whether the record represents content from the old or the new data, respectively

  • referenceVars

  • changeableVars

Identification of the differences between datasets

To identify the differences between datasets, the following steps are followed:

  1. records identical in the old and new datasets are removed (they are considered 'Identical' later on)

  2. records with a reference value present in the old dataset but not in the new dataset are considered 'Removal'

  3. records with a reference value present in the new dataset but not in the old dataset are considered 'Addition'

  4. records with a reference value present in both the new and old dataset that remain after the identical records are filtered out, and that differ in the changeable variables, are considered 'Change'
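
The steps above can be sketched in base R as follows. This is an illustration of the logic only, not the clinUtils implementation: the helper sketchDiff, its internal key-building approach and the example data are all hypothetical.

```r
# Hypothetical helper: a base-R sketch of the four steps, NOT clinUtils code
sketchDiff <- function(newData, oldData, referenceVars, changeableVars = NULL) {
  vars <- c(referenceVars, changeableVars)
  # Build one string key per record (assumes "\r" never occurs in the data)
  makeKey <- function(data, cols) {
    do.call(paste, c(unname(as.list(data[cols])), sep = "\r"))
  }

  # Step 1: drop records that are identical in both datasets
  fullNew <- makeKey(newData, vars)
  fullOld <- makeKey(oldData, vars)
  new <- newData[!fullNew %in% fullOld, vars, drop = FALSE]
  old <- oldData[!fullOld %in% fullNew, vars, drop = FALSE]

  refNew <- makeKey(new, referenceVars)
  refOld <- makeKey(old, referenceVars)

  # Steps 2 and 4: remaining old records are 'Removal' unless the reference
  # value also remains in the new data, in which case they are 'Change'
  old$`Comparison type` <- ifelse(refOld %in% refNew, "Change", "Removal")
  # Steps 3 and 4: same logic for the remaining new records
  new$`Comparison type` <- ifelse(refNew %in% refOld, "Change", "Addition")

  old$Version <- rep("Previous", nrow(old))
  new$Version <- rep("Current", nrow(new))

  res <- rbind(old, new)
  rownames(res) <- NULL
  res
}

# Hypothetical example data
oldData <- data.frame(
  USUBJID = c("01", "02", "03"),
  AETERM = c("Headache", "Nausea", "Rash"),
  AESEV = c("MILD", "MILD", "SEVERE"),
  stringsAsFactors = FALSE
)
newData <- data.frame(
  USUBJID = c("01", "02", "04"),
  AETERM = c("Headache", "Nausea", "Fatigue"),
  AESEV = c("MILD", "MODERATE", "MILD"),
  stringsAsFactors = FALSE
)

res <- sketchDiff(
  newData, oldData,
  referenceVars = c("USUBJID", "AETERM"),
  changeableVars = "AESEV"
)
print(res)
```

In this sketch, subject 01 is dropped as identical, subject 02 appears twice as 'Change' (once per version), subject 03 is a 'Removal' and subject 04 an 'Addition'. Because identical records are filtered on the reference and changeable variables together, calling the helper without changeableVars can only yield 'Addition' and 'Removal'.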

Author(s)

Laure Cougnaud


clinUtils documentation built on May 29, 2024, 5:01 a.m.