dataFusion: Fuse two datasets based on harmonized variable names

View source: R/matchResult.R

dataFusion    R Documentation

Fuse two datasets based on harmonized variable names

Description

Concatenate two datasets using a mapping of which columns correspond to each other, removing any NA values and keeping only the specified columns in the result.

Usage

dataFusion(d1, d2, fuseon, sourcecol)

Arguments

d1

A data.frame for the first dataset.

d2

A data.frame for the second dataset.

fuseon

A named vector of the harmonized features, where the names are the features in d1 and elements are features in d2.

sourcecol

Character name of the key column that identifies the source of each row (whether the row came from d1 or d2) after the two datasets are fused.
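
As a small illustration of the argument shapes described above, a fuseon map can be built from two parallel character vectors. The column names here ("age_yrs", "Age", and so on) and the value "Source" are hypothetical and chosen only for this sketch.

```r
# Hypothetical column names, for illustration only.
d1_cols <- c("age_yrs", "bmi")   # harmonized features as named in d1
d2_cols <- c("Age", "BMI")       # the matching features as named in d2

# Names are the d1 features, elements are the d2 features.
fuseon <- setNames(d2_cols, d1_cols)

names(fuseon)    # columns to take from d1
unname(fuseon)   # matching columns to take from d2

# Name of the column that will record where each fused row came from.
sourcecol <- "Source"
```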

Details

The datasets can have different column names and dimensions, with fuseon acting as the map between them. The implementation uses data.table and returns a data.table. The function does not check that matched columns have the same data type; values are simply coerced as necessary. The created sourcecol is conceptually the same as, and is implemented with, data.table's idcol (as in rbindlist).
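
The behaviour described above can be approximated with plain data.table calls. The following is a minimal sketch, not the package source: fuse_sketch, the example data, and the column names are all hypothetical, and it assumes that "removing any NA values" means dropping rows that contain a missing value.

```r
library(data.table)

# Hypothetical inputs: the same measurements stored under different names.
d1 <- data.frame(age_yrs = c(34L, NA, 51L), bmi = c(22.1, 27.4, 30.2), site = "A")
d2 <- data.frame(Age = c(45.5, 29), BMI = c(24.8, 26.0), batch = 1:2)

# Names are columns in d1, elements are the matching columns in d2.
fuseon <- c(age_yrs = "Age", bmi = "BMI")

# Illustrative helper (an assumption, not the DIVE implementation).
fuse_sketch <- function(d1, d2, fuseon, sourcecol) {
  # Keep only the mapped columns from each dataset.
  x1 <- as.data.table(d1)[, names(fuseon), with = FALSE]
  x2 <- as.data.table(d2)[, unname(fuseon), with = FALSE]
  # Rename the d2 columns to their d1 counterparts so the tables stack.
  setnames(x2, unname(fuseon), names(fuseon))
  # Stack the tables; idcol records the list name ("d1" or "d2") per row.
  fused <- rbindlist(list(d1 = x1, d2 = x2), idcol = sourcecol)
  # Assumption: drop rows that contain any NA.
  na.omit(fused)
}

fused <- fuse_sketch(d1, d2, fuseon, sourcecol = "Source")
print(fused)
```

In this sketch the integer age_yrs column from d1 and the double Age column from d2 end up in one double column, which mirrors the note that types are coerced rather than checked. The unmapped columns (site, batch) are dropped.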

Value

A data.table of the fused data.


avucoh/DIVE documentation built on Aug. 29, 2023, 6:02 p.m.