View source: R/dt_chk.R

best.merged.dt    R Documentation

Looks for the best merging operation(s) between two data.tables by trying a set of columns from the second one.

Description

Merges 'dt.x' with 'dt.y' on the 'by.x' column, trying each candidate column from 'dt.y' in turn, and reports which merging operation(s) performed best.

Usage

best.merged.dt(dt.x, dt.y, by.x, try.y = NULL, skip.incompatible.type = FALSE)

Arguments

dt.x

A data.table.

dt.y

Another data.table.

by.x

A character string specifying the name of a single column in 'dt.x'.

try.y

A character vector specifying one or more column names from 'dt.y' to try as merging keys. If NULL, all columns from 'dt.y' are tried (Default: try.y = NULL). Columns whose type differs from that of the 'by.x' column raise an error by default (see 'skip.incompatible.type').

skip.incompatible.type

A logical specifying whether candidate 'dt.y' columns whose type is incompatible with 'by.x' should be skipped automatically (skip.incompatible.type = TRUE) or raise an error (skip.incompatible.type = FALSE) (Default: skip.incompatible.type = FALSE).
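The call below is a minimal sketch of how these arguments might be combined. The toy tables and their column names ('id', 'value', 'code', 'rank') are invented for illustration. It assumes that the DTrsiv package is installed and exports best.merged.dt; the call is skipped otherwise. With try.y = NULL every column of 'dt.y' is a candidate, so the integer column 'rank' would be of a different type than the character column 'id', which is the situation 'skip.incompatible.type' is meant to handle.

```r
library(data.table)

# Toy data (hypothetical, not from the package).
dt.x <- data.table(id = c("a", "b", "c"), value = 1:3)
dt.y <- data.table(
  code = c("a", "b", "z"),  # character, same type as dt.x$id
  rank = c(1L, 2L, 3L)      # integer, differs from dt.x$id
)

# Only attempt the call if DTrsiv is available (assumption: the function
# is exported under the name shown in the Usage section).
if (requireNamespace("DTrsiv", quietly = TRUE)) {
  res <- DTrsiv::best.merged.dt(
    dt.x = dt.x, dt.y = dt.y, by.x = "id",
    try.y = NULL,                  # try every column of dt.y
    skip.incompatible.type = TRUE  # skip candidates of a different type
  )
  str(res, max.level = 1)          # inspect the top-level list elements
}
```

Setting skip.incompatible.type = FALSE in the same call would, according to the argument description, raise an error because of the 'rank' column.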

Value

A list containing:

  • 'best.merged.dt': the data.table resulting from the best merging operation, if a single operation performed best. If several merging operations tied for the best result, their names are returned as a character vector instead.

  • 'merging.results': a list of the merging operation results. Each result contains 2 elements:

    • 'merge.res': the data.table resulting from the merging operation.

    • 'NA.count': an integer vector giving the number of NAs in each column from 'dt.y' after the merging.
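To make the shape of 'merging.results' concrete, here is a rough sketch that builds a similar structure using data.table alone. It is not the package's implementation: the helper try_merge, the use of a left join (all.x = TRUE), and ranking candidates by their total NA count are all assumptions made for illustration.

```r
library(data.table)

# Toy data (hypothetical).
dt.x <- data.table(id = c("a", "b", "c"), value = 1:3)
dt.y <- data.table(
  code  = c("a", "b", "z"),
  alias = c("a", "b", "c"),
  score = c(10, 20, 30)
)

# Hypothetical helper: merge dt.x with dt.y on one candidate column and
# count the NAs left in the columns that came from dt.y.
try_merge <- function(col) {
  res <- merge(dt.x, dt.y, by.x = "id", by.y = col, all.x = TRUE)
  y.cols <- intersect(names(res), names(dt.y))
  list(
    merge.res = res,
    NA.count  = vapply(y.cols, function(cn) sum(is.na(res[[cn]])), integer(1))
  )
}

candidates <- c("code", "alias")
merging.results <- lapply(setNames(candidates, candidates), try_merge)

# Assumed criterion: the fewer NAs overall, the better the merge.
total.na <- vapply(merging.results, function(r) sum(r$NA.count), integer(1))
names(which.min(total.na))
```

In this toy case, merging on 'alias' matches every row of 'dt.x', while merging on 'code' leaves one row unmatched, so 'alias' has the lower NA total.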

Author(s)

Yoann Pageaud.


YoannPa/DTrsiv documentation built on Nov. 28, 2022, 5:44 p.m.