make_join_safer: Make a join function safer.


View source: R/joins.r

Description

Make a join function safer.

Usage

make_join_safer(join_fn, fast = TRUE)

Arguments

join_fn

Original join function to wrap

fast

Logical. If TRUE (the default), run the fast check; if FALSE, run the thorough check.

Value

A new join function

The new join function takes the arguments x, y, by, ... and allow_cartesian = FALSE. The goal is to emulate merge(), which will raise an error if rows aren't uniquely identified. By contrast, dplyr follows SQL and performs a cartesian join of the matching rows when the by variables don't uniquely identify rows.

Additionally, the by variables are required. They cannot be implicit. The ... arguments are passed to the dplyr join function.

If fast == TRUE, the join is done first, and the result is then checked: the number of rows must not be greater than the sum of the row counts of the input tables. If fast == FALSE, the by variables must uniquely identify rows in at least one of the tables, and this is verified before the join is done.
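The two checks described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual source: the name make_join_safer_sketch and the error messages are hypothetical, and the sketch assumes by is an unnamed character vector of columns present in both tables. It wraps base merge() so that it runs without dplyr, but any function accepting x, y, by and ... could be passed instead.

```r
# Hypothetical sketch of the wrapper; not the package's real implementation.
make_join_safer_sketch <- function(join_fn, fast = TRUE) {
  function(x, y, by, ..., allow_cartesian = FALSE) {
    # `by` has no default; forcing it here fails early if it was omitted.
    force(by)
    if (allow_cartesian) {
      return(join_fn(x, y, by = by, ...))
    }
    if (!fast) {
      # Thorough check: `by` must uniquely identify rows in x or in y.
      # Assumes `by` is an unnamed character vector of shared column names.
      is_unique <- function(df) anyDuplicated(df[, by, drop = FALSE]) == 0L
      if (!is_unique(x) && !is_unique(y)) {
        stop("`by` variables do not uniquely identify rows in either table")
      }
      return(join_fn(x, y, by = by, ...))
    }
    # Fast check: join first, then compare row counts.
    out <- join_fn(x, y, by = by, ...)
    if (nrow(out) > nrow(x) + nrow(y)) {
      stop("join produced more rows than nrow(x) + nrow(y)")
    }
    out
  }
}

# Usage: a many-to-one join passes, a many-to-many join is rejected.
safe_merge <- make_join_safer_sketch(merge)
lookup <- data.frame(cyl = c(4, 6, 8), label = c("four", "six", "eight"))
nrow(safe_merge(mtcars, lookup, by = "cyl"))
res <- try(safe_merge(mtcars, mtcars, by = "cyl"), silent = TRUE)
inherits(res, "try-error")
```

Note that the fast check only catches joins whose output exceeds nrow(x) + nrow(y), so a small amount of duplication can slip through; the thorough check is stricter but has to scan the key columns of both tables.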

See Also

merge(), dplyr::inner_join()

Examples

nrow(mtcars)  # 32
nrow(dplyr::inner_join(mtcars, mtcars, by = 'cyl'))  # 366
inner_join <- make_join_safer(dplyr::inner_join)
## Not run: 
inner_join(mtcars, mtcars, by = 'cyl')  # error

## End(Not run)

karldw/kdw.junk documentation built on Dec. 24, 2018, 1:07 a.m.