union_select: Union Select Columns From Multiple Datasets


View source: R/union_select.R

Description

Unions the records from multiple data sets, returning only the requested columns. Each requested column is assumed to exist under the same name in every data set.
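The behaviour described above can be approximated for plain data.frames in base R. This is only a sketch of the idea, not the package's implementation: `union_select_sketch` and its `cols` argument are hypothetical names, columns are given as a character vector rather than tidy-select expressions, and tbl_spark inputs are not handled.

```r
# Hypothetical helper, not part of sparkplugs: illustrates "select the same
# columns from every data set, then stack the records".
union_select_sketch <- function(.data, cols, .all = TRUE) {
  # Keep only the requested columns from each data set
  selected <- lapply(.data, function(d) d[, cols, drop = FALSE])
  # Stack the records from all data sets
  out <- do.call(rbind, selected)
  # Drop duplicate records when .all = FALSE
  if (!.all) out <- unique(out)
  rownames(out) <- NULL
  out
}

x <- data.frame(id = c(1, 2, 2), value = "a")
y <- data.frame(id = c(2, 3))

all_rows <- union_select_sketch(list(x, y), "id")
distinct_rows <- union_select_sketch(list(x, y), "id", .all = FALSE)
```

Here `all_rows` keeps every record from both inputs, while `distinct_rows` keeps one copy of each distinct `id`. Note that `y` has no `value` column, which is fine because only `id` is requested.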

Usage

union_select(.data, ..., .all = TRUE)

Arguments

.data

A list() of data.frames or tbl_sparks.

...

<tidy-select> One or more unquoted expressions separated by commas. Variable names can be used as if they were positions in the data.frame, so expressions like x:y can be used to select a range of variables.
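As an illustration of tidy-select expressions in general, the snippet below uses `dplyr::select()` on a local data.frame. It assumes dplyr is installed and says nothing about how `union_select()` evaluates `...` internally; the data frame `df` is made up for the example.

```r
library(dplyr)  # assumption: dplyr is installed

df <- data.frame(col1 = 1:3, col2 = 4:6, other = letters[1:3])

range_cols  <- names(select(df, col1:col2))       # a range of adjacent columns
helper_cols <- names(select(df, ends_with("1")))  # a selection helper
quoted_cols <- names(select(df, "col1"))          # a quoted column name
```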

.all

logical(1). Whether to keep duplicate records (TRUE, the default) or remove them (FALSE).

Value

A tbl_spark or a data.frame, depending on the type of the inputs in .data.

Examples

a <- data.frame(col1 = c(1:10, 10), col2 = 6)
b <- data.frame(col1 = c(1:5, 5), col2 = 4)
c <- data.frame(col1 = c(0, 1, 1, 2, 3, 5, 8))

# You can union specific columns
union_select(.data = list(a, b, c), "col1")

# And you can remove duplicate records
union_select(.data = list(a, b, c), ends_with("1"), .all = FALSE)

nathaneastwood/sparkplugs documentation built on Feb. 28, 2021, 4:57 p.m.