joins: Joining verbs

joinsR Documentation

Joining verbs

Description

Two-table joins. Check the "Joining verbs" vignette for more information.

Usage

## S3 method for class 'ExprBuilder'
anti_join(x, y, ...)

## S3 method for class 'data.table'
anti_join(x, ..., .expr = FALSE)

## S3 method for class 'ExprBuilder'
full_join(x, y, ..., sort = TRUE, allow = TRUE, .parent_env)

## S3 method for class 'data.table'
full_join(x, ...)

## S3 method for class 'ExprBuilder'
inner_join(x, y, ...)

## S3 method for class 'data.table'
inner_join(x, ..., .expr = FALSE)

## S3 method for class 'ExprBuilder'
left_join(
  x,
  y,
  ...,
  nomatch,
  mult,
  roll,
  rollends,
  .parent_env,
  .to_eager = FALSE
)

## S3 method for class 'data.table'
left_join(x, y, ..., allow = FALSE, .expr = FALSE)

mutate_join(x, y, ...)

## S3 method for class 'ExprBuilder'
mutate_join(
  x,
  y,
  ...,
  .SDcols,
  mult,
  roll,
  rollends,
  allow = FALSE,
  .by_each = NULL,
  .parent_env
)

## S3 method for class 'EagerExprBuilder'
mutate_join(x, ..., .parent_env = rlang::caller_env())

## S3 method for class 'data.table'
mutate_join(x, y, ...)

## S3 method for class 'ExprBuilder'
right_join(
  x,
  y,
  ...,
  allow = FALSE,
  which,
  nomatch,
  mult,
  roll,
  rollends,
  .selecting,
  .framing
)

## S3 method for class 'data.table'
right_join(x, y, ..., allow = FALSE, .expr = FALSE, .selecting, .framing)

## S3 method for class 'ExprBuilder'
semi_join(x, y, ..., allow = FALSE, .eager = FALSE)

## S3 method for class 'data.table'
semi_join(x, y, ..., allow = FALSE, .eager = FALSE)

Arguments

x

An ExprBuilder instance.

y

A data.table::data.table or, for some verbs (see details), a call to nest_expr().

...

Expressions for the on part of the join.

.expr

If the input is a data.table and .expr is TRUE, an instance of EagerExprBuilder will be returned. Useful if you want to add clauses to j, e.g. with mutate-table.express.

sort

Passed to data.table::merge.

allow

Passed as data.table's allow.cartesian.

.parent_env

See end_expr().

nomatch, mult, roll, rollends

See data.table::data.table.

.to_eager

Internal, should be left as FALSE in all external calls.

.SDcols

For mutate_join. See the details below.

.by_each

For mutate_join. See the details below.

which

If TRUE, return the row numbers that matched in x instead of the result of the join.

.selecting

One or more expressions, possibly contained in a call to list or ., that will be added to j in the same frame as the join.

.framing

Similar to .selecting, but added to the frame with frame_append().

.eager

For semi_join. If TRUE, it uses nest_expr() to build an expression like this instead of the default one. This uses the captured data.table eagerly, so use chain() when needed. The default is lazy.

Details

The following joins support nest_expr() in y:

  • anti_join

  • inner_join

  • right_join

The full_join method is really a wrapper for data.table::merge that specifies all = TRUE. The expression in x gets evaluated, merged with y, and the result is captured in a new ExprBuilder. Useful in case you want to keep building expressions after the merge.

Mutating join

The ExprBuilder method for mutate_join implements the idiom described in this link. The columns specified in .SDcols are those that will be added to x from y. The specification can be done by:

  • Using tidyselect::select_helpers.

  • Passing a character vector. If the character is named, the names are taken as the new column names for the values added to x.

  • A list, using base::list() or .(), containing:

    • Column names, either as characters or symbols.

    • Named calls expressing how the column should be summarized/modified before adding it to x.

The last case mentioned above is useful when the join returns many rows from y for each row in x, so they can be summarized while joining. The value of by in the join depends on what is passed to .by_each:

  • If NULL (the default), by is set to .EACHI if a call is detected in any of the expressions from the list in .SDcols

  • If TRUE, by is always set to .EACHI

  • If FALSE, by is never set to .EACHI

See Also

data.table::data.table, dplyr::join

Examples


lhs <- data.table::data.table(x = rep(c("b", "a", "c"), each = 3),
                              y = c(1, 3, 6),
                              v = 1:9)

rhs <- data.table::data.table(x = c("c", "b"),
                              v = 8:7,
                              foo = c(4, 2))


rhs %>%
    anti_join(lhs, x, v)


lhs %>%
    inner_join(rhs, x)


# creates new data.table
lhs %>%
    left_join(rhs, x)


# would modify lhs by reference
lhs %>%
    start_expr %>%
    mutate_join(rhs, x, .SDcols = c("foo", rhs.v = "v"))

# would modify rhs by reference, summarizing 'y' before adding it.
rhs %>%
    start_expr %>%
    mutate_join(lhs, x, .SDcols = .(y = mean(y)))


# creates new data.table
lhs %>%
    right_join(rhs, x)


# keep only columns from lhs
lhs %>%
    semi_join(rhs, x)


table.express documentation built on April 3, 2023, 5:43 p.m.