man-roxygen/class_container_extractors.R

# class_container_extractors.R


#' @name Container-extractors
#'
#' @title
#' Container: extract parts of a Container object
#'
#' @description
#' Usual operators to subset and manipulate an object of class
#' [`Container`][Container-class].
#'
#' @param x an object of class [`Container`][Container-class].
#' @param i,j elements to subset on and/or extract. These can be either
#' * an integer, logical or character vector, or
#' * an expression wrapped in function [e()]. See the descriptions of
#' arguments `i` and `j` of [`data.table`][data.table::data.table()].
#' @param name the name of a column in slot `table`.
#' @param drop a logical. If `TRUE`, the result is coerced to the lowest
#' possible dimension. Never use this argument if expressions are passed
#' to `i` and/or `j`.
#' @param expr an [`expression`][base::expression()].
#' @param ... optional arguments passed to [`[.data.table`][data.table::data.table()]
#' such as `by`, `.SDcols`, etc.
#'
#' @return
#' All operators work on the `table` slot.
#'
#' They are not *endomorphisms* and **do not** return objects of class
#' [`Container`][Container-class]. Instead, a subset of slot `table` is
#' returned in various ways that depend on the operator and on the arguments'
#' signature.
#'
#' * Operator [`[`][Container-extractors] always returns a
#' [`data.table`][data.table::data.table()], sometimes invisibly. This
#' depends on what is passed to argument `j`.
#' * Operators [`[[`][Container-extractors] and [`$`][Container-extractors]
#' both return a vector with a class that matches the extracted column's class.
#'
#' @details
#' To modify subsets of the `table` stored in an instance of class
#' [`Container`][Container-class], use either \pkg{data.table}'s
#' [`:=()`][data.table::assign] operator or [set()][data.table::assign] function.
#' Modifications are done *by reference*.
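#'
#' As a minimal sketch of what *by reference* means, the following uses a
#' plain [`data.table`][data.table::data.table()] rather than a
#' [`Container`][Container-class]. The object `dt` and the column `field5`
#' are hypothetical and only serve as an illustration.
#'
#' ```r
#' library(data.table)
#'
#' dt <- data.table(field2 = c(1L, 2L, 3L))
#'
#' # `:=` adds column field5 to dt in place; no copy, no reassignment.
#' dt[, field5 := field2 * 2L]
#'
#' # set() overwrites the first value of field2, also in place.
#' set(dt, i = 1L, j = "field2", value = 10L)
#' ```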
#'
#' Package \pkg{cargo} needs to know you are passing (unevaluated)
#' \pkg{data.table} expressions to the operator [`[`][Container-extractors].
#' To do so, wrap arguments `i` and `j` with function [e()]. See examples below.
#'
#' @note
#' **This note is irrelevant for most users, but must still appear somewhere.**
#'
#' The [`data.table`][data.table::data.table()] class relies on the
#' *non-standard evaluation* mechanism of \R: expressions passed to function
#' arguments are not evaluated upon a function call. Instead, \R evaluates them
#' only when it is forced to.
#'
#' Unfortunately, package \pkg{methods}' dispatch mechanism breaks this
#' feature, because it needs to know the arguments' signature to dispatch a
#' method on it. This is very problematic for what \pkg{cargo} is trying to
#' achieve.
#'
#' The workaround is to protect expressions from early evaluation. \R
#' only needs to know they are proper expressions, not their result. Hence, the
#' solution is to wrap expressions passed to `i` and `j` arguments, so that they
#' can be evaluated within the frame of `table` later. The function [e()]
#' is a convenient shortcut that does just that.
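#'
#' The idea can be sketched with base \R alone. The helper `protect()` below
#' is a hypothetical stand-in written for this illustration; it is not
#' [e()], whose implementation may differ.
#'
#' ```r
#' # Capture the argument as an unevaluated call instead of its value.
#' protect <- function(x) substitute(x)
#'
#' df <- data.frame(a = 1:3, b = c(10, 20, 30))
#'
#' # A variable named `a` need not exist in the calling frame:
#' # the expression is captured here, not evaluated.
#' cond <- protect(a > 1L)
#'
#' # Evaluate it later, using the columns of df as variables.
#' df[eval(cond, envir = df), ]
#' ```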
#'
#' @examples
#' ## Construct an object of class Container from a data.table and a Schema.
#' myTable <- data.table::data.table(
#'     field1 = c("s1", "s2", "s3"),
#'     field2 = c(1L, 2L, 3L),
#'     field3 = c(47.13, 48.72, 53.32),
#'     field4 = c(-122.56, -79.12, -114.67)
#' )
#'
#' mySchema <- Schema(
#'     inputs = c("field1", "field2", "field3", "field4"),
#'     prototypes = list(
#'         Prototype(character()),
#'         Prototype(integer()),
#'         Prototype(numeric()),
#'         Prototype(numeric()))
#' )
#'
#' myCont <- Container(myTable, mySchema)
#'
#' ## Say we need to compute conditional sums of values in numeric columns.
#' myTable[field2 > 0L, base::colSums(.SD),
#'         .SDcols = c("field2", "field3", "field4")]
#'
#' ## The cargo way would be to write
#' myCont[cargo::e(field2 > 0L), cargo::e(base::colSums(.SD)),
#'        .SDcols = c("field2", "field3", "field4")]
#'
#' ## The latter is equivalent to writing this expression.
#' cargo::table(myCont)[field2 > 0L, base::colSums(.SD),
#'                      .SDcols = c("field2", "field3", "field4")]
#'
#' @family Container
jeanmathieupotvin/cargo documentation built on Oct. 27, 2020, 5:22 p.m.