#' Append one Crunch dataset to another
#'
#' With Crunch, you can add additional rows to a dataset by appending a second
#' dataset to the bottom of the original dataset. Crunch makes intelligent
#' guesses to align the variables between the two datasets and to harmonize the
#' categories and subvariables of variables, as appropriate.
#'
#' Variables are matched between datasets based on their aliases. A variable
#' present in only one of the two datasets is kept: the rows that come from the
#' dataset lacking it are filled with missing values. For variables present in
#' both datasets, you will get the best results if you ensure that the two
#' datasets have the same variable names and types, and that their categorical
#' and array variables have consistent categories. To preview how datasets
#' will align when appended, see [compareDatasets()].
#'
#' Particularly if you're appending to datasets that are already shared with
#' others, you may want to use the fork-edit-merge workflow when appending
#' datasets. This allows you to verify your changes before releasing them to
#' the other viewers of the dataset. To do this, fork the dataset with
#' [forkDataset()], append the new data to the fork, ensure that the append
#' worked as expected, and then merge the fork back to the original dataset
#' with [mergeFork()]. For more, see `vignette("fork-and-merge", package
#' = "crunch")`.
#'
#' @param dataset1 a CrunchDataset
#' @param dataset2 another CrunchDataset, or possibly a data.frame. If
#' `dataset2` is not a Crunch dataset, it will be uploaded as a new
#' dataset before appending. If it is a CrunchDataset, it may be subsetted with
#' a filter expression on the rows and a selection of variables on the columns.
#' @param upsert Logical: should the append instead "update" rows based on the
#' primary key variable and "insert" (append) where the primary key values are
#' new? Default is `FALSE`. Note that this upserting behavior requires a primary
#' key variable to have been set previously; see [pk()].
#' @return `dataset1`, updated with `dataset2`, potentially filtered on rows and
#' variables, appended to it.
#' @examples
#' \dontrun{
#' ds <- loadDataset("Survey, 2016")
#' new_wave <- loadDataset("Survey, 2017")
#' ds <- appendDataset(ds, new_wave)
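#'
#' # The sketches below illustrate the options described above. They are
#' # independent alternatives to the plain append, not steps to run in
#' # sequence. The aliases `age`, `gender`, and `respondent_id` are
#' # hypothetical; substitute variables that exist in your own datasets.
#'
#' # Preview how the variables will align before appending
#' summary(compareDatasets(ds, new_wave))
#'
#' # Append only some rows and variables of the second dataset
#' ds <- appendDataset(ds, new_wave[new_wave$age >= 18, c("age", "gender")])
#'
#' # Upsert: update rows whose primary key matches and insert the rest.
#' # This assumes `respondent_id` uniquely identifies rows.
#' pk(ds) <- ds$respondent_id
#' ds <- appendDataset(ds, new_wave, upsert = TRUE)
#'
#' # Fork-edit-merge: append to a fork, check the result, then merge back
#' fork <- forkDataset(ds)
#' fork <- appendDataset(fork, new_wave)
#' nrow(fork) # confirm that the row count is what you expect
#' ds <- mergeFork(ds, fork)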
#' }
#' @export
appendDataset <- function(dataset1, dataset2, upsert = FALSE) {
    stopifnot(is.dataset(dataset1))
    if (!is.dataset(dataset2)) {
        ## dataset2 is not a Crunch dataset (e.g. a data.frame), so upload it
        temp.ds.name <- paste("Appending to", name(dataset1), now())
        message("Creating ", dQuote(temp.ds.name), " as temporary dataset")
        dataset2 <- newDataset(dataset2, name = temp.ds.name)
        ## TODO: on exit, delete dataset2
    }
    ## Validate
    if (identical(self(dataset1), self(dataset2))) {
        halt("Cannot append dataset to itself")
    }
    ## Assemble the payload
    payload <- list(dataset = self(dataset2))
    ## Include the variable selection on dataset2, if appropriate
    payload$where <- variablesFilter(dataset2)
    ## And filter the rows, if appropriate
    payload$filter <- zcl(activeFilter(dataset2))
    ## As a precaution, delete the primary key on dataset1 so that this
    ## appends rather than upserts
    if (!upsert) {
        pk(dataset1) <- NULL
    }
    ## POST the batch. This will error with a useful message if it fails
    dataset1 <- addBatch(dataset1, body = payload)
    invisible(dataset1)
}