R/compute_parquet.R

#' Compute results to a Parquet file
#'
#' For a duckplyr frame, this function executes the query
#' and stores the results in a Parquet file,
#' without converting them to an R data frame.
#' The result is a duckplyr frame that can be used with subsequent dplyr verbs.
#' This function can also be used as a Parquet writer for regular data frames.
#'
#' @inheritParams rlang::args_dots_empty
#' @inheritParams compute.duckplyr_df
#' @param x A duckplyr frame or a regular data frame.
#' @param path The path of the Parquet file to create.
#' @param options A list of additional options passed to DuckDB when writing the Parquet file,
#'   see <https://duckdb.org/docs/sql/statements/copy.html#parquet-options>
#'   for details.
#'
#' @return A duckplyr frame.
#'
#' @export
#' @examples
#' library(duckplyr)
#' df <- data.frame(x = c(1, 2))
#' df <- mutate(df, y = 2)
#' path <- tempfile(fileext = ".parquet")
#' df <- compute_parquet(df, path)
#' explain(df)
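#'
#' # Hedged sketch, not part of the original examples: passing `options`.
#' # Assumption: the Parquet option `compression` from the DuckDB COPY
#' # documentation linked above can be supplied as a named list element,
#' # and "zstd" is an accepted value.
#' path_zstd <- tempfile(fileext = ".parquet")
#' df_zstd <- compute_parquet(df, path_zstd, options = list(compression = "zstd"))
#' file.exists(path_zstd)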
#' @seealso [compute_csv()], [compute.duckplyr_df()], [dplyr::collect()]
compute_parquet <- function(x, path, ..., prudence = NULL, options = NULL) {
  check_dots_empty()

  if (is.null(options)) {
    options <- list()
  }

  if (is.null(prudence)) {
    prudence <- get_prudence_duckplyr_df(x)
  }

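  # Convert the input to a DuckDB relation without materializing it in R
  rel <- duckdb_rel_from_df(x)

  # Execute the query and write the result directly to the Parquet file
  duckdb$rel_to_parquet(rel, path, options)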

  # If the path is a directory, assume the output consists of multiple files
  # and read them all back with a recursive glob
  if (dir.exists(path)) {
    path <- file.path(path, "**", "**.parquet")
  }

  read_parquet_duckdb(path, prudence = prudence)
}
duckdblabs/duckplyr documentation built on March 5, 2025, 3:46 a.m.