spark_write_delta: Write a 'spark_tbl' to a Delta file

Description Usage Arguments Details Examples

View source: R/read-write.R

Description

Write a spark_tbl to Delta.

Usage

spark_write_delta(.data, path, mode = "error", partition_by = NULL, ...)

Arguments

.data

a spark_tbl

path

string, the path where the file is to be saved.

mode

string, the write mode: "error" (default), "overwrite", "append", or "ignore"

partition_by

character, column name(s) to partition the data by on disk

...

any other named options. See details below.

Details

For Delta, a few additional options can be specified using ...:

compression

(default NULL) compression codec to use when saving to file. This can be one of the known case-insensitive shortened names: none, bzip2, gzip, lz4, snappy, and deflate.

replaceWhere

(default NULL) a predicate over partition columns; only data matching the predicate is overwritten (e.g. "date >= '2017-01-01' AND date <= '2017-01-31'").

overwriteSchema

(default FALSE) when overwriting a table using mode = "overwrite" without replaceWhere, you may still want to replace the schema of the data being written. Set this option to TRUE to replace the table's schema and partitioning.

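As a sketch of how these options combine, the following selectively overwrites one month of partitions (here sales_tbl and its date column are hypothetical; this assumes a running Spark session with the Delta package on the classpath):

```r
# sales_tbl is a hypothetical spark_tbl with a "date" column;
# overwrite only the January 2017 partitions, leaving the rest intact
sales_tbl %>%
  spark_write_delta(
    "/tmp/sales_delta",
    mode = "overwrite",
    partition_by = "date",
    replaceWhere = "date >= '2017-01-01' AND date <= '2017-01-31'"
  )
```

Without replaceWhere, mode = "overwrite" replaces the whole table; pair it with overwriteSchema = TRUE if the new data's schema should win.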
Examples

## Not run: 
# here using the open-source Delta package, resolved at session startup
spark_session(sparkPackages = "io.delta:delta-core_2.11:0.5.0")

iris_tbl <- spark_tbl(iris)

iris_tbl %>%
  spark_write_delta("/tmp/iris_tbl")

# you can go further and add to hive metastore like this:
spark_sql("CREATE TABLE iris_ddl USING DELTA LOCATION '/tmp/iris_tbl'")
# right now this throws a warning, you can ignore it.

## End(Not run)

danzafar/tidyspark documentation built on Sept. 30, 2020, 12:19 p.m.