repartition: Repartition a 'spark_tbl'


View source: R/spark_tbl.R

Description

Repartitions a spark_tbl. Optionally allows a vector of columns to be used for partitioning.

Usage

repartition(.data, n_partitions, partition_by)

Arguments

.data

a spark_tbl to be repartitioned

n_partitions

integer, the target number of partitions

partition_by

vector of column names used for partitioning, only supported for Spark 2.0+
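
To illustrate what partitioning by columns implies, the sketch below uses plain R (no Spark session) to mimic key-based partition assignment. It assumes repartition() follows Spark's usual behavior of sending rows with equal values in the partition_by columns to the same partition. The helper assign_partition() is hypothetical and is not part of tidyspark, and its factor-code-modulo scheme is only a stand-in for Spark's actual hash function.

```r
# Hypothetical helper, not part of tidyspark: a simplified stand-in for
# key-based partitioning. Rows with the same key always get the same id.
assign_partition <- function(keys, n_partitions) {
  # Map each distinct key to an integer code, then wrap it into
  # the range 0 .. n_partitions - 1.
  codes <- as.integer(factor(keys))
  (codes - 1L) %% n_partitions
}

# Assign each row of mtcars to one of 5 partitions, keyed on cyl.
part_id <- assign_partition(mtcars$cyl, 5L)

# Number of distinct partition ids seen for each cyl value.
ids_per_key <- tapply(part_id, mtcars$cyl, function(x) length(unique(x)))
print(ids_per_key)

# Number of partitions that actually receive rows.
n_used <- length(unique(part_id))
print(n_used)
```

Because cyl has only three distinct values in mtcars, at most three of the five requested partitions can receive rows under this scheme. The same consideration applies when choosing n_partitions for a low-cardinality partition_by column.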

Examples

## Not run: 
spark_session()
df <- spark_tbl(mtcars)

df %>% n_partitions() # 1

df_repartitioned <- df %>% repartition(5)
df_repartitioned %>% n_partitions() # 5

df_repartitioned <- df %>% repartition(5, c("cyl"))

spark_session_stop()

## End(Not run)

danzafar/tidyspark documentation built on Sept. 30, 2020, 12:19 p.m.