sdf_with_unique_id: Add a Unique ID Column to a Spark DataFrame

View source: R/sdf_interface.R

sdf_with_unique_id R Documentation

Add a Unique ID Column to a Spark DataFrame

Description

Add a unique ID column to a Spark DataFrame. The IDs are produced by Spark's monotonicallyIncreasingId function, which guarantees that they are unique and monotonically increasing but does not guarantee that they are sequential. The table is persisted immediately after the column is generated so that the column is stable; otherwise, the IDs can differ each time the table is recomputed.
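To illustrate why the IDs are unique and increasing but not sequential, here is a minimal base R sketch that needs no Spark connection. It assumes the layout that Spark's API documentation describes for the current implementation of this function: the partition index in the upper 31 bits and the record number within the partition in the lower 33 bits. The helper simulate_ids and its partition_sizes argument are hypothetical names used only for this illustration; they are not part of sparklyr or Spark.

```r
# Hypothetical helper (not part of sparklyr or Spark): imitates the ID layout
# documented for Spark's monotonically increasing ID function, assuming the
# partition index occupies the upper 31 bits and the record number the lower 33.
# R doubles are exact up to 2^53, so this only works for small partition counts.
simulate_ids <- function(partition_sizes) {
  ids <- lapply(seq_along(partition_sizes), function(i) {
    partition_index <- i - 1                      # partitions are numbered from 0
    record_numbers <- seq_len(partition_sizes[i]) - 1
    partition_index * 2^33 + record_numbers
  })
  unlist(ids)
}

# Two partitions holding 3 and 2 rows respectively.
ids <- simulate_ids(c(3, 2))
print(format(ids, scientific = FALSE))
# IDs are 0, 1, 2, 8589934592, 8589934593: unique and increasing,
# with a large gap at the partition boundary.
```

Under this assumed layout, the gap between consecutive IDs depends on how the rows are split across partitions, which is why the column should be treated as an identifier and not as a row number.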

Usage

sdf_with_unique_id(x, id = "id")

Arguments

x

A spark_connection, ml_pipeline, or tbl_spark.

id

The name of the column to host the generated IDs.
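Examples

The following is a usage sketch, not an official example. It assumes a local Spark installation is available to sparklyr (for instance one set up with spark_install()), and it uses the built-in iris data set and the column name "row_id" purely for illustration.

```r
library(sparklyr)
library(dplyr)

# Assumes a local Spark installation that sparklyr can find.
sc <- spark_connect(master = "local")

# Copy a small local data frame into Spark as a tbl_spark.
iris_tbl <- copy_to(sc, iris, "iris_tbl", overwrite = TRUE)

# Add the generated ID column under a custom name.
with_id <- sdf_with_unique_id(iris_tbl, id = "row_id")

# Inspect a few rows; the IDs are unique but need not be consecutive.
with_id %>%
  select(row_id, Species) %>%
  head() %>%
  print()

# Compare the number of rows with the number of distinct IDs.
counts <- with_id %>%
  summarise(n_rows = n(), n_ids = n_distinct(row_id)) %>%
  collect()
print(counts)

spark_disconnect(sc)
```

If the default column name is acceptable, the id argument can be omitted and the new column is called "id".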


sparklyr documentation built on Nov. 2, 2023, 5:09 p.m.