sdf_sum_col: Sum_col method


Description

This function performs an aggregation similar to a SQL GROUP BY. Note, however, that it does not behave identically to a typical ANSI SQL statement: the sum is appended as a new column to the returned data rather than the output being reduced to the columns parameterised in the call, so an additional select is required to obtain a conventional grouped result. Also, only a single sum-by column can be used.
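Because the sum is appended as an extra column, a follow-up select is needed to trim the result down to the grouping and aggregate columns. A minimal sketch of that pattern using sparklyr's `invoke()` on the returned jobj (the name of the appended column is not documented here, so `"sum(v)"` below is an assumption; inspect the actual names with `invoke(res, "columns")` first):

```r
library(sparklyr)
library(sparkts)

sc <- spark_connect(master = "local")

# A small example DataFrame as a jobj (columns "id" and "v" are illustrative)
df <- sdf_copy_to(sc, data.frame(id = c(1, 1, 2), v = c(10, 20, 5))) %>%
  spark_dataframe()

res <- sdf_sum_col(sc, df, group_by_cols = "id", sum_col_name = "v")

# Check what the appended sum column is actually called
invoke(res, "columns")

# Perform the additional select to keep only the grouping column and
# the appended sum (column name assumed, see above)
invoke(res, "select", "id", list("sum(v)"))
```

The `invoke(res, "select", "id", list(...))` form mirrors the Scala `Dataset.select(col: String, cols: String*)` signature, with the trailing varargs passed as a list.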

Usage

sdf_sum_col(sc, data, group_by_cols, sum_col_name)

Arguments

sc

A spark_connection.

data

A jobj: the Spark DataFrame on which to perform the function.

group_by_cols

c(String). A vector of column names to group by.

sum_col_name

String. A column to sum by.

Value

Returns a jobj.

Examples

## Not run: 
# Set up a spark connection
sc <- spark_connect(master = "local", version = "2.2.0")

# Extract some data
lag_data <- spark_read_json(
  sc,
  "lag_data",
  path = system.file(
    "data_raw/lag_data.json",
    package = "sparkts"
  )
) %>%
  spark_dataframe()

# Call the method (column names follow the example data above)
p <- sdf_sum_col(
  sc = sc, data = lag_data, group_by_cols = "id", sum_col_name = "v"
)

# Return the data to R
p %>% dplyr::collect()

spark_disconnect(sc = sc)

## End(Not run)

nathaneastwood/sparkts documentation built on May 25, 2019, 10:34 p.m.