powerIterationClustering: PowerIterationClustering
In danzafar/tidyspark: A Tidy Interface to Spark

Power Iteration Clustering (PIC) is a scalable graph clustering algorithm. Calling `ml_assign_clusters` runs the PIC algorithm and returns a cluster assignment for each input vertex.

ml_assign_clusters(
  data,
  k = 2L,
  initMode = c("random", "degree"),
  maxIter = 20L,
  sourceCol = "src",
  destinationCol = "dst",
  weightCol = NULL
)

`data`	a spark_tbl.
`k`	the number of clusters to create.
`initMode`	the initialization algorithm; either "random" or "degree".
`maxIter`	the maximum number of iterations.
`sourceCol`	the name of the input column for source vertex IDs.
`destinationCol`	the name of the input column for destination vertex IDs.
`weightCol`	the name of the weight column. If this is not set or is `NULL`, all weights are treated as 1.0.
`...`	additional argument(s) passed to the method.

A dataset with one row per vertex, containing the vertex id and the cluster assigned to it. Its schema is `id: integer`, `cluster: integer`.

ml_assign_clusters(spark_tbl) since 3.0.0

## Not run: 
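library(tibble)  # provides tribble()

# Edge list: one row per edge, with source vertex, destination vertex, and weight.
df <- spark_tbl(
  tribble(~src, ~dst, ~weight,
          0L, 1L, 1.0,
          0L, 2L, 1.0,
          1L, 2L, 1.0,
          3L, 4L, 1.0,
          4L, 0L, 0.1))

# Assign each vertex to one of k = 2 clusters (the default), using the edge weights.
clusters <- ml_assign_clusters(df, initMode = "degree", weightCol = "weight")
show(clusters)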

## End(Not run)
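
For intuition about what the PIC algorithm does with such an edge list, the following is a minimal base R sketch of the general power iteration clustering idea. It is not the implementation used by tidyspark or Spark, and it needs no Spark session. The names `edges`, `A`, `W`, `v`, and `result` are illustrative only. It assumes the edges are undirected, runs a fixed number of iterations rather than checking convergence, and uses `stats::kmeans` for the final clustering step.

```r
# Edge list from the example on this page (vertex ids are 0-based).
edges <- data.frame(
  src    = c(0L, 0L, 1L, 3L, 4L),
  dst    = c(1L, 2L, 2L, 4L, 0L),
  weight = c(1.0, 1.0, 1.0, 1.0, 0.1)
)

k        <- 2L   # number of clusters
max_iter <- 20L  # fixed iteration count (no early stopping in this sketch)

# Build a symmetric affinity matrix (assumption: edges are undirected).
n   <- max(edges$src, edges$dst) + 1L
idx <- cbind(edges$src + 1L, edges$dst + 1L)  # shift to R's 1-based indexing
A   <- matrix(0, n, n)
A[idx] <- edges$weight
A[idx[, 2:1, drop = FALSE]] <- edges$weight

# Row-normalize so that each row of W sums to 1.
degree <- rowSums(A)
W <- A / degree

# "degree"-style initialization: start from the normalized vertex degrees.
v <- degree / sum(degree)

# Power iteration: multiply by W, then rescale to unit L1 norm.
for (i in seq_len(max_iter)) {
  v <- as.vector(W %*% v)
  v <- v / sum(abs(v))
}

# Cluster the one-dimensional embedding with k-means.
set.seed(1)
fit <- kmeans(matrix(v, ncol = 1), centers = k, nstart = 10)
result <- data.frame(id = seq_len(n) - 1L, cluster = fit$cluster)
print(result)
```

The sketch stops after a fixed number of iterations because the iterated vector eventually flattens toward a constant; the intermediate values are what separate the tightly connected vertex groups. The numeric cluster labels themselves are arbitrary, so only the grouping of vertices is meaningful.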