sdf_rhyper: Generate random samples from a hypergeometric distribution
sdf_rhyper

R Documentation

Generate random samples from a hypergeometric distribution

Generator method for creating a single-column Spark dataframe composed of i.i.d. samples from a hypergeometric distribution.

sdf_rhyper(
  sc,
  nn,
  m,
  n,
  k,
  num_partitions = NULL,
  seed = NULL,
  output_col = "x"
)
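
As a minimal sketch of how the signature above might be called, the helper below generates samples and tabulates them. `tabulate_rhyper` is a hypothetical name introduced only for illustration, not part of sparklyr, and the sketch assumes that the dplyr package is installed and that the returned Spark dataframe can be used with dplyr verbs.

```r
# Hypothetical helper (not part of sparklyr): draw `nn` hypergeometric samples
# into a Spark dataframe and return a small table of counts per sampled value.
tabulate_rhyper <- function(sc, nn = 1000, m = 10, n = 7, k = 8) {
  sdf <- sparklyr::sdf_rhyper(
    sc,
    nn = nn,             # number of samples to generate
    m = m,               # successes in the population
    n = n,               # failures in the population
    k = k,               # draws per sample
    num_partitions = 4L, # illustrative value; NULL uses the cluster default
    seed = 42L,          # fixed seed, chosen here only for repeatability
    output_col = "x"
  )
  # Assumption: the result behaves like a dplyr-compatible Spark table,
  # so the counting happens in Spark and only the summary is collected.
  counts <- dplyr::count(sdf, x)
  dplyr::collect(dplyr::arrange(counts, x))
}
```

With an open connection, for example `sc <- sparklyr::spark_connect(master = "local")` on a machine with a local Spark installation, `tabulate_rhyper(sc)` should return one row per distinct sampled value.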

`sc`	A Spark connection.
`nn`	Sample size (the number of values to generate).
`m`	The number of successes in the population.
`n`	The number of failures in the population.
`k`	The number of draws.
`num_partitions`	Number of partitions in the resulting Spark dataframe (default: default parallelism of the Spark cluster).
`seed`	Random seed (default: a random long integer).
`output_col`	Name of the output column containing sample values (default: "x").
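
The roles of `nn`, `m`, `n`, and `k` can be illustrated without a Spark cluster using base R's `stats::rhyper`, which takes arguments of the same names. This is only a local sketch, and it assumes that `sdf_rhyper` follows the same parameterization; the specific values are arbitrary.

```r
# Local illustration with base R (no Spark involved).
set.seed(1)

m <- 10  # successes in the population
n <- 7   # failures in the population
k <- 8   # draws per sample, without replacement

# nn = 1000 independent samples, each counting the successes among k draws
draws <- stats::rhyper(nn = 1000, m = m, n = n, k = k)

# Each sample must lie between max(0, k - n) and min(k, m)
lower <- max(0, k - n)  # 1: at most 7 of the 8 draws can be failures
upper <- min(k, m)      # 8: every draw could be a success
stopifnot(all(draws >= lower), all(draws <= upper))

# The theoretical mean is k * m / (m + n); the sample mean should be close
expected_mean <- k * m / (m + n)
sample_mean <- mean(draws)
```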