drHexbin: HexBin Aggregation for Distributed Data Frames


Description

Create a "hexbin" object of hexagonally binned data for a distributed data frame. This computation is division-agnostic: it does not matter how the data frame is split up.
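The division-agnostic property can be illustrated with the hexbin package alone. The sketch below is not taken from the datadr source; it assumes that every subset is binned with the same bounds (`xb`, `yb`, and the grouping vector `g` are illustrative names), in which case per-cell counts simply add up across subsets.

```r
library(hexbin)

set.seed(1)
x <- rnorm(1000)
y <- rnorm(1000)
g <- sample(letters[1:4], 1000, replace = TRUE)  # arbitrary split of the rows

# Shared bounds make cell ids comparable across subsets
xb <- range(x)
yb <- range(y)

# Bin the full data in one pass
full <- hexbin(x, y, xbins = 30, xbnds = xb, ybnds = yb)

# Bin each subset with the same bounds, then sum counts per cell id
parts <- lapply(split(seq_along(x), g), function(i) {
  h <- hexbin(x[i], y[i], xbins = 30, xbnds = xb, ybnds = yb)
  data.frame(cell = h@cell, count = h@count)
})
merged <- aggregate(count ~ cell, data = do.call(rbind, parts), FUN = sum)

# The merged counts match the single-pass result
stopifnot(length(merged$cell) == length(full@cell))
stopifnot(all(merged$cell == full@cell))
stopifnot(all(merged$count == full@count))
```

Because the counts are additive, the way rows are assigned to subsets has no effect on the combined result, provided the bounds and `xbins` are fixed up front.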

Usage

drHexbin(data, xVar, yVar, by = NULL, xTransFn = identity,
  yTransFn = identity, xRange = NULL, yRange = NULL, xbins = 30,
  shape = 1, params = NULL, packages = NULL, control = NULL)

Arguments

data

a distributed data frame

xVar, yVar

names of the variables to use

by

an optional variable name or vector of variable names by which to group hexbin computations

xTransFn, yTransFn

a transformation function to apply to the x and y variables prior to binning

xRange, yRange

range of the x and y variables (can be left as NULL if summaries have already been computed for the data)

xbins

the number of bins partitioning the range of the x variable (xRange)

shape

the shape = yheight/xwidth of the plotting regions
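As a rough illustration of how xbins and shape behave, the sketch below calls `hexbin()` from the hexbin package directly on in-memory data rather than on a distributed data frame. It assumes these two arguments carry the same meaning in drHexbin as they do in `hexbin()`; the object names are illustrative.

```r
library(hexbin)

set.seed(42)
x <- rnorm(500)
y <- rnorm(500)

# Fewer, larger hexagons across the x range
coarse <- hexbin(x, y, xbins = 10, shape = 1)

# More, smaller hexagons across the same range
fine <- hexbin(x, y, xbins = 40, shape = 1)

# shape = yheight/xwidth = 0.5: plotting region half as tall as it is wide
wide <- hexbin(x, y, xbins = 10, shape = 0.5)

# Number of non-empty cells at each resolution
c(coarse = coarse@ncells, fine = fine@ncells)

# The values used are stored on the returned object
c(xbins = wide@xbins, shape = wide@shape)
```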

params

a named list of objects external to the input data that are needed in the distributed computing (most should be taken care of automatically such that this is rarely necessary to specify)

packages

a vector of R package names that contain functions used in the computation, for example in xTransFn or yTransFn (most should be taken care of automatically such that this is rarely necessary to specify)

control

parameters specifying how the backend should handle things (most likely parameters to rhwatch in RHIPE); see rhipeControl and localDiskControl

Value

a "hexbin" object
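A "hexbin" object is an S4 object defined by the hexbin package. The sketch below builds one with `hexbin()` directly (standing in for the result of drHexbin) and shows a few ways to inspect it; the variable names are illustrative.

```r
library(hexbin)

set.seed(7)
h <- hexbin(rnorm(500), rnorm(500), xbins = 30)

# S4 class of the result
class(h)

# Ids of the non-empty cells and the number of points in each
head(h@cell)
head(h@count)

# Total number of points binned
h@n

# Centers of the non-empty cells, e.g. for custom plotting
centers <- hcell2xy(h)
str(centers)
```

The object can also be passed to `plot()` to draw the binned counts.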

Author(s)

Ryan Hafen

References

Carr, D. B. et al. (1987) Scatterplot Matrix Techniques for Large N. JASA 82, 398, 424–436.

See Also

drQuantile

Examples

library(datadr)
library(hexbin)

# create dummy data and divide it
dat <- data.frame(
  xx = rnorm(1000),
  yy = rnorm(1000),
  by = sample(letters, 1000, replace = TRUE))
# update = TRUE computes the summaries from which the x and y ranges are taken
d <- divide(dat, by = "by", update = TRUE)
# compute hexbins on divided object
dhex <- drHexbin(d, xVar = "xx", yVar = "yy")
# dhex is equivalent to running on undivided data:
hexbin(dat$xx, dat$yy)

datadr documentation built on May 1, 2019, 8:06 p.m.