datadr-package: datadr



datadr: Divide and Recombine for Large, Complex Data



Ryan Hafen

Maintainer: Ryan Hafen


help(package = datadr)

Example output

		Information on package 'datadr'


Package:                      datadr
Type:                         Package
Title:                        Divide and Recombine for Large, Complex
                              Data
Date:                         2016-09-22
Authors@R:                    c(person("Ryan", "Hafen", email =
                              "", role = c("aut",
                              "cre")), person("Landon", "Sego", role =
                              "ctb"))
Maintainer:                   ORPHANED
Description:                  Methods for dividing data into subsets,
                              applying analytical methods to the
                              subsets, and recombining the results.
                              Comes with a generic MapReduce interface
                              as well.  Works with key-value pairs
                              stored in memory, on local disk, or on
                              HDFS, in the latter case using the R and
                              Hadoop Integrated Programming Environment
License:                      BSD_3_clause + file LICENSE
LazyLoad:                     yes
LazyData:                     yes
NeedsCompilation:             no
Imports:                      data.table (>= 1.9.6), digest, codetools,
                              hexbin, parallel, magrittr, dplyr,
Suggests:                     testthat (>= 0.11.0), roxygen2 (>=
                              5.0.1), Rhipe, lattice
RoxygenNote:                  5.0.1
Packaged:                     2018-08-19 08:39:09 UTC; hornik
Author:                       Ryan Hafen [aut, cre], Landon Sego [ctb]
Repository:                   CRAN
Date/Publication:             2018-08-19 08:51:19 UTC
Depends:                      R (>= 2.10)
X-CRAN-Original-Maintainer:   Ryan Hafen <>
X-CRAN-Comment:               Orphaned and corrected on 2018-08-19 as
                              check problems were not corrected despite
Built:                        R 3.4.4; ; 2019-05-11 03:51:32 UTC; unix
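
The Description field above summarizes the workflow as dividing data into subsets, applying an analytical method to each subset, and recombining the results. As a rough illustration of that idea only, the following sketch uses base R rather than datadr itself. The choice of the built-in `iris` data, the linear model, and the variable names (`subsets`, `fits`, `coef_table`, `mean_coef`) are assumptions made for this example and do not come from the package.

```r
# Divide: split the built-in iris data frame into one subset per species.
subsets <- split(iris, iris$Species)

# Apply: fit the same linear model independently within each subset
# and keep only the fitted coefficients.
fits <- lapply(subsets, function(d) {
  coef(lm(Sepal.Length ~ Sepal.Width, data = d))
})

# Recombine: stack the per-subset coefficients into one table,
# then average them across subsets.
coef_table <- do.call(rbind, fits)
mean_coef <- colMeans(coef_table)

print(coef_table)
print(mean_coef)
```

Averaging coefficients across subsets is only one possible recombination; stacking the per-subset results into a table, as `coef_table` does, is another.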


%>%                     Pipe data
addData                 Add Key-Value Pairs to a Data Connection
addTransform            Add a Transformation Function to a Distributed
                        Data Object
adult                   "Census Income" Dataset
applyTransform          Apply transformation function(s)
as.data.frame.ddf       Turn 'ddf' Object into Data Frame
as.list.ddo             Turn 'ddo' / 'ddf' Object into a list
bsv                     Construct Between Subset Variable (BSV)
charFileHash            Character File Hash Function
combCollect             "Collect" Recombination
combDdf                 "DDF" Recombination
combDdo                 "DDO" Recombination
combMean                Mean Recombination
combMeanCoef            Mean Coefficient Recombination
combRbind               "rbind" Recombination
condDiv                 Conditioning Variable Division
convert                 Convert 'ddo' / 'ddf' Objects
datadr-package          datadr
ddf                     Instantiate a Distributed Data Frame ('ddf')
ddf-accessors           Accessor methods for 'ddf' objects
ddo                     Instantiate a Distributed Data Object ('ddo')
ddo-ddf-accessors       Accessor Functions
ddo-ddf-attributes      Managing attributes of 'ddo' or 'ddf' objects
digestFileHash          Digest File Hash Function
divide                  Divide a Distributed Data Object
divide-internals        Functions used in divide()
drAggregate             Division-Agnostic Aggregation
drBLB                   Bag of Little Bootstraps Transformation Method
drFilter                Filter a 'ddo' or 'ddf' Object
drGLM                   GLM Transformation Method
drGetGlobals            Get Global Variables and Package Dependencies
drHexbin                HexBin Aggregation for Distributed Data Frames
drJoin                  Join Data Sources by Key
drLM                    LM Transformation Method
drLapply                Apply a function to all key-value pairs of a
                        ddo/ddf object
drPersist               Persist a Transformed 'ddo' or 'ddf' Object
drQuantile              Sample Quantiles for 'ddf' Objects
drRead.table            Data Input
drSample                Take a Sample of Key-Value Pairs
drSubset                Subsetting Distributed Data Frames
flatten                 "Flatten" a ddf Subset
getCondCuts             Get names of the conditioning variable cuts
getSplitVar             Extract "Split" Variable(s)
hdfsConn                Connect to Data Source on HDFS
kvApply                 Apply Function to Key-Value Pair
kvPair                  Specify a Key-Value Pair
kvPairs                 Specify a Collection of Key-Value Pairs
localDiskConn           Connect to Data Source on Local Disk
localDiskControl        Specify Control Parameters for MapReduce on a
                        Local Disk Connection
makeExtractable         Take a ddo/ddf HDFS data object and turn it
                        into a mapfile
mr-summary-stats        Functions to Compute Summary Statistics in
                        MapReduce
mrExec                  Execute a MapReduce Job
print.ddo               Print a "ddo" or "ddf" Object
print.kvPair            Print a key-value pair
print.kvValue           Print value of a key-value pair
readHDFStextFile        Experimental HDFS text reader helper function
readTextFileByChunk     Experimental sequential text reader helper
                        function
recombine               Recombine
removeData              Remove Key-Value Pairs from a Data Connection
rhipeControl            Specify Control Parameters for RHIPE Job
rrDiv                   Random Replicate Division
setupTransformEnv       Set up transformation environment
to_ddf                  Convert dplyr grouped_df to ddf
updateAttributes        Update Attributes of a 'ddo' or 'ddf' Object
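
Several of the functions indexed above (`divide`, `addTransform`, `recombine`, `combRbind`, and the `%>%` pipe) are meant to be chained together. The sketch below shows one plausible way to do so on an in-memory data frame. The argument usage (`by = "Species"`, passing `combRbind` to `recombine`) reflects the package's documented examples as best recalled and should be checked against the individual help pages; the `iris` data and the names `by_species` and `mean_sl` are assumptions for illustration. The code is guarded so that it does nothing when datadr is not installed.

```r
# Run only when datadr is available; the package is orphaned on CRAN,
# so it may not be installed on a given system.
if (requireNamespace("datadr", quietly = TRUE)) {
  library(datadr)

  # Divide: one subset per value of the conditioning variable.
  by_species <- divide(iris, by = "Species")

  # Apply and recombine: compute a per-subset mean, then bind the
  # per-subset results together row by row.
  mean_sl <- by_species %>%
    addTransform(function(x) mean(x$Sepal.Length)) %>%
    recombine(combRbind)

  print(mean_sl)
}
```

The transformation is attached to the divided object and then evaluated when `recombine` is called, so swapping in a different recombination from the index, such as `combCollect`, would change only the last step.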

