datadr: Divide and Recombine for Large, Complex Data
Version 0.8.6

Methods for dividing data into subsets, applying analytical methods to the subsets, and recombining the results. Also provides a generic MapReduce interface. Works with key-value pairs stored in memory, on local disk, or on HDFS; the HDFS back end uses the R and Hadoop Integrated Programming Environment (RHIPE).
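
The divide, apply, recombine pattern described above can be sketched in base R alone. This is only an illustration of the idea on an in-memory data frame, not datadr's own API: it assumes the built-in iris dataset, and the names subsets, perSubset, and result are hypothetical. In the package, the functions listed under Man pages (divide, addTransform, recombine) appear to fill the corresponding roles.

```r
# Divide: split the built-in iris data frame into one subset per species.
# Each list element plays the role of a key-value pair (key = species name).
subsets <- split(iris, iris$Species)

# Apply: compute a per-subset summary (mean petal length),
# returned as a one-row data frame that carries its key.
perSubset <- lapply(names(subsets), function(key) {
  data.frame(
    Species = key,
    meanPetalLength = mean(subsets[[key]]$Petal.Length)
  )
})

# Recombine: row-bind the per-subset results into a single data frame.
result <- do.call(rbind, perSubset)
print(result)
```

Because each subset is processed independently, the apply step is the part that a back end such as local disk or HDFS could distribute across workers.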

Author: Ryan Hafen [aut, cre], Landon Sego [ctb]
Date of publication: 2016-10-02 15:51:50
Maintainer: Ryan Hafen <rhafen@gmail.com>
License: BSD_3_clause + file LICENSE
URL: http://deltarho.org/docs-datadr
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("datadr")

Man pages

addData: Add Key-Value Pairs to a Data Connection
addTransform: Add a Transformation Function to a Distributed Data Object
adult: "Census Income" Dataset
applyTransform: Apply transformation function(s)
as.data.frame.ddf: Turn 'ddf' Object into Data Frame
as.list.ddo: Turn 'ddo' / 'ddf' Object into a list
bsv: Construct Between Subset Variable (BSV)
charFileHash: Character File Hash Function
combCollect: "Collect" Recombination
combDdf: "DDF" Recombination
combDdo: "DDO" Recombination
combMean: Mean Recombination
combMeanCoef: Mean Coefficient Recombination
combRbind: "rbind" Recombination
condDiv: Conditioning Variable Division
convert: Convert 'ddo' / 'ddf' Objects
datadr-package: datadr
ddf: Instantiate a Distributed Data Frame ('ddf')
ddf-accessors: Accessor methods for 'ddf' objects
ddo: Instantiate a Distributed Data Object ('ddo')
ddo-ddf-accessors: Accessor Functions
ddo-ddf-attributes: Managing attributes of 'ddo' or 'ddf' objects
digestFileHash: Digest File Hash Function
divide: Divide a Distributed Data Object
divide-internals: Functions used in divide()
drAggregate: Division-Agnostic Aggregation
drBLB: Bag of Little Bootstraps Transformation Method
drFilter: Filter a 'ddo' or 'ddf' Object
drGetGlobals: Get Global Variables and Package Dependencies
drGLM: GLM Transformation Method
drHexbin: HexBin Aggregation for Distributed Data Frames
drJoin: Join Data Sources by Key
drLapply: Apply a function to all key-value pairs of a ddo/ddf object
drLM: LM Transformation Method
drPersist: Persist a Transformed 'ddo' or 'ddf' Object
drQuantile: Sample Quantiles for 'ddf' Objects
drRead.table: Data Input
drSample: Take a Sample of Key-Value Pairs
drSubset: Subsetting Distributed Data Frames
flatten: "Flatten" a ddf Subset
getCondCuts: Get names of the conditioning variable cuts
hdfsConn: Connect to Data Source on HDFS
kvApply: Apply Function to Key-Value Pair
kvPair: Specify a Key-Value Pair
kvPairs: Specify a Collection of Key-Value Pairs
localDiskConn: Connect to Data Source on Local Disk
localDiskControl: Specify Control Parameters for MapReduce on a Local Disk...
makeExtractable: Take a ddo/ddf HDFS data object and turn it into a mapfile
mrExec: Execute a MapReduce Job
mr-summary-stats: Functions to Compute Summary Statistics in MapReduce
pipe: Pipe data
print.ddo: Print a "ddo" or "ddf" Object
print.kvPair: Print a key-value pair
print.kvValue: Print value of a key-value pair
readHDFStextFile: Experimental HDFS text reader helper function
readTextFileByChunk: Experimental sequential text reader helper function
recombine: Recombine
removeData: Remove Key-Value Pairs from a Data Connection
rhipeControl: Specify Control Parameters for RHIPE Job
rrDiv: Random Replicate Division
setupTransformEnv: Set up transformation environment
splitvars: Extract "Split" Variable(s)
to_ddf: Convert dplyr grouped_df to ddf
updateAttributes: Update Attributes of a 'ddo' or 'ddf' Object

Files

tests
tests/testthat.R
tests/testthat
tests/testthat/test-hexbin.R
tests/testthat/test-summary.R
tests/testthat/test-join.R
tests/testthat/test-quantile.R
tests/testthat/test-globals.R
tests/testthat/test-kvMemory.R
tests/testthat/test-dataops.R
tests/testthat/test-spark.R
tests/testthat/test-readtext.R
tests/testthat/test-kvHDFS.R
tests/testthat/test-kvLocalDisk.R
NAMESPACE
NEWS.md
data
data/adult.rda
R
R/ddo_ddf_kvMemory.R
R/zzz_constants.R
R/bsv.R
R/dataops_join.R
R/mapreduce_kvLocalDisk.R
R/mapreduce_kvHDFS.R
R/agnostic_summary.R
R/conn_spark.R
R/agnostic_hexbin.R
R/recombine_transforms.R
R/divSpec.R
R/ddo_ddf_updateAttrs.R
R/ddo_ddf_kvHDFS.R
R/dataops_subset.R
R/dataops_persist.R
R/dataops_filter.R
R/ddo_ddf_methods.R
R/agnostic_aggregate.R
R/conn_HDFS.R
R/dataops_readTable.R
R/globals.R
R/ddo_addTransform.R
R/ddf_summary_print.R
R/agnostic_quantile.R
R/divide_df.R
R/recombine_combine.R
R/mapreduce_spark.R
R/dplyr.R
R/dataset_census.R
R/divide.R
R/ddo_ddf_kvSpark.R
R/mapreduce_kvMemory.R
R/divSpec_rrDiv.R
R/dataops_lapply.R
R/ddo_ddf_kvLocalDisk.R
R/kvPairs.R
R/recombine.R
R/ddo_ddf_print.R
R/misc.R
R/conn_localDisk.R
R/mapreduce.R
R/divSpec_condDiv.R
R/dataops_sample.R
R/ddo_ddf.R
R/dataops_read.R
R/datadr-package.R
R/conn_memory.R
README.md
MD5
DESCRIPTION
man
man/applyTransform.Rd
man/readTextFileByChunk.Rd
man/localDiskControl.Rd
man/drLM.Rd
man/pipe.Rd
man/ddo-ddf-attributes.Rd
man/drFilter.Rd
man/drHexbin.Rd
man/getCondCuts.Rd
man/drGetGlobals.Rd
man/combRbind.Rd
man/combMeanCoef.Rd
man/divide.Rd
man/drBLB.Rd
man/digestFileHash.Rd
man/ddf-accessors.Rd
man/readHDFStextFile.Rd
man/rhipeControl.Rd
man/kvPair.Rd
man/ddo-ddf-accessors.Rd
man/removeData.Rd
man/combDdf.Rd
man/setupTransformEnv.Rd
man/combDdo.Rd
man/convert.Rd
man/mr-summary-stats.Rd
man/divide-internals.Rd
man/condDiv.Rd
man/flatten.Rd
man/drGLM.Rd
man/to_ddf.Rd
man/mrExec.Rd
man/drSubset.Rd
man/ddo.Rd
man/bsv.Rd
man/as.data.frame.ddf.Rd
man/combMean.Rd
man/print.kvPair.Rd
man/kvPairs.Rd
man/ddf.Rd
man/recombine.Rd
man/drPersist.Rd
man/print.ddo.Rd
man/adult.Rd
man/updateAttributes.Rd
man/drLapply.Rd
man/drAggregate.Rd
man/combCollect.Rd
man/splitvars.Rd
man/rrDiv.Rd
man/makeExtractable.Rd
man/drSample.Rd
man/localDiskConn.Rd
man/addData.Rd
man/datadr-package.Rd
man/drJoin.Rd
man/drQuantile.Rd
man/as.list.ddo.Rd
man/charFileHash.Rd
man/drRead.table.Rd
man/kvApply.Rd
man/hdfsConn.Rd
man/addTransform.Rd
man/print.kvValue.Rd
LICENSE
datadr documentation built on May 20, 2017, 3:28 a.m.
