ddR simplifies large-scale data analysis. It includes new language constructs to express distributed programs in R. Distributed programs written in ddR can work across multiple execution engines such as parallel, distributedR, and others. ddR provides data structures such as the distributed array darray to partition and share data across multiple R instances. Users can express parallel execution using dmapply.
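As a rough illustration of the description above, the following sketch creates a small darray and applies a function to each of its partitions with dmapply. It assumes the ddR package is installed and that the parallel backend is available on the local machine; the variable names (a, part_sums, result) are chosen here for illustration only.

```r
library(ddR)

# Assumption: the 'parallel' backend is usable on this machine.
useBackend(parallel)

# A 4 x 4 distributed array filled with 3, split into four 2 x 2 partitions.
a <- darray(dim = c(4, 4), psize = c(2, 2), data = 3)

# Run sum() once per partition; by default dmapply returns a distributed list.
part_sums <- dmapply(function(p) sum(p), parts(a))

# Bring the per-partition results back to the master as an ordinary R list.
result <- collect(part_sums)
print(unlist(result))
```

Each partition holds four elements equal to 3, so every per-partition sum should be 12.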
ddR contains the following commands. For more details, use the help function on each command.
useBackend - choose execution engine
darray - create distributed array
dframe - create distributed data frame
dlist - create distributed list
as.darray - create darray object from matrix object
is.darray - check if object is distributed array
parts - obtain partitions of an object
nparts - number of partitions as vector
totalParts - obtain total number of partitions
psize - obtain dimensions of partitions
collect - fetch darray, dframe or dlist object at the master
repartition - repartition input object
dmapply - execute function on cluster
dlapply - execute function on cluster
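The sketch below exercises several of the commands listed above on a small matrix. It is a minimal example under stated assumptions: ddR is installed, the default backend is used, and the names m, da and squares are illustrative rather than part of the package.

```r
library(ddR)

# An ordinary 4 x 4 matrix on the master.
m <- matrix(1:16, nrow = 4)

# Convert it to a distributed array with 2 x 2 partitions.
da <- as.darray(m, psize = c(2, 2))

print(is.darray(da))   # check that the object is a distributed array
print(nparts(da))      # partitioning along each dimension
print(totalParts(da))  # total number of partitions
print(psize(da))       # dimensions of each partition

# Fetch the full array back to the master and compare it with the input.
print(all(collect(da) == m))

# dlapply applies a function to each element, in the style of lapply.
squares <- dlapply(1:4, function(x) x^2)
print(unlist(collect(squares)))
```

Inspecting nparts, totalParts and psize before calling collect is a cheap way to confirm how the data is laid out without moving it to the master.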
HP Vertica Development Team
Prasad, S., Fard, A., Gupta, V., Martinez, J., LeFevre, J., Xu, V., Hsu, M., and Roy, I. (2015) Large scale predictive analytics in Vertica: Fast data transfer, distributed model creation and in-database prediction. SIGMOD 2015, 1657–1668.
Venkataraman, S., Bodzsar, E., Roy, I., AuYoung, A., and Schreiber, R. (2013) Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices. EuroSys'13, 197–210.
Homepage: https://github.com/vertica/DistributedR