Creates the configuration object, uploads the needed files, and starts a Segue Hadoop cluster on Amazon Elastic MapReduce (EMR).
createCluster(numInstances=2, cranPackages, customPackages,
              filesOnNodes, rObjectsOnNodes, enableDebugging=FALSE,
              instancesPerNode, masterInstanceType="m1.large",
              slaveInstanceType="m1.large", location="us-east-1c", ec2KeyName,
              copy.image=FALSE, otherBootstrapActions, sourcePackagesToInstall,
              masterBidPrice, slaveBidPrice)
numInstances | number of nodes (EC2 instances)
cranPackages | vector of string names of CRAN packages to load on each cluster node
customPackages | vector of string file names of custom packages to load on each cluster node
filesOnNodes | vector of full paths of files to be loaded on each node. Files are placed in the local path (i.e. ./file) on each node.
rObjectsOnNodes | a named list of R objects which will be passed to the R session on the worker nodes. The list is attached on the remote nodes using attach(rObjectsOnNodes), so every element must have a name. If your list does not have names, this will fail.
enableDebugging | TRUE/FALSE: whether EMR debugging should be enabled
instancesPerNode | number of R instances per node. Default of NULL uses AWS defaults.
masterInstanceType | EC2 instance type for the master node
slaveInstanceType | EC2 instance type for the slave nodes
location | EC2 location name for the cluster
ec2KeyName | EC2 key used for logging into the master node. Log in with the user name 'hadoop'.
copy.image | TRUE/FALSE: whether to copy the entire local environment to the nodes. If this feels fast and loose... you're right! It's nuts. Use it with caution. Very handy when you really need it.
otherBootstrapActions | a list-of-lists of other bootstrap actions to run; child list members
sourcePackagesToInstall | vector of full paths to source packages to be installed on each node
masterBidPrice | bid price for the master server
slaveBidPrice | bid price for the slave (task) servers
The needed files are uploaded to S3 and the EMR nodes are started.
An emrlapply() cluster object with the appropriate fields populated. Keep in mind that this call both creates the cluster and starts it running.
James "JD" Long
## Not run:
myCluster <- createCluster(numInstances=2,
                           cranPackages=c("Hmisc", "plyr"))
## End(Not run)
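Because rObjectsOnNodes is attached on the worker nodes with attach(), an unnamed list will fail. The sketch below shows one way to check a list locally before starting a cluster. It uses only base R. The objects lookupTable and scaleFactor and the helper isFullyNamed() are hypothetical names made up for this illustration and are not part of segue. The createCluster() call is left commented out, since it needs AWS credentials and starts a running cluster.

```r
# Hypothetical objects to ship to the worker nodes (illustrative names only).
lookupTable <- c(a = 1, b = 2)
scaleFactor <- 10

# Every element is named, so attach() can expose each one by name.
rObjectsOnNodes <- list(lookupTable = lookupTable, scaleFactor = scaleFactor)

# Hypothetical helper, not part of segue:
# TRUE only if x is a list whose elements all have non-empty names.
isFullyNamed <- function(x) {
  nms <- names(x)
  is.list(x) && !is.null(nms) && all(!is.na(nms) & nzchar(nms))
}

# Stop early, before any files are uploaded or instances are started.
stopifnot(isFullyNamed(rObjectsOnNodes))

## Not run:
## myCluster <- createCluster(numInstances=2,
##                            rObjectsOnNodes=rObjectsOnNodes)
## End(Not run)
```

A partially named list such as list(a = 1, 2) also fails this check, because its second element has an empty name.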