doAzureParallel Guide

This guide explains how Azure works, how to get the most out of it, and best practices for using the doAzureParallel package.

  1. Azure Introduction (link)

Using the Data Science Virtual Machine (DSVM) and Azure Batch.

  2. Virtual Machine Sizes (link)

How do you choose the best VM type and size for your workload?

  3. Autoscale (link)

Automatically scale your cluster up or down to save time, money, or both.

  4. Azure Limitations (link)

Learn about the limits on the size of your cluster and the number of foreach jobs you can run in Azure.

  5. Package Management (link)

Best practices for managing your R packages in code. This includes installation at the cluster or job level, as well as how to use different package providers.

  6. Distributing your Data (link)

Best practices and limitations for working with distributed data.

  7. Parallelizing on each VM Core (link)

Best practices and limitations for parallelizing your R code across every core of every VM in your pool.

  8. Persistent Storage (link)

Taking advantage of persistent storage for long-running jobs.

  9. Customize Cluster (link)

Setting up your cluster to meet your specific needs.

  10. Long Running Job (link)

Best practices for managing long-running jobs.

Additional Documentation

See our Troubleshooting Guide for help diagnosing common issues.

Read our FAQ for known issues and common questions.



LuisFilipe236/doAzureParallel documentation built on May 28, 2019, 1:45 p.m.