Interested? See registration information here: RStudio Conference 2020
:spiral_calendar: January 27 and 28, 2020 :alarm_clock: 09:00 - 17:00 :hotel: [ADD ROOM] :writing_hand: RStudio Conference 2020
This 2-day workshop covers how to analyze large amounts of data in R. We will focus on scaling up our analyses using the same dplyr verbs that we use in our everyday work. We will use dplyr with data.table, databases, and Spark. We will also cover best practices for visualizing, modeling, and sharing when working against these data sources. Where applicable, we will review recommended connection settings, security best practices, and deployment options.
In this 2-day workshop, attendees will learn how to connect to and analyze large-scale data.
You should take this workshop if you want to learn how to work with big data in R. This data can be in-memory, in databases (like SQL Server), or in a cluster (like Spark).
Some attendees have asked for material to review before the class. The following is a compilation of subjects that would be helpful to be familiar with by the time the class begins, but studying or reviewing them is not a requirement.
For database background, please review the articles in the following links:
For Spark background, please review the following:
We plan to provide a personal server to each student for use during the class. The server will contain all of the applications and materials needed, including R and RStudio. All you will need is a laptop with a web browser. If you need to use a work-provided laptop for the class, please ensure that its web browser is not blocked from reaching Amazon AWS, which is where the servers will be set up.
| Time          | Activity         |
| :------------ | :--------------- |
| 09:00 - 10:30 | Session 1        |
| 10:30 - 11:00 | Coffee break     |
| 11:00 - 12:30 | Session 2        |
| 12:30 - 13:30 | Lunch break      |
| 13:30 - 15:00 | Session 3        |
| 15:00 - 15:30 | Coffee break     |
| 15:30 - 17:00 | Session 4        |
Edgar Ruiz
Solutions Engineer @ RStudio
Twitter: theotheredgar
LinkedIn: edgararuiz
James Blair
Solutions Engineer @ RStudio
Twitter: Blair09M
LinkedIn: blairjm
The following is a tentative outline of the subjects that will be covered during the class. The content and order are subject to change.
- `vroom`: basics
- `dtplyr`: basics, how `dtplyr` works, the `mutate()` verb
- `DBI`
- `knitr` SQL engine
- `dplyr`: connections, compute, functions
- `tidymodels` for modeling
- `tidypredict`
- `sparklyr` and `dplyr`
This work is licensed under a Creative Commons Attribution 4.0 International License.