options(htmltools.dir.version = FALSE)
# see: https://github.com/yihui/xaringan
# install.packages("xaringan")
# see: 
# https://github.com/yihui/xaringan/wiki
# https://github.com/gnab/remark/wiki/Markdown
options(width=110)
options(digits = 4)

# R

From Wikipedia (emphasis added):

R is an open source programming language and software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. Polls, surveys of data miners, and studies of scholarly literature databases show that R's popularity has increased substantially in recent years.

R is a GNU package. The source code for the R software environment is written primarily in C, Fortran, and R. R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems. While R has a command line interface, there are several graphical front-ends available.


# Programming language

From Wikipedia (emphasis added):

A programming language is a formal language that specifies a set of instructions that can be used to produce various kinds of output. Programming languages generally consist of instructions for a computer. Programming languages can be used to create programs that implement specific algorithms.

.pull-left4[

### Algorithm

  1. Load data
  2. Extract variables
  3. Run analysis
  4. Print result ]

.pull-right4[

### Implementation in R

data <- read.table('my_data.txt')
variables <- data[,c('group','variable')]
analysis <- lm(variable ~ group, data = variables)
summary(analysis)

]
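The four steps can also be tried without a data file. The following is a minimal sketch that swaps `my_data.txt` for the built-in `PlantGrowth` dataset; the dataset and its column names (`weight`, `group`) are assumptions made for illustration and are not part of the original example.

```r
# 1. Load data (a dataset that ships with R stands in for read.table())
data <- PlantGrowth

# 2. Extract variables
variables <- data[, c("group", "weight")]

# 3. Run analysis: linear model of weight as a function of group
analysis <- lm(weight ~ group, data = variables)

# 4. Print result
summary(analysis)
```

Each line maps onto one step of the algorithm, and each step stores its output in a new object that the next step picks up.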


# R is purpose specific

R has been built for statistical computing and graphics, and that is basically it:

.pull-left4[

### Use R for...

  1. Load and handle data
  2. Run statistical analyses
  3. Run analyses
  4. Prepare reproducible reports ]

.pull-right4[

### Don't use R for...

  1. OS programs
  2. Server-side programs
  3. Database handling
  4. (Behavioral experiments) ]

# R is widely used

R is steadily growing in popularity. Today, R is one of the most popular languages for data science and among programming languages overall. In terms of the number of data science jobs, R beats SAS and MATLAB and is on par with Python:


knitr::include_graphics("https://i0.wp.com/r4stats.com/wp-content/uploads/2017/02/Fig-1a-IndeedJobs-2017.png")

source: https://i0.wp.com/r4stats.com/

---

# R is so popular because

R has been implemented in **C, Fortran, and R**. This means that R can be fast and efficient; often, however, it is not. R's strengths lie in its flexibility, cutting-edge development, and productivity tools.

.pull-left5[

### Pro

1. **It's free**
2. Relatively easy
3. Extensibility ([CRAN](https://cran.r-project.org/), packages)
4. User base (e.g., [Stack Overflow](https://stackoverflow.com/))
5. [Tidyverse](https://www.tidyverse.org/) (`dplyr`, `ggplot2`, etc.)
6. [RStudio](https://www.rstudio.com/)
7. Productivity options: [LaTeX](https://www.latex-project.org/), [Markdown](https://daringfireball.net/projects/markdown/), [GitHub](https://github.com/)

]

.pull-right5[

### Con

1. Slow and wordy
2. Limited (no iterators, pointers, etc.) $\rightarrow$ [Rcpp](http://www.rcpp.org/), [rPython](http://rpython.r-forge.r-project.org/)

]

---

# RStudio: R's favorite environment

Next to many useful packages, R users greatly benefit from the integrated development environment [RStudio](https://www.rstudio.com/). RStudio is a graphical user interface that allows you to (a) edit code, (b) run code, (c) access files and progress, and (d) create plots. In addition, RStudio helps you with version control via [GitHub](https://github.com/), writing reports using [markdown](http://rmarkdown.rstudio.com/authoring_basics.html) and [knitr](https://yihui.name/knitr/), integrating C++ into R, writing clean code, and debugging code.

---

# The workflow of R

---

# Project management

RStudio facilitates project management via the use of *projects*. Projects support:

.pull-left5[

1. **File management** by automatically setting the working directory (see `setwd()`).

2. **Project transitioning** by saving and re-opening scripts, history, and workspace.

3. **Customization** by enabling project-specific settings.

4. **Version control** by linking projects to repositories (e.g., using [GitHub](https://github.com/)).

]

---

# The almighty **tidyverse**

Among its many packages, R offers a collection of high-performance, easy-to-use packages (libraries) designed specifically for handling data, known as the [tidyverse](https://www.tidyverse.org/). The tidyverse includes:

1. `ggplot2` -- creating graphics.
2. `dplyr` -- data manipulation.
3. `tidyr` -- tidying data.
4. `readr` -- reading wild data.
5. `purrr` -- functional programming.
6. `tibble` -- modern data frame.
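To give a feel for the tidyverse style, here is a small sketch using `dplyr` on the built-in `PlantGrowth` data. It assumes that `dplyr` is installed; the dataset and the summary it computes are chosen purely for illustration.

```r
library(dplyr)

# Mean weight and number of plants per group, written as a pipeline:
# each step passes its result on to the next via %>%
group_means <- PlantGrowth %>%
  group_by(group) %>%
  summarise(mean_weight = mean(weight), n = n())

group_means
```

The result is a tibble with one row per group, which can be handed directly to other tidyverse packages such as `ggplot2`.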

---

# Essentials of the R language

.pull-left7[

>"To understand computations in R, two slogans are helpful:
>### (1) Everything that exists is an object and
>### (2) everything that happens is a function call."

]

.pull-right7[

John Chambers
Author of S and developer of R
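
Both slogans can be checked at the console. The following is a minimal sketch; the object names `x` and `y` are made up for illustration.

```r
# Slogan 1: everything that exists is an object, including functions
x <- c(2, 4, 6)
class(x)       # "numeric"
class(mean)    # "function"

# Slogan 2: everything that happens is a function call, even operators
1 + 2          # 3
`+`(1, 2)      # the same call written in function form: 3

# Assignment is a function call as well
`<-`(y, 10)    # equivalent to y <- 10
y              # 10
```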

]

---

# Calls, assignments, and expressions

In R, every action is a function call. Specifically, R scripts advance by **passing data and arguments to functions**, calling the function, and receiving its output. Functions may, however, not always look like functions: operators such as `+` are function calls too. The output is then stored in a new object using assignment.

---

# Object-orientation

R is an object-oriented language. For R, this means that **everything is an object** (including functions). It also means that there are many generic functions that respond to the **object's class**. Another important feature of R regarding objects is that R **behaves as if it always copies deep**: assigning an object to a new name and then changing the copy leaves the original untouched. This is why practically everything in R is an assignment.

---

# Syntax

Every language has a specific expressive style. R is characterized by the following elements.

.pull-left9[

+ Comment symbol `#`
+ Quotations with either `""` or `''`
+ Curly brackets `{}` enclose expressions explicitly
+ Parentheses `()` call functions
+ Semicolon `;` separates expressions
+ `<`, `>`, `|`, `&`, `==`, `!=` define logical statements

]

---

.pull-left9[

# Help

An important part of using R is its **help files** and **vignettes**. Help files are required documentation for every R function and package published on [**CRAN**](https://cran.r-project.org/). In the beginning, the help files may appear cryptic; over time, however, you will realize that they are exceptionally helpful. Vignettes are longer tutorials sometimes provided by the authors of a package.

]

.pull-right9[
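
A few common ways to reach help files and vignettes, as a sketch. The function `mean`, the search term, and the package `grid` are arbitrary choices for illustration.

```r
# Open the help file of a function; the two forms are equivalent
?mean
help("mean")

# Search the installed help files when the function name is unknown
help.search("standard deviation")

# Run the examples given at the bottom of a help file
example("mean")

# List the vignettes of a package (grid ships with R)
vignette(package = "grid")
```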

]

---

# Packages

One of the huge benefits of R is its vast and cutting-edge collection of packages. Responsible for this are the large and active user base and the [**CRAN**](https://cran.r-project.org/) team, who check every package, apply rigorous quality control, and host the packages on various mirrors throughout the world. When downloading one of the many packages, never forget that the package must also be loaded.

---

# Interactive sessions
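
In an interactive session, the install-then-load routine described under Packages might look like the following sketch. The package names are examples only: `dplyr` stands for any CRAN package, and `tools` is used for the runnable part because it ships with R and needs no download.

```r
# Install once per computer (needs an internet connection), e.g.:
# install.packages("dplyr")

# Load in every new session before using the package's functions
library(tools)
file_ext("my_data.txt")                    # "txt"

# A single function can also be called without loading, via `::`
tools::file_path_sans_ext("my_data.txt")   # "my_data"
```

Installing only has to be repeated after updating R or the package, whereas `library()` belongs at the top of every script.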

therbootcamp/BaselRBootcamp2017 documentation built on May 3, 2019, 10:45 p.m.