Factors
In r4ds.tutorials: Tutorials for "R for Data Science"

library(learnr)
library(tutorial.helpers)
library(tidyverse)

# Without this hack, we fail on GHA! cols() and col_factor() have conflicting
# versions in readr, vroom and scales. Of course, we want the readr versions. We
# could solve this with either:

# library(conflicted)
# conflict_prefer("cols", "readr") # and so on. Or with:

# library(tidymodels)
# tidymodels_prefer() # because this applies to tidyverse functions as well

# But that all seems like overkill, given that it all works for students
# regardless.

cols <- readr::cols
col_factor <- readr::col_factor

knitr::opts_chunk$set(echo = FALSE)
options(tutorial.exercise.timelimit = 60, 
        tutorial.storage = "local") 

x1 <- c("Dec", "Apr", "Jan", "Mar")

x2 <- c("Dec", "Apr", "Jam", "Mar")

month_levels <- c(
  "Jan", "Feb", "Mar", "Apr", "May", "Jun", 
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
)

csv <- "
month,value
Feb,12
Mar,56
Feb,14
Jan,12"

Introduction

This tutorial covers Chapter 16: Factors from R for Data Science (2e) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. forcats is the core Tidyverse package for working with categorical variables, called "factors" in R. Key commands include fct() for creating factors, fct_reorder() for changing the order of the levels, and fct_recode() for recoding factors.

Factor basics

Factors are used for categorical variables --- variables that have a fixed and known set of possible values. They are also useful when you want to display character vectors in a non-alphabetical order.

Exercise 1

Load the tidyverse library.

library(...)

library(tidyverse)

One of the nine core packages within the Tidyverse is forcats, a package dedicated to working with factors. By loading tidyverse, we automatically get access to forcats and the other "core" Tidyverse packages.

Exercise 2

Look up the help page for forcats by entering help(package = "forcats") at the Console. Copy/paste the first few lines of the help page.

question_text(NULL,
    answer(NULL, correct = TRUE),
    allow_retry = TRUE,
    try_again_button = "Edit Answer",
    incorrect = NULL,
    rows = 5)

forcats provides tools for dealing with categorical variables --- and it’s an anagram of the word "factors" --- using a wide range of helpers for working with factors.

Exercise 3

Hit "Run Code" to create the variable x1.

x1 <- c("Dec", "Apr", "Jan", "Mar")

x1 <- c("Dec", "Apr", "Jan", "Mar")

Note that x1 is a character vector. This can lead to all sorts of problems, because months are a good example of a categorical variable: there are exactly 12 possible values.

Exercise 4

Run sort() on x1.

sort(...)

sort(x1)

Because x1 is a character variable, this sorts alphabetically, which is not what we want. We would prefer that the sort order correspond to the order in which months appear in the calendar.

Exercise 5

Hit "Run Code" to create the x2 variable, another character vector. But note the misspelling of "Jan" as "Jam".

x2 <- c("Dec", "Apr", "Jam", "Mar")

x2 <- c("Dec", "Apr", "Jam", "Mar")

Because x1 and x2 are both character vectors, nothing will catch the contradiction between "Jan" and "Jam." Using factors will force us to notice such errors.

Exercise 6

To create a factor you must start by creating a list of the valid "levels." Hit "Run Code" to create the month_levels variable.

month_levels <- c(
  "Jan", "Feb", "Mar", "Apr", "May", "Jun", 
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
)

Note that month_levels is just another character vector. We need it, however, to create a factor variable.

Exercise 7

Run factor(x1, levels = month_levels).

factor(x1, ... = month_levels)

factor(x1, levels = month_levels)

The function factor() is a part of base R, not forcats. It creates a factor variable. Notice how, in addition to the values of x1 being printed, as they are when we print a character variable, we also see the levels, printed out in order.

Exercise 8

Wrap factor(x1, levels = month_levels) within a call to sort().

sort(...(x1, levels = ...))

sort(factor(x1, levels = month_levels))

Instead of being sorted in alphabetical order, as before, the values are sorted in the order of the levels, which is almost always what we want when sorting months.
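The difference between the two sort orders can be seen side by side. This is a minimal sketch in base R that reuses the x1 and month_levels values from the exercises above; the names sorted_chr and sorted_fct are introduced here only for illustration.

```r
# Month abbreviations in calendar order; these define the valid levels.
month_levels <- c(
  "Jan", "Feb", "Mar", "Apr", "May", "Jun",
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
)
x1 <- c("Dec", "Apr", "Jan", "Mar")

# A character vector sorts alphabetically.
sorted_chr <- sort(x1)
sorted_chr

# A factor sorts in the order of its levels, here calendar order.
sorted_fct <- sort(factor(x1, levels = month_levels))
as.character(sorted_fct)
```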

Exercise 9

Run factor() with two arguments: x2 and levels = month_levels.

...(x2, ... = month_levels)

factor(x2, levels = month_levels)

Since "Jam" is not one of the levels, factor() coerces it to be missing, as shown with the <NA> symbol. One big advantage of working with factors is that you are prevented from using values which are not one of the levels.

Exercise 10

Instead of factor(), we recommend using the fct() function from the forcats package, precisely because it generates an explicit error rather than a silent conversion to NA.

Run fct(x2, levels = month_levels) to see an example of this error.

...(x2, ... = month_levels)

Notice the thorough error message. "Jam" is missing from the levels as defined in the month_levels variable.
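To compare the two behaviors without stopping a script, the error from fct() can be caught with tryCatch(). This is only a sketch: it assumes forcats 1.0.0 or later (where fct() is available), and the names silent and caught are hypothetical.

```r
library(forcats)

month_levels <- c(
  "Jan", "Feb", "Mar", "Apr", "May", "Jun",
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
)
x2 <- c("Dec", "Apr", "Jam", "Mar")

# Base R factor() silently turns the unknown value "Jam" into NA.
silent <- factor(x2, levels = month_levels)
is.na(silent)

# forcats::fct() raises an error instead; capture its message as a string.
caught <- tryCatch(
  fct(x2, levels = month_levels),
  error = function(e) conditionMessage(e)
)
caught
```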

Exercise 11

Run factor() on x1.

factor(...)

factor(x1)

Because we did not provide a levels argument, the values for the levels will be taken from the values of the x1 vector, sorted in alphabetical order.

Exercise 12

Run fct() on x1.

fct(...)

fct(x1)

Sorting alphabetically is slightly risky because not every computer will sort strings in the same way. So forcats::fct() orders by first appearance in the original vector.
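A short sketch of the two default level orders, using the same x1 vector. It assumes forcats 1.0.0 or later for fct(); the variable names are illustrative only.

```r
library(forcats)

x1 <- c("Dec", "Apr", "Jan", "Mar")

# Base R: levels are the unique values, sorted alphabetically.
levels_base <- levels(factor(x1))
levels_base

# forcats: levels follow the order of first appearance in x1.
levels_fct <- levels(fct(x1))
levels_fct
```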

Exercise 13

Take the code from the previous exercise and use it as an argument to the function levels().

levels(fct(...))

levels(fct(x1))

If you ever need to access the set of valid levels directly, you can do so with levels().

Exercise 14

The next exercises will be focusing on the variable csv. Hit "Run Code" to look at csv.

csv

Note the \n characters. Each one marks a line break, but printed this way they do not make csv any easier to read.

Exercise 15

Run read_csv() with csv as the argument.

read_csv(...)

read_csv(csv)

The data is now much easier to read and understand. We can see which values belong to the month column and which to the value column, as well as the type of each column.

Exercise 16

Currently, the month column is a character variable and the value column is a double variable. Add the argument col_types to read_csv() and set it equal to "cc".

read_csv(csv, ...)

read_csv(csv, col_types = "cc")

col_types = "cc" changes the variable types of the columns to both be characters. The first c in cc corresponds to the first column, and so on.

Exercise 17

Change the value of col_types from "cc" to cols(month = "c") inside of read_csv().

read_csv(csv, col_types = cols(...))

read_csv(csv, col_types = cols(month = "c"))

Exercise 18

Change the value of col_types to cols(month = "f").

read_csv(csv, col_types = cols(...))

read_csv(csv, col_types = cols(month = "f"))

The month variable is now a variable of factor type. Having month as a factor will allow us to perform certain actions on it later on.

Exercise 19

Continue the current pipe to count(). Set the argument to month.

... |>
  count(...)

read_csv(csv, col_types = cols(month = "f")) |> count(month)

While the tibble has all the correct information, it is not easy to read. No one thinks of the months in that order. Luckily, this can be changed.

Exercise 20

In read_csv, change the value of col_types to cols(month = col_factor(month_levels))

read_csv(csv, col_types = ...) |>
  count(month)

read_csv(csv, 
         col_types = cols(month = col_factor(month_levels))) |>  
  count(month)

Besides the use of factor() and fct() as described earlier, col_factor(), when used within read_csv() and similar import functions, is the most common way of creating factor variables.
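The two ways of requesting a factor column can be compared directly. The sketch below assumes readr 2.0.0 or later, where I() marks a string as literal data; the csv text mirrors the one used in the exercises, and the object names by_appearance and by_calendar are hypothetical.

```r
library(readr)

month_levels <- c(
  "Jan", "Feb", "Mar", "Apr", "May", "Jun",
  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
)
csv <- "month,value\nFeb,12\nMar,56\nFeb,14\nJan,12"

# "f" guesses the levels from the data, in order of first appearance.
by_appearance <- read_csv(
  I(csv),
  col_types = cols(month = "f", value = "d")
)
levels(by_appearance$month)

# col_factor() lets us supply the full set of levels in calendar order.
by_calendar <- read_csv(
  I(csv),
  col_types = cols(month = col_factor(month_levels), value = col_double())
)
levels(by_calendar$month)
```

Supplying the levels up front also means months that never occur in the file, such as "Dec", are still valid levels of the column.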

General Social Survey

The gss_cat tibble is a data set in the forcats package. It’s a sample of data from the General Social Survey, a long-running US survey conducted by the independent research organization NORC at the University of Chicago.

Exercise 1

Type gss_cat and hit "Run Code."

gss_cat

gss_cat

There are 9 variables and more than 20,000 observations. Note how the print() method for tibbles, which is called whenever you just enter the name of a tibble, like gss_cat, gives the variable types across the top.

Exercise 2

Look up the help page for gss_cat by typing ?gss_cat at the Console. Copy/paste the Description.

question_text(NULL,
    answer(NULL, correct = TRUE),
    allow_retry = TRUE,
    try_again_button = "Edit Answer",
    incorrect = NULL,
    rows = 2)

When referring to a tibble (or other variable) which is part of a package, you can just use the variable name if you have already loaded the package. (Recall that running library(tidyverse) loads all the Tidyverse libraries, including forcats.) You can also refer to the variable directly using the double colon notation -- :: -- i.e., forcats::gss_cat.

Exercise 3

When factors are stored in a tibble, you can’t see their levels so easily. One way to view them is with count(). Pipe gss_cat to count(race).

gss_cat |>
  ...(race)

gss_cat |>
  count(race)

The <fct> indicator above race indicates that it is a factor variable, not character.

Modifying factor order

When working with factors, one common operation is changing the order of the levels. Let's create this plot:

plot1 <- gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  mutate(relig = fct_reorder(relig, tvhours)) |> 
  ggplot(aes(x = tvhours, y = relig)) +
  geom_point() +
  labs(title = "TV Watching and Religious Affiliation",
       subtitle = "Don't Knows watch a lot of TV",
       x = "TV Hours Watched Per Day",
       y = "Religious Affiliation")

plot1

Exercise 1

Run glimpse() on gss_cat.

glimpse(...)

glimpse(gss_cat)

We will be working with two variables: relig and tvhours. relig is a factor variable reporting religious affiliation, if any. tvhours is hours per day spent watching TV, on average.

Exercise 2

Pipe gss_cat to summarize(n = n())

gss_cat |>
  summarize(n = n())

gss_cat |>
  summarize(n = n())

Note how the letter "n" is used in two ways. First, it is the name of a new variable n, created via summarize(). In statistics, it is common for the letter "n" to mean the number of observations. Second, n() is a function, hence the (), which calculates the number of observations. Since there is no .by argument, the result is a tibble with a single row.

Exercise 3

Use the same pipe again, but add .by = relig as an argument/value pairing to summarize().

gss_cat |>
  summarize(n = n(),
            .by = relig)

gss_cat |>
  summarize(n = n(),
            .by = relig)

The result is a tibble with one row for each level of relig. (Older R code will often use the group_by() function when calculating statistics for each level of a factor. You should avoid this approach. Use the .by argument to summarize() and similar functions.)

Exercise 4

Use the same code again, adding another variable creation step to summarize(): tvhours = mean(tvhours).

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours),
            .by = relig)

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours),
            .by = relig)

Note that each argument (or variable creation step) in summarize() must be separated by a comma. Alas, tvhours is NA for at least one person in every level of relig, so every mean comes back as NA.

Exercise 5

Modify the pipe by adding na.rm = TRUE as an argument within the mean() function.

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig)

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig)

Most statistical functions in R return NA if even a single input value is NA, since a calculation that involves an unknown value has an unknown result. Most of them also have an na.rm argument --- short for "NA remove" --- which removes any NA values prior to the calculation.
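A tiny base R illustration of the effect of na.rm, using a made-up vector of hours rather than the survey data:

```r
# Hypothetical values: one respondent did not answer.
hours <- c(2, NA, 4)

# A single NA makes the mean unknown.
with_na <- mean(hours)
with_na

# na.rm = TRUE drops the NA first, so the mean is taken over 2 and 4.
without_na <- mean(hours, na.rm = TRUE)
without_na
```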

Exercise 6

Continue the pipe with a call to ggplot(), setting the mapping argument to aes(x = tvhours, y = relig).

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  ggplot(aes(x = tvhours, y = relig))

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  ggplot(aes(x = tvhours, y = relig))

Without a geom function, no data is plotted. But we still get the plotting area and the axis labels. Does the ordering of the religious affiliations on the y-axis seem reasonable?

Exercise 7

Add geom_point() to the pipe. Don't forget that calls to ggplot components are separated by plus signs, not pipes -- by + not |>.

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  ggplot(aes(x = tvhours, y = relig)) +
  geom_point()

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  ggplot(aes(x = tvhours, y = relig)) +
  geom_point()

It is hard to read this plot because there’s no overall pattern. We can improve it by reordering the levels of relig using fct_reorder(). fct_reorder() takes three arguments:

f, the factor whose levels you want to modify.
x, a numeric vector that you want to use to reorder the levels.
Optionally, fun, a function that’s used if there are multiple values of x for each value of f. The default value is median.
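A toy illustration of these arguments, using made-up values rather than gss_cat. In forcats itself the arguments are spelled .f, .x, and .fun; the factor and numbers below are hypothetical.

```r
library(forcats)

# Hypothetical data: two groups with two measurements each.
group <- factor(c("a", "a", "b", "b"))
score <- c(1, 10, 4, 5)

# Default summary is the median: a = 5.5, b = 4.5, so "b" comes first.
by_median <- fct_reorder(group, score)
levels(by_median)

# With .fun = min: a = 1, b = 4, so "a" comes first.
by_min <- fct_reorder(group, score, .fun = min)
levels(by_min)
```

Only the order of the levels changes; the values stored in the factor stay the same.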

Exercise 8

Replace y = relig with y = fct_reorder(relig, tvhours) in your pipe.

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  ggplot(aes(x = tvhours, y = fct_reorder(relig, tvhours))) +
  geom_point()

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |>
  ggplot(aes(x = tvhours, y = fct_reorder(relig, tvhours))) +
  geom_point()

Reordering religion makes it much easier to see that people in the “Don’t know” category watch much more TV, and Hinduism & Other Eastern religions watch much less.

Exercise 9

As you start making more complicated transformations, we recommend moving them out of aes() and into a separate mutate() step. After the summarize() step, insert this line: mutate(relig = fct_reorder(relig, tvhours)) |>. Then, change y = fct_reorder(relig, tvhours) back to y = relig.

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  mutate(... = fct_reorder(relig, ...)) |> 
  ggplot(aes(x = tvhours, ... = relig)) +
  geom_point()

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  mutate(relig = fct_reorder(relig, tvhours)) |> 
  ggplot(aes(x = tvhours, y = relig)) +
  geom_point()

It is almost always better to complete your data transformations before starting your plot.

Exercise 10

Finish the plot by adding a title, subtitle, and axis labels. Remember that the plot looks like this:

plot1

... +
  labs(... = "TV Watching and Religious Affiliation",
       subtitle = ...,
       x = ...,
       ... = "Religious Affiliation")

gss_cat |>
  summarize(n = n(),
            tvhours = mean(tvhours, na.rm = TRUE),
            .by = relig) |> 
  mutate(relig = fct_reorder(relig, tvhours)) |> 
  ggplot(aes(x = tvhours, y = relig)) +
  geom_point() +
  labs(title = "TV Watching and Religious Affiliation",
       subtitle = "???",
       x = "Hours watched",
       y = "Religious Affiliation")

The subtitle of a plot should be the one sentence conclusion/summary/observation with which you most want viewers to come away.

Let's create this graph now.

plot2 <- gss_cat |>
  filter(!is.na(age)) |> 
  count(age, marital) |>
  mutate(
    prop = n / sum(n),
    .by = age) |> 
  ggplot(aes(x = age, 
             y = prop, 
             color = fct_reorder2(marital, age, prop))) +
    geom_line(linewidth = 1) +
    scale_color_brewer(palette = "Set1") + 
    labs(color = "Marital", x = "Age", y = "Proportion")

plot2

Exercise 11

Pipe gss_cat to filter(), with the argument set to !is.na(age).

gss_cat |>
  filter(...)

gss_cat |>
  filter(!is.na(age))

This removes all rows with an NA value for age, which simplifies the calculations that follow.

Exercise 12

Continue the pipe to count(), with age and marital set as the arguments

... |>
  count(age, ...)

gss_cat |>
  filter(!is.na(age)) |>
  count(age, marital)

Exercise 13

Continue the pipe to mutate(). Create the variable prop and set it to n / sum(n).

... |>
  mutate(... = n/sum(n))

Exercise 14

Add the argument .by within the mutate() function. Set .by to age

... |>
  mutate(prop = n/sum(n), .by = ...)

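Adding the .by argument groups the tibble by age for this mutate() step only, so sum(n) is calculated within each age and prop becomes the proportion of each marital status at that age.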
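The effect of .by on a proportion can be seen with a small, made-up tibble. This sketch assumes dplyr 1.1.0 or later (for the .by argument) and R 4.1 or later (for the native pipe); the counts are hypothetical.

```r
library(dplyr)

# Hypothetical counts for two ages and two marital statuses.
counts <- tibble(
  age     = c(20, 20, 30, 30),
  marital = c("Married", "Never married", "Married", "Never married"),
  n       = c(1, 3, 2, 2)
)

# Without .by, sum(n) is the total over all rows (8).
overall <- counts |>
  mutate(prop = n / sum(n))

# With .by = age, sum(n) is computed separately for each age (4 and 4).
within_age <- counts |>
  mutate(prop = n / sum(n), .by = age)

within_age
```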

Exercise 15

Add ggplot() to the current pipe. Set x to age, y to prop and color to marital

... |>
  ggplot(aes(x = ..., y = ..., color = ...))

Exercise 16

Add geom_line() to the plot. Inside the function, set linewidth to 1.

... +
  geom_line(linewidth = ...)

Exercise 17

Add scale_color_brewer() to the plot, with the argument palette set to "Set1". Remember that ggplot components are joined with +, not |>.

... +
  scale_color_brewer(... = "Set1")

Exercise 18

Finish the plot with labs(), giving appropriate axis and legend titles.

... +
  labs(x = ..., y = ..., color = ...)

This graph is confusing to read as the colors assigned to the lines don't match up well with the legend. We can use fct_reorder2() to solve this problem.

Exercise 19

In the ggplot() function, change color from marital to fct_reorder2(marital, age, prop)

... |>
  ggplot(aes(x = age, y = prop, color = fct_reorder2(marital, age, prop))) +
  ...

Rearranging the legend makes the plot easier to read because the legend colors now match the order of the lines on the far right of the plot. fct_reorder2(f, x, y) reorders the factor f by the y values associated with the largest x values.
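A small sketch of that rule with made-up numbers. Each hypothetical line has two points, and the levels end up ordered by the y value at the largest x, from highest to lowest, which is the order in which the line ends would be read from top to bottom at the right edge of a plot.

```r
library(forcats)

# Hypothetical data: three lines (a, b, c), each observed at x = 1 and x = 2.
line <- factor(c("a", "a", "b", "b", "c", "c"))
x    <- c(1, 2, 1, 2, 1, 2)
y    <- c(5, 1, 1, 3, 2, 2)

# At the largest x (x = 2) the y values are a = 1, b = 3, c = 2,
# so the reordered levels run b, c, a.
reordered <- fct_reorder2(line, x, y)
levels(reordered)
```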

Modifying factor levels

More powerful than changing the orders of the levels is changing their values. This allows you to clarify labels for publication, and collapse levels for high-level displays. The most general and powerful tool is fct_recode(). It allows you to recode, or change, the value of each level.

Exercise 1

Pipe gss_cat to count(partyid)

... |> 
  count(...)

gss_cat |> 
  count(partyid)

The levels of partyid are terse and inconsistent. Let’s tweak them to be longer and use a parallel construction.

Exercise 2

Like most rename and recoding functions in the Tidyverse, the new values go on the left and the old values go on the right. Pipe gss_cat to mutate(). Within mutate(), use partyid = fct_recode(partyid, "Republican, weak" = "Not str republican") to change partyid.

gss_cat |>
  mutate(
    partyid = ...(partyid,
      "Republican, weak"      = ...
    )
  )

Note how the second and seventh values for partyid have been changed from "Not str republican" to "Republican, weak". fct_recode() is the easiest way to change the value for a given factor level. Sometimes, as here, we change the value "in place," that is, we replace partyid with partyid. Other times, we use mutate() to create a new variable.
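The same idea on a tiny, made-up factor, which makes the before and after easy to compare. The sketch also shows that levels not mentioned in the call are left alone, and that fct_recode() issues a warning when asked to recode a level that does not exist; the names party, recoded, and warned are hypothetical.

```r
library(forcats)

# Hypothetical factor with three of the partyid labels.
party <- factor(c("Strong republican", "Independent", "Not str republican"))

# New value on the left, old value on the right.
recoded <- fct_recode(party, "Republican, weak" = "Not str republican")
as.character(recoded)

# Referring to a level that is not present produces a warning;
# capture its message instead of printing it.
warned <- tryCatch(
  fct_recode(party, "Green" = "Green party"),
  warning = function(w) conditionMessage(w)
)
warned
```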

Exercise 3

Let's change all the values for partyid. Here is the mapping from new values to old values:

"Republican, strong"    = "Strong republican",
"Republican, weak"      = "Not str republican",
"Independent, near rep" = "Ind,near rep",
"Independent, near dem" = "Ind,near dem",
"Democrat, weak"        = "Not str democrat",
"Democrat, strong"      = "Strong democrat"

Use these within the call to fct_recode().

gss_cat |>
  mutate(
    partyid = fct_recode(partyid,
      "Republican, strong"    = "Strong republican",
      "Republican, weak"      = "Not str republican",
      "Independent, near rep" = "Ind,near rep",
      "Independent, near dem" = ...,
      ...       = "Not str democrat",
      "Democrat, strong"      = "Strong democrat"
    )
  )

fct_recode() will leave the levels that aren’t explicitly mentioned as they are, and will warn you if you accidentally refer to a level that doesn’t exist.

Exercise 4

To combine groups, you can assign multiple old levels to the same new level. With the same pipe as above, add this mapping:

"Other" = "No answer",
"Other" = "Don't know",
"Other" = "Other party"

gss_cat |>
  mutate(
    partyid = fct_recode(partyid,
      ...
    )
  )

Use this technique with care: if you group together categories that are truly different, you will end up with misleading results.

Exercise 5

Continue the pipe to count(partyid) to confirm that the recoding has worked.

... |> 
  count(partyid)

gss_cat |>
  mutate(
    partyid = fct_recode(partyid,
      "Republican, strong"    = "Strong republican",
      "Republican, weak"      = "Not str republican",
      "Independent, near rep" = "Ind,near rep",
      "Independent, near dem" = "Ind,near dem",
      "Democrat, weak"        = "Not str democrat",
      "Democrat, strong"      = "Strong democrat",
      "Other"                 = "No answer",
      "Other"                 = "Don't know",
      "Other"                 = "Other party")) |>
  count(partyid)

Read the help page for fct_recode() for more details.

Exercise 6

If you want to collapse a lot of levels, fct_collapse() is a useful variant of fct_recode(). For each new level, you can provide a vector of old levels. Replace the call to fct_recode() in the previous pipe with this:

fct_collapse(partyid,
  "other" = c("No answer", "Don't know", "Other party"),
  "rep" = c("Strong republican", "Not str republican"),
  "ind" = c("Ind,near rep", "Independent", "Ind,near dem"),
  "dem" = c("Not str democrat", "Strong democrat")
)

gss_cat |>
  ...(
    partyid = fct_collapse(...,
      "other" = c("No answer", "Don't know", "Other party"),
      "rep" = c("Strong republican", "Not str republican"),
      "ind" = c("Ind,near rep", "Independent", "Ind,near dem"),
      "dem" = c("Not str democrat", "Strong democrat")
    )
  ) |> 
  ...(partyid)

Read the help page for fct_collapse() for more details. The other_level argument is sometimes useful.
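For a quick look at what fct_collapse() does without loading the survey data, here is a sketch on a small, made-up factor. Levels that are not listed in any group, such as "Independent" below, are kept as they are.

```r
library(forcats)

# Hypothetical factor with five of the partyid labels.
party <- factor(c(
  "Strong republican", "Not str republican", "Independent",
  "Strong democrat", "Not str democrat"
))

# Each new level is given a vector of the old levels it should absorb.
collapsed <- fct_collapse(party,
  "rep" = c("Strong republican", "Not str republican"),
  "dem" = c("Not str democrat", "Strong democrat")
)
as.character(collapsed)
nlevels(collapsed)
```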

Exercise 7

Sometimes you just want to lump together the small groups to make a plot or table simpler. That’s the job of the fct_lump_*() family of functions.

Pipe gss_cat to mutate() with relig = fct_lump_lowfreq(relig) as its argument.

gss_cat |>
  mutate(relig = ...)

fct_lump_lowfreq() is a simple starting point that progressively lumps the smallest categories into “Other”, always keeping “Other” as the smallest category.

Exercise 8

Continue the pipe to the function count() with relig as its argument.

... |>
  count(...)

gss_cat |>
  mutate(relig = fct_lump_lowfreq(relig)) |>
  count(relig)

In this case it’s not very helpful: it is true that the majority of Americans in this survey are Protestant, but we’d probably like to see some more details!

Exercise 9

Instead, we can use fct_lump_n() to specify that we want exactly 10 groups. Pipe gss_cat to mutate() with relig = fct_lump_n(relig, n = 10) as its argument.

gss_cat |>
  mutate(relig = ...)

fct_lump_n() is particularly useful when you have a factor with many levels, but you're only interested in analyzing the most common ones. Without it, analyses can become cluttered and difficult to interpret.
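A minimal sketch of the lumping behavior on a made-up factor, small enough to check by hand. The counts are hypothetical; with n = 2, the two most common levels are kept and everything else is combined into "Other".

```r
library(forcats)

# Hypothetical responses: two common levels and two rare ones.
relig <- factor(c(
  rep("Protestant", 5), rep("Catholic", 3), "Jewish", "Buddhist"
))

# Keep the 2 most frequent levels; lump the rest into "Other".
lumped <- fct_lump_n(relig, n = 2)
table(lumped)
```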

Exercise 10

Continue the pipe to count(). Add relig as an argument, as well as sort = TRUE to count().

... |>
  count(relig, ...)

gss_cat |>
  mutate(relig = fct_lump_n(relig, n = 10)) |>
  count(relig, sort = TRUE)

Read the documentation to learn about fct_lump_min() and fct_lump_prop() which are useful in other cases.

Ordered factors

Ordered factors, created with ordered(), imply a strict ordering and equal distance between levels: the first level is “less than” the second level by the same amount that the second level is “less than” the third level, and so on.

Exercise 1

Run this code

ordered(c("a", "b", "c"))

ordered(c("a", "b", "c"))

You can recognize ordered factors when printing because they use < between the factor levels. We don't recommend using ordered factors unless you have a compelling reason for doing so.
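One practical difference is that ordered factors support comparisons between values. A base R sketch with a made-up size variable, where the levels are supplied explicitly so that the ordering is not alphabetical:

```r
# Hypothetical ordered factor; levels run from smallest to largest.
sizes <- ordered(
  c("small", "large", "medium"),
  levels = c("small", "medium", "large")
)
sizes

# Comparisons use the level order, not alphabetical order.
is_smaller <- sizes[1] < sizes[2]
is_smaller

# Functions such as max() also respect the ordering.
largest <- as.character(max(sizes))
largest
```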

Summary

This tutorial covered Chapter 16: Factors from R for Data Science (2e) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. forcats is the core Tidyverse package for working with categorical variables, called "factors" in R. Key commands include fct() for creating factors, fct_reorder() for changing the order of the levels, and fct_recode() for recoding factors.

If you want to learn more about factors, read "Wrangling categorical data in R" by Amelia McNamara and Nicholas Horton.

r4ds.tutorials documentation built on April 3, 2025, 5:50 p.m.
