mutate: Create or transform variables

Description Usage Arguments Value Useful functions available in calculations of variables Grouped tibbles Scoped mutation and transmutation Tidy data See Also Examples

Description

mutate() adds new variables and preserves existing ones; transmute() adds new variables and drops existing ones. Both functions preserve the number of rows of the input. New variables overwrite existing variables of the same name.

Usage

1
2
3

Arguments

.data

A tbl. All main verbs are S3 generics and provide methods for tbl_df(), dtplyr::tbl_dt() and dbplyr::tbl_dbi().

...

Name-value pairs of expressions, each with length 1 or the same length as the number of rows in the group (if using group_by()) or in the entire input (if not using groups). The name of each argument will be the name of a new variable, and the value will be its corresponding value. Use a NULL value in mutate to drop a variable. New variables overwrite existing variables of the same name.

The arguments in ... are automatically quoted and evaluated in the context of the data frame. They support unquoting and splicing. See vignette("programming") for an introduction to these concepts.

Value

An object of the same class as .data.

Useful functions available in calculations of variables

Grouped tibbles

Because mutating expressions are computed within groups, they may yield different results on grouped tibbles. This will be the case as soon as an aggregating, lagging, or ranking function is involved. Compare this ungrouped mutate:

1
2
3
starwars %>%
  mutate(mass / mean(mass, na.rm = TRUE)) %>%
  pull()

With the grouped equivalent:

1
2
3
4
starwars %>%
  group_by(gender) %>%
  mutate(mass / mean(mass, na.rm = TRUE)) %>%
  pull()

The former normalises mass by the global average whereas the latter normalises by the averages within gender levels.

Note that you can't overwrite a grouping variable within mutate().

mutate() does not evaluate the expressions when the group is empty.

Scoped mutation and transmutation

The three scoped variants of mutate() (mutate_all(), mutate_if() and mutate_at()) and the three variants of transmute() (transmute_all(), transmute_if(), transmute_at()) make it easy to apply a transformation to a selection of variables.

Tidy data

When applied to a data frame, row names are silently dropped. To preserve, convert to an explicit variable with tibble::rownames_to_column().

See Also

Other single table verbs: arrange, filter, select, slice, summarise

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
# Newly created variables are available immediately
mtcars %>% as_tibble() %>% mutate(
  cyl2 = cyl * 2,
  cyl4 = cyl2 * 2
)

# You can also use mutate() to remove variables and
# modify existing variables
mtcars %>% as_tibble() %>% mutate(
  mpg = NULL,
  disp = disp * 0.0163871 # convert to litres
)


# window functions are useful for grouped mutates
mtcars %>%
 group_by(cyl) %>%
 mutate(rank = min_rank(desc(mpg)))
# see `vignette("window-functions")` for more details

# You can drop variables by setting them to NULL
mtcars %>% mutate(cyl = NULL)

# mutate() vs transmute --------------------------
# mutate() keeps all existing variables
mtcars %>%
  mutate(displ_l = disp / 61.0237)

# transmute keeps only the variables you create
mtcars %>%
  transmute(displ_l = disp / 61.0237)


# The mutate operation may yield different results on grouped
# tibbles because the expressions are computed within groups.
# The following normalises `mass` by the global average:
starwars %>%
  mutate(mass / mean(mass, na.rm = TRUE)) %>%
  pull()

# Whereas this normalises `mass` by the averages within gender
# levels:
starwars %>%
  group_by(gender) %>%
  mutate(mass / mean(mass, na.rm = TRUE)) %>%
  pull()

# Note that you can't overwrite grouping variables:
gdf <- mtcars %>% group_by(cyl)
try(mutate(gdf, cyl = cyl * 100))


# Refer to column names stored as strings with the `.data` pronoun:
vars <- c("mass", "height")
mutate(starwars, prod = .data[[vars[[1]]]] * .data[[vars[[2]]]])

# For more complex cases, knowledge of tidy evaluation and the
# unquote operator `!!` is required. See https://tidyeval.tidyverse.org/
#
# One useful and simple tidy eval technique is to use `!!` to
# bypass the data frame and its columns. Here is how to divide the
# column `mass` by an object of the same name:
mass <- 100
mutate(starwars, mass = mass / !!mass)

Example output

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

# A tibble: 32 x 13
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb  cyl2  cyl4
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4    12    24
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4    12    24
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1     8    16
 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1    12    24
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2    16    32
 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1    12    24
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4    16    32
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2     8    16
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2     8    16
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4    12    24
# ... with 22 more rows
# A tibble: 32 x 10
     cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1     6  2.62   110  3.9   2.62  16.5     0     1     4     4
 2     6  2.62   110  3.9   2.88  17.0     0     1     4     4
 3     4  1.77    93  3.85  2.32  18.6     1     1     4     1
 4     6  4.23   110  3.08  3.22  19.4     1     0     3     1
 5     8  5.90   175  3.15  3.44  17.0     0     0     3     2
 6     6  3.69   105  2.76  3.46  20.2     1     0     3     1
 7     8  5.90   245  3.21  3.57  15.8     0     0     3     4
 8     4  2.40    62  3.69  3.19  20       1     0     4     2
 9     4  2.31    95  3.92  3.15  22.9     1     0     4     2
10     6  2.75   123  3.92  3.44  18.3     1     0     4     4
# ... with 22 more rows
# A tibble: 32 x 12
# Groups:   cyl [3]
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb  rank
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4     2
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4     2
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1     8
 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1     1
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2     2
 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1     6
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4    11
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2     7
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2     8
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4     5
# ... with 22 more rows
    mpg  disp  hp drat    wt  qsec vs am gear carb
1  21.0 160.0 110 3.90 2.620 16.46  0  1    4    4
2  21.0 160.0 110 3.90 2.875 17.02  0  1    4    4
3  22.8 108.0  93 3.85 2.320 18.61  1  1    4    1
4  21.4 258.0 110 3.08 3.215 19.44  1  0    3    1
5  18.7 360.0 175 3.15 3.440 17.02  0  0    3    2
6  18.1 225.0 105 2.76 3.460 20.22  1  0    3    1
7  14.3 360.0 245 3.21 3.570 15.84  0  0    3    4
8  24.4 146.7  62 3.69 3.190 20.00  1  0    4    2
9  22.8 140.8  95 3.92 3.150 22.90  1  0    4    2
10 19.2 167.6 123 3.92 3.440 18.30  1  0    4    4
11 17.8 167.6 123 3.92 3.440 18.90  1  0    4    4
12 16.4 275.8 180 3.07 4.070 17.40  0  0    3    3
13 17.3 275.8 180 3.07 3.730 17.60  0  0    3    3
14 15.2 275.8 180 3.07 3.780 18.00  0  0    3    3
15 10.4 472.0 205 2.93 5.250 17.98  0  0    3    4
16 10.4 460.0 215 3.00 5.424 17.82  0  0    3    4
17 14.7 440.0 230 3.23 5.345 17.42  0  0    3    4
18 32.4  78.7  66 4.08 2.200 19.47  1  1    4    1
19 30.4  75.7  52 4.93 1.615 18.52  1  1    4    2
20 33.9  71.1  65 4.22 1.835 19.90  1  1    4    1
21 21.5 120.1  97 3.70 2.465 20.01  1  0    3    1
22 15.5 318.0 150 2.76 3.520 16.87  0  0    3    2
23 15.2 304.0 150 3.15 3.435 17.30  0  0    3    2
24 13.3 350.0 245 3.73 3.840 15.41  0  0    3    4
25 19.2 400.0 175 3.08 3.845 17.05  0  0    3    2
26 27.3  79.0  66 4.08 1.935 18.90  1  1    4    1
27 26.0 120.3  91 4.43 2.140 16.70  0  1    5    2
28 30.4  95.1 113 3.77 1.513 16.90  1  1    5    2
29 15.8 351.0 264 4.22 3.170 14.50  0  1    5    4
30 19.7 145.0 175 3.62 2.770 15.50  0  1    5    6
31 15.0 301.0 335 3.54 3.570 14.60  0  1    5    8
32 21.4 121.0 109 4.11 2.780 18.60  1  1    4    2
    mpg cyl  disp  hp drat    wt  qsec vs am gear carb  displ_l
1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 2.621932
2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 2.621932
3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 1.769804
4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 4.227866
5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 5.899347
6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 3.687092
7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 5.899347
8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 2.403984
9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2 2.307300
10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 2.746474
11 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4 2.746474
12 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3 4.519556
13 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3 4.519556
14 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3 4.519556
15 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4 7.734700
16 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4 7.538055
17 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4 7.210313
18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1 1.289663
19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2 1.240502
20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1 1.165121
21 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1 1.968088
22 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2 5.211090
23 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2 4.981671
24 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4 5.735477
25 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2 6.554830
26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1 1.294579
27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2 1.971365
28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2 1.558411
29 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4 5.751864
30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6 2.376126
31 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8 4.932510
32 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2 1.982836
    displ_l
1  2.621932
2  2.621932
3  1.769804
4  4.227866
5  5.899347
6  3.687092
7  5.899347
8  2.403984
9  2.307300
10 2.746474
11 2.746474
12 4.519556
13 4.519556
14 4.519556
15 7.734700
16 7.538055
17 7.210313
18 1.289663
19 1.240502
20 1.165121
21 1.968088
22 5.211090
23 4.981671
24 5.735477
25 6.554830
26 1.294579
27 1.971365
28 1.558411
29 5.751864
30 2.376126
31 4.932510
32 1.982836
 [1]  0.7912704  0.7707179  0.3288397  1.3975685  0.5035357  1.2331487
 [7]  0.7707179  0.3288397  0.8632041  0.7912704  0.8632041         NA
[13]  1.1509388  0.8220991  0.7604417 13.9551329  0.7912704  1.1303863
[19]  0.1746961  0.7707179  0.8036019  1.4386735  1.1612150  0.8118229
[25]  0.8118229  0.8529279         NA         NA  0.2055248  0.6987843
[31]  0.9145853  0.9248615         NA  0.6782318  0.8426516         NA
[37]         NA         NA  0.4110496         NA         NA  0.8220991
[43]         NA  0.5651932  0.4624308         NA  0.6679556  0.8632041
[49]  0.8426516  0.8940328         NA  0.5138120         NA         NA
[55]  0.8220991         NA  0.8734803         NA         NA  0.8220991
[61]  0.5775246  0.5138120         NA  0.8220991         NA  0.8118229
[67]  0.5651932  1.0481764  0.9043091         NA         NA  0.1541436
[73]         NA  0.4932595         NA  0.5857456  1.6339220  1.3975685
[79]  0.8118229  0.4932595  0.8220991         NA         NA         NA
[85]         NA         NA  0.4624308
 [1] 0.9505639 1.6187050 0.6906475 1.6789181 0.9070715 1.4813984 1.3883747
 [8] 0.6906475 1.0369788 0.9505639 1.0369788        NA 1.3826385 0.9875989
[15] 0.9135290 1.0000000 0.9505639 1.3579485 0.2098648 0.9258740 0.9653779
[22] 1.0000000 1.3949834 0.9752539 0.9752539 1.0246339        NA        NA
[29] 0.2468997 0.8394591 1.0987038 1.1110488        NA 0.8147691 1.0122889
[36]        NA        NA        NA 0.4937995        NA        NA 0.9875989
[43]        NA 1.0181414 0.5555244        NA 0.8024241 1.0369788 1.0122889
[50] 1.0740138        NA 0.9255831        NA        NA 0.9875989        NA
[57] 1.0493238        NA        NA 0.9875989 1.0403554 0.9255831        NA
[64] 0.9875989        NA 0.9752539 1.0181414 1.2591886 1.0863588        NA
[71]        NA 0.1851748        NA 0.5925593        NA 1.0551648 1.9628528
[78] 1.6789181 0.9752539 0.8885598 0.9875989        NA        NA        NA
[85]        NA        NA 0.8330248
Error in mutate_impl(.data, dots) : 
  Column `cyl` can't be modified because it's a grouping variable
# A tibble: 87 x 14
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 Luke~    172    77 blond      fair       blue            19   male  
 2 C-3PO    167    75 <NA>       gold       yellow         112   <NA>  
 3 R2-D2     96    32 <NA>       white, bl~ red             33   <NA>  
 4 Dart~    202   136 none       white      yellow          41.9 male  
 5 Leia~    150    49 brown      light      brown           19   female
 6 Owen~    178   120 brown, gr~ light      blue            52   male  
 7 Beru~    165    75 brown      light      blue            47   female
 8 R5-D4     97    32 <NA>       white, red red             NA   <NA>  
 9 Bigg~    183    84 black      light      brown           24   male  
10 Obi-~    182    77 auburn, w~ fair       blue-gray       57   male  
# ... with 77 more rows, and 6 more variables: homeworld <chr>, species <chr>,
#   films <list>, vehicles <list>, starships <list>, prod <dbl>
# A tibble: 87 x 13
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 Luke~    172  0.77 blond      fair       blue            19   male  
 2 C-3PO    167  0.75 <NA>       gold       yellow         112   <NA>  
 3 R2-D2     96  0.32 <NA>       white, bl~ red             33   <NA>  
 4 Dart~    202  1.36 none       white      yellow          41.9 male  
 5 Leia~    150  0.49 brown      light      brown           19   female
 6 Owen~    178  1.2  brown, gr~ light      blue            52   male  
 7 Beru~    165  0.75 brown      light      blue            47   female
 8 R5-D4     97  0.32 <NA>       white, red red             NA   <NA>  
 9 Bigg~    183  0.84 black      light      brown           24   male  
10 Obi-~    182  0.77 auburn, w~ fair       blue-gray       57   male  
# ... with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#   films <list>, vehicles <list>, starships <list>

dplyr documentation built on July 4, 2019, 5:08 p.m.