filter: Return rows with matching conditions


View source: R/manip.r

Description

Use filter() to choose rows/cases where conditions are true. Unlike base subsetting with [, rows where the condition evaluates to NA are dropped.
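The difference from `[` can be seen on a small data frame. This is a minimal sketch: the data frame `df` and its column `x` are made up for illustration and are not part of the original documentation.

```r
library(dplyr)

# Hypothetical toy data: one value is missing
df <- data.frame(x = c(1, NA, 3))

# Base subsetting returns a row of NAs where the condition is NA
base_result <- df[df$x > 1, , drop = FALSE]

# filter() drops rows where the condition is NA
dplyr_result <- filter(df, x > 1)

nrow(base_result)   # 2: the NA row and the row with x = 3
nrow(dplyr_result)  # 1: only the row with x = 3
```

To keep the rows with missing values, the condition has to ask for them explicitly, for example `filter(df, x > 1 | is.na(x))`.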

Usage

filter(.data, ..., .preserve = FALSE)

Arguments

.data

A tbl. All main verbs are S3 generics and provide methods for tbl_df(), dtplyr::tbl_dt() and dbplyr::tbl_dbi().

...

Logical predicates defined in terms of the variables in .data. Multiple conditions are combined with &. Only rows where the condition evaluates to TRUE are kept.

The arguments in ... are automatically quoted and evaluated in the context of the data frame. They support unquoting and splicing. See vignette("programming") for an introduction to these concepts.
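As a rough illustration of unquoting and splicing, the sketch below builds filter conditions programmatically. The names `df`, `threshold` and `conds` are hypothetical, and `rlang::exprs()` is assumed here as the way to capture the conditions; the programming vignette remains the authoritative introduction.

```r
library(dplyr)

# Hypothetical toy data
df <- tibble(x = c(1, 5, 10), y = c("a", "b", "b"))

# Unquote a single value with !!
threshold <- 2
unquoted <- filter(df, x > !!threshold)

# Splice a list of quoted conditions with !!!
conds <- rlang::exprs(x > 2, y == "b")
spliced <- filter(df, !!!conds)

# Splicing gives the same rows as writing the conditions out by hand
by_hand <- filter(df, x > 2, y == "b")
```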

.preserve

When FALSE (the default), the grouping structure is recalculated based on the resulting data; otherwise it is kept as is.
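The effect of .preserve is easiest to see when a filter empties one of the groups. The following is a sketch on made-up data (`df`, `g` and `x` are illustrative names), comparing the number of groups under both settings.

```r
library(dplyr)

# Hypothetical toy data with two groups, "a" and "b"
df <- tibble(g = c("a", "a", "b"), x = 1:3)
grouped <- group_by(df, g)

# Default: the grouping is recalculated from the rows that remain,
# so a group left without rows is no longer part of the result
recalculated <- filter(grouped, x < 3)

# .preserve = TRUE: the original grouping structure is kept as is
preserved <- filter(grouped, x < 3, .preserve = TRUE)

# Compare the number of groups in each result
n_groups(recalculated)
n_groups(preserved)
```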

Details

Note that dplyr is not yet smart enough to optimise filtering on grouped datasets that don't need grouped calculations. For this reason, filtering is often considerably faster on ungroup()ed data.
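When the condition does not depend on the groups, one possible pattern is to drop the grouping before filtering and restore it afterwards. This is only a sketch of that idea; whether it pays off depends on the data, and no timings are claimed here.

```r
library(dplyr)

grouped <- group_by(starwars, gender)

# `height > 200` is a plain row-wise condition, so the grouping is
# not needed while filtering; it is removed first and restored after
result <- grouped %>%
  ungroup() %>%
  filter(height > 200) %>%
  group_by(gender)
```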

Value

An object of the same class as .data.

Useful filter functions

==, >, >= etc.

&, |, !, xor()

is.na()

between(), near()
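A few helpers that are commonly combined with filter() are sketched below on made-up data (`df`, `x` and `y` are illustrative names). between() and near() are dplyr functions; is.na() and xor() come from base R.

```r
library(dplyr)

# Hypothetical toy data; sqrt(2)^2 is not exactly equal to 2
df <- tibble(x = c(1, 2.5, NA, 4), y = c(sqrt(2)^2, 2, 3, 5))

# x in the closed interval [1, 3]
in_range <- filter(df, between(x, 1, 3))

# Tolerant comparison for doubles, where `y == 2` would miss a row
close_to_two <- filter(df, near(y, 2))

# Keep only rows where x is not missing
non_missing <- filter(df, !is.na(x))

# Exactly one of the two conditions is TRUE
exactly_one <- filter(df, xor(x > 2, y > 4))
```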

Grouped tibbles

Because filtering expressions are computed within groups, they may yield different results on grouped tibbles. This will be the case as soon as an aggregating, lagging, or ranking function is involved. Compare this ungrouped filtering:

starwars %>% filter(mass > mean(mass, na.rm = TRUE))

With the grouped equivalent:

starwars %>% group_by(gender) %>% filter(mass > mean(mass, na.rm = TRUE))

The former keeps rows with mass greater than the global average whereas the latter keeps rows with mass greater than the gender average.

It is valid to use grouping variables in filter expressions.

When applied on a grouped tibble, filter() automatically rearranges the tibble by groups for performance reasons.
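As a small sketch of using a grouping variable inside the filter expression, the example below combines a condition on the grouping column with a per-group summary. The data frame `df` and its columns `g` and `x` are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data with two groups
df <- tibble(g = c("a", "a", "b", "b"), x = c(1, 3, 10, 20))

# `g` is the grouping variable and is used directly in the condition;
# mean(x) is computed within each group
result <- df %>%
  group_by(g) %>%
  filter(g == "a", x > mean(x))
```

Only the row of group "a" whose `x` exceeds that group's mean of 2 is kept.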

Tidy data

When applied to a data frame, row names are silently dropped. To preserve them, convert them to an explicit variable with tibble::rownames_to_column().
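A minimal sketch of that conversion, using the built-in mtcars data set, whose car models are stored as row names. The column name `model` is an arbitrary choice made for this example.

```r
library(dplyr)

# Move the row names into an explicit column before filtering,
# so the car model survives regardless of how row names are handled
cars <- tibble::rownames_to_column(mtcars, var = "model")

efficient <- filter(cars, mpg > 30)
efficient$model
```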

Scoped filtering

The three scoped variants (filter_all(), filter_if() and filter_at()) make it easy to apply a filtering condition to a selection of variables.
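The three variants can be sketched on a small made-up tibble (`df` and its columns are illustrative). The helpers any_vars(), all_vars() and vars() are the dplyr functions used with the scoped verbs.

```r
library(dplyr)

# Hypothetical toy data with missing values
df <- tibble(a = c(1, NA, 3), b = c(NA, NA, 6), id = c("r1", "r2", "r3"))

# filter_all(): rows where any column is missing
any_missing <- filter_all(df, any_vars(is.na(.)))

# filter_at(): rows where the selected column `a` is not missing
a_present <- filter_at(df, vars(a), all_vars(!is.na(.)))

# filter_if(): rows where every numeric column is not missing
complete_numeric <- filter_if(df, is.numeric, all_vars(!is.na(.)))
```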

See Also

filter_all(), filter_if() and filter_at().

Other single table verbs: arrange, mutate, select, slice, summarise

Examples

filter(starwars, species == "Human")
filter(starwars, mass > 1000)

# Multiple criteria
filter(starwars, hair_color == "none" & eye_color == "black")
filter(starwars, hair_color == "none" | eye_color == "black")

# Multiple arguments are combined with `&`, so this is equivalent to
# the first "multiple criteria" example above
filter(starwars, hair_color == "none", eye_color == "black")


# The filtering operation may yield different results on grouped
# tibbles because the expressions are computed within groups.
#
# The following keeps rows where `mass` is greater than the
# global average:
starwars %>% filter(mass > mean(mass, na.rm = TRUE))

# Whereas this keeps rows with `mass` greater than the gender
# average:
starwars %>% group_by(gender) %>% filter(mass > mean(mass, na.rm = TRUE))


# Refer to column names stored as strings with the `.data` pronoun:
vars <- c("mass", "height")
cond <- c(80, 150)
starwars %>%
  filter(
    .data[[vars[[1]]]] > cond[[1]],
    .data[[vars[[2]]]] > cond[[2]]
  )

# For more complex cases, knowledge of tidy evaluation and the
# unquote operator `!!` is required. See https://tidyeval.tidyverse.org/
#
# One useful and simple tidy eval technique is to use `!!` to bypass
# the data frame and its columns. Here is how to filter the columns
# `mass` and `height` relative to objects of the same names:
mass <- 80
height <- 150
filter(starwars, mass > !!mass, height > !!height)

Example output

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

# A tibble: 35 x 13
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 Luke~    172    77 blond      fair       blue            19   male  
 2 Dart~    202   136 none       white      yellow          41.9 male  
 3 Leia~    150    49 brown      light      brown           19   female
 4 Owen~    178   120 brown, gr~ light      blue            52   male  
 5 Beru~    165    75 brown      light      blue            47   female
 6 Bigg~    183    84 black      light      brown           24   male  
 7 Obi-~    182    77 auburn, w~ fair       blue-gray       57   male  
 8 Anak~    188    84 blond      fair       blue            41.9 male  
 9 Wilh~    180    NA auburn, g~ fair       blue            64   male  
10 Han ~    180    80 brown      fair       brown           29   male  
# ... with 25 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#   films <list>, vehicles <list>, starships <list>
# A tibble: 1 x 13
  name  height  mass hair_color skin_color eye_color birth_year gender homeworld
  <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>  <chr>    
1 Jabb~    175  1358 <NA>       green-tan~ orange           600 herma~ Nal Hutta
# ... with 4 more variables: species <chr>, films <list>, vehicles <list>,
#   starships <list>
# A tibble: 9 x 13
  name  height  mass hair_color skin_color eye_color birth_year gender homeworld
  <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>  <chr>    
1 Nien~    160    68 none       grey       black             NA male   Sullust  
2 Gasg~    122    NA none       white, bl~ black             NA male   Troiken  
3 Kit ~    196    87 none       green      black             NA male   Glee Ans~
4 Plo ~    188    80 none       orange     black             22 male   Dorin    
5 Lama~    229    88 none       grey       black             NA male   Kamino   
6 Taun~    213    NA none       grey       black             NA female Kamino   
7 Shaa~    178    57 none       red, blue~ black             NA female Shili    
8 Tion~    206    80 none       grey       black             NA male   Utapau   
9 BB8       NA    NA none       none       black             NA none   <NA>     
# ... with 4 more variables: species <chr>, films <list>, vehicles <list>,
#   starships <list>
# A tibble: 38 x 13
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 Dart~    202   136 none       white      yellow          41.9 male  
 2 Gree~    173    74 <NA>       green      black           44   male  
 3 IG-88    200   140 none       metal      red             15   none  
 4 Bossk    190   113 none       green      red             53   male  
 5 Lobot    175    79 none       light      blue            37   male  
 6 Ackb~    180    83 none       brown mot~ orange          41   male  
 7 Nien~    160    68 none       grey       black           NA   male  
 8 Nute~    191    90 none       mottled g~ red             NA   male  
 9 Jar ~    196    66 none       orange     orange          52   male  
10 Roos~    224    82 none       grey       orange          NA   male  
# ... with 28 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#   films <list>, vehicles <list>, starships <list>
# A tibble: 9 x 13
  name  height  mass hair_color skin_color eye_color birth_year gender homeworld
  <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>  <chr>    
1 Nien~    160    68 none       grey       black             NA male   Sullust  
2 Gasg~    122    NA none       white, bl~ black             NA male   Troiken  
3 Kit ~    196    87 none       green      black             NA male   Glee Ans~
4 Plo ~    188    80 none       orange     black             22 male   Dorin    
5 Lama~    229    88 none       grey       black             NA male   Kamino   
6 Taun~    213    NA none       grey       black             NA female Kamino   
7 Shaa~    178    57 none       red, blue~ black             NA female Shili    
8 Tion~    206    80 none       grey       black             NA male   Utapau   
9 BB8       NA    NA none       none       black             NA none   <NA>     
# ... with 4 more variables: species <chr>, films <list>, vehicles <list>,
#   starships <list>
# A tibble: 10 x 13
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 Dart~    202   136 none       white      yellow          41.9 male  
 2 Owen~    178   120 brown, gr~ light      blue            52   male  
 3 Chew~    228   112 brown      unknown    blue           200   male  
 4 Jabb~    175  1358 <NA>       green-tan~ orange         600   herma~
 5 Jek ~    180   110 brown      fair       blue            NA   male  
 6 IG-88    200   140 none       metal      red             15   none  
 7 Bossk    190   113 none       green      red             53   male  
 8 Dext~    198   102 none       brown      yellow          NA   male  
 9 Grie~    216   159 none       brown, wh~ green, y~       NA   male  
10 Tarf~    234   136 brown      brown      blue            NA   male  
# ... with 5 more variables: homeworld <chr>, species <chr>, films <list>,
#   vehicles <list>, starships <list>
# A tibble: 25 x 13
# Groups:   gender [3]
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 C-3PO    167    75 <NA>       gold       yellow         112   <NA>  
 2 Dart~    202   136 none       white      yellow          41.9 male  
 3 Owen~    178   120 brown, gr~ light      blue            52   male  
 4 Beru~    165    75 brown      light      blue            47   female
 5 Bigg~    183    84 black      light      brown           24   male  
 6 Anak~    188    84 blond      fair       blue            41.9 male  
 7 Chew~    228   112 brown      unknown    blue           200   male  
 8 Jek ~    180   110 brown      fair       blue            NA   male  
 9 Bossk    190   113 none       green      red             53   male  
10 Ackb~    180    83 none       brown mot~ orange          41   male  
# ... with 15 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#   films <list>, vehicles <list>, starships <list>
# A tibble: 21 x 13
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 Dart~    202   136 none       white      yellow          41.9 male  
 2 Owen~    178   120 brown, gr~ light      blue            52   male  
 3 Bigg~    183    84 black      light      brown           24   male  
 4 Anak~    188    84 blond      fair       blue            41.9 male  
 5 Chew~    228   112 brown      unknown    blue           200   male  
 6 Jabb~    175  1358 <NA>       green-tan~ orange         600   herma~
 7 Jek ~    180   110 brown      fair       blue            NA   male  
 8 IG-88    200   140 none       metal      red             15   none  
 9 Bossk    190   113 none       green      red             53   male  
10 Ackb~    180    83 none       brown mot~ orange          41   male  
# ... with 11 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#   films <list>, vehicles <list>, starships <list>
# A tibble: 21 x 13
   name  height  mass hair_color skin_color eye_color birth_year gender
   <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> 
 1 Dart~    202   136 none       white      yellow          41.9 male  
 2 Owen~    178   120 brown, gr~ light      blue            52   male  
 3 Bigg~    183    84 black      light      brown           24   male  
 4 Anak~    188    84 blond      fair       blue            41.9 male  
 5 Chew~    228   112 brown      unknown    blue           200   male  
 6 Jabb~    175  1358 <NA>       green-tan~ orange         600   herma~
 7 Jek ~    180   110 brown      fair       blue            NA   male  
 8 IG-88    200   140 none       metal      red             15   none  
 9 Bossk    190   113 none       green      red             53   male  
10 Ackb~    180    83 none       brown mot~ orange          41   male  
# ... with 11 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#   films <list>, vehicles <list>, starships <list>

dplyr documentation built on March 13, 2020, 2:02 a.m.