slice | R Documentation |
slice() lets you index rows by their (integer) locations. It allows you to select, remove, and duplicate rows. It is accompanied by a number of helpers for common use cases:
slice_head() and slice_tail() select the first or last rows.
slice_sample() randomly selects rows.
slice_min() and slice_max() select rows with the smallest or largest values of a variable.
If .data is a grouped_df, the operation will be performed on each group, so that (e.g.) slice_head(df, n = 5) will select the first five rows in each group.
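As a minimal sketch of the per-group behaviour, the example below compares a grouped input with the `by` argument shown in the usage for the helpers. The toy tibble `df` and the object names are made up for illustration, and the sketch assumes a dplyr version that supports `by` (1.1.0 or later).

```r
library(dplyr)

# Hypothetical toy data: three groups of sizes 1, 2 and 4
df <- tibble(
  group = rep(c("a", "b", "c"), c(1, 2, 4)),
  x = 1:7
)

# Grouped input: up to two rows per group (group "a" only has one row)
grouped <- df %>% group_by(group) %>% slice_head(n = 2)

# Per-operation grouping with `by`: the same rows, but an ungrouped result
per_op <- df %>% slice_head(n = 2, by = group)

grouped$x
per_op$x
```

Both calls keep the same five rows here; the difference is that the `group_by()` version returns a grouped data frame, while the `by` version does not.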
slice(.data, ..., .by = NULL, .preserve = FALSE)
slice_head(.data, ..., n, prop, by = NULL)
slice_tail(.data, ..., n, prop, by = NULL)
slice_min(
  .data,
  order_by,
  ...,
  n,
  prop,
  by = NULL,
  with_ties = TRUE,
  na_rm = FALSE
)
slice_max(
  .data,
  order_by,
  ...,
  n,
  prop,
  by = NULL,
  with_ties = TRUE,
  na_rm = FALSE
)
slice_sample(.data, ..., n, prop, by = NULL, weight_by = NULL, replace = FALSE)
.data |
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details. |
... |
For slice(): <data-masking> Integer row values. Provide either positive values to keep, or negative values to drop. The values provided must be either all positive or all negative. Indices beyond the number of rows in the input are silently ignored. For slice_*(), these arguments are passed on to methods. |
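.by, by |
<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by. |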
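.preserve |
Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data; otherwise the grouping is kept as is. |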
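n, prop |
Provide either n, the number of rows, or prop, the proportion of rows to select. If neither is supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows. A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows. |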
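order_by |
<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble. |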
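with_ties |
Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows. |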
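na_rm |
Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop. |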
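weight_by |
<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. |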
replace |
Should sampling be performed with (TRUE) or without (FALSE, the default) replacement. |
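The interaction between n, prop and the group size is easier to see on a small example. The sketch below uses a made-up 8-row tibble and assumes a recent dplyr (1.1.0 or later), where an oversized n is truncated to the available rows, prop is rounded towards zero, and negative values are subtracted from the group size.

```r
library(dplyr)

# Hypothetical 8-row table
df <- tibble(id = 1:8)

# n larger than the table: silently truncated to all 8 rows
all_rows <- df %>% slice_head(n = 20)

# Negative n: drop the last 2 rows, keeping 8 - 2 = 6
drop_two <- df %>% slice_head(n = -2)

# prop is rounded towards zero: 0.3 * 8 = 2.4, so 2 rows are kept
two_rows <- df %>% slice_tail(prop = 0.3)

# Negative prop: 8 * (1 - 0.25) = 6 rows are kept
six_rows <- df %>% slice_tail(prop = -0.25)
```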
Slice does not work with relational databases because they have no intrinsic notion of row order. If you want to perform the equivalent operation, use filter() and row_number().
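A minimal sketch of the filter() and row_number() approach, run on a local tibble rather than a database. The table `df` and its ordering column `id` are hypothetical. The point is that the row number is computed from an explicit variable instead of relying on physical row order; how this translates to SQL depends on the backend and is not shown here.

```r
library(dplyr)

# Hypothetical table with an explicit ordering column
df <- tibble(id = c(10, 20, 30, 40), name = c("a", "b", "c", "d"))

# slice() relies on the physical row order
by_position <- df %>% slice(2:3)

# Order-explicit equivalent: rank rows by `id`, then filter on the rank
by_rank <- df %>% filter(between(row_number(id), 2, 3))

by_position$id
by_rank$id
```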
For slice_sample(), note that the weights provided in weight_by are passed through to the prob argument of base::sample.int(). This means they cannot be used to reconstruct summary statistics from the underlying population. See this discussion for more details.
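To illustrate that weight_by acts as a set of relative selection probabilities, the sketch below gives two rows a weight of zero. The tibble and its column `w` are invented for the example; the only behaviour relied on is that zero-weight rows are never drawn.

```r
library(dplyr)

# Hypothetical data: rows 1 and 2 have zero weight
df <- tibble(id = 1:4, w = c(0, 0, 1, 3))

set.seed(1)

# Without replacement: only the two positively weighted rows can be drawn
picked <- df %>% slice_sample(n = 2, weight_by = w)

# With replacement: row 4 should appear roughly three times as often as row 3
draws <- df %>% slice_sample(n = 1000, weight_by = w, replace = TRUE)
count(draws, id)
```

The counts in `draws` reflect the sampling probabilities, not population totals, which is why such weights are not a substitute for survey-style weights.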
An object of the same type as .data. The output has the following properties:
Each row may appear 0, 1, or many times in the output.
Columns are not modified.
Groups are not modified.
Data frame attributes are preserved.
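The properties listed above can be checked on a small grouped tibble. This is only a sketch with made-up data: it repeats the first row of each group and then inspects the columns and grouping of the result.

```r
library(dplyr)

# Hypothetical grouped data: group "a" has two rows, group "b" has one
df <- tibble(g = c("a", "a", "b"), x = 1:3) %>% group_by(g)

# Within each group, take row 1 twice; row 2 of group "a" appears zero times
out <- df %>% slice(c(1, 1))

out$x            # each group's first row, duplicated
names(out)       # columns are unchanged
group_vars(out)  # the result is still grouped by g
```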
These functions are generics, which means that packages can provide implementations (methods) for other classes. See the documentation of individual methods for extra arguments and differences in behaviour.
Other single table verbs: arrange(), filter(), mutate(), reframe(), rename(), select(), summarise()
library(dplyr)
# Similar to head(mtcars, 1):
mtcars %>% slice(1L)
# Similar to tail(mtcars, 1):
mtcars %>% slice(n())
# Keep rows 5 through the last row:
mtcars %>% slice(5:n())
# Rows can be dropped with negative indices:
slice(mtcars, -(1:4))
# First and last rows based on existing order
mtcars %>% slice_head(n = 5)
mtcars %>% slice_tail(n = 5)
# Rows with minimum and maximum values of a variable
mtcars %>% slice_min(mpg, n = 5)
mtcars %>% slice_max(mpg, n = 5)
# slice_min() and slice_max() may return more rows than requested
# in the presence of ties.
mtcars %>% slice_min(cyl, n = 1)
# Use with_ties = FALSE to return exactly n matches
mtcars %>% slice_min(cyl, n = 1, with_ties = FALSE)
# Or use additional variables to break the tie:
mtcars %>% slice_min(tibble(cyl, mpg), n = 1)
# slice_sample() allows you to randomly select rows with or without replacement
mtcars %>% slice_sample(n = 5)
mtcars %>% slice_sample(n = 5, replace = TRUE)
# You can optionally weight by a variable - this code weights by the
# physical weight of the cars, so heavy cars are more likely to get
# selected.
mtcars %>% slice_sample(weight_by = wt, n = 5)
# Group wise operation ----------------------------------------
df <- tibble(
  group = rep(c("a", "b", "c"), c(1, 2, 4)),
  x = runif(7)
)
# All slice helpers operate per group, silently truncating to the group
# size, so the following code works without error
df %>% group_by(group) %>% slice_head(n = 2)
# When specifying the proportion of rows to include, non-integer sizes
# are rounded down, so group a gets 0 rows
df %>% group_by(group) %>% slice_head(prop = 0.5)
# Filter equivalents --------------------------------------------
# slice() expressions can often be written to use `filter()` and
# `row_number()`, which can also be translated to SQL. For many databases,
# you'll need to supply an explicit variable to use to compute the row number.
filter(mtcars, row_number() == 1L)
filter(mtcars, row_number() == n())
filter(mtcars, between(row_number(), 5, n()))
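The examples above do not exercise na_rm, so here is a small sketch of how missing values in order_by appear to be handled. It uses a made-up three-row tibble and assumes dplyr 1.1.0 or later, where na_rm is available and missing values are only included when there are too few non-missing values to reach n.

```r
library(dplyr)

# Hypothetical data with one missing value
df <- tibble(x = c(3, NA, 1))

# Default na_rm = FALSE: NA sorts last and is used only to reach n
with_na <- df %>% slice_min(x, n = 3)

# na_rm = TRUE: the missing value is never returned
without_na <- df %>% slice_min(x, n = 3, na_rm = TRUE)

# n = 2 can be satisfied by the non-missing values alone
top_two <- df %>% slice_min(x, n = 2)
```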