fslice | R Documentation |
dplyr::slice()
When there are lots of groups, the fslice() functions are much faster.
fslice(data, ..., .by = NULL, keep_order = FALSE, sort_groups = TRUE)

fslice_head(
  data,
  ...,
  n,
  prop,
  .by = NULL,
  keep_order = FALSE,
  sort_groups = TRUE
)

fslice_tail(
  data,
  ...,
  n,
  prop,
  .by = NULL,
  keep_order = FALSE,
  sort_groups = TRUE
)

fslice_min(
  data,
  order_by,
  ...,
  n,
  prop,
  .by = NULL,
  with_ties = TRUE,
  na_rm = FALSE,
  keep_order = FALSE,
  sort_groups = TRUE
)

fslice_max(
  data,
  order_by,
  ...,
  n,
  prop,
  .by = NULL,
  with_ties = TRUE,
  na_rm = FALSE,
  keep_order = FALSE,
  sort_groups = TRUE
)

fslice_sample(
  data,
  n,
  replace = FALSE,
  prop,
  .by = NULL,
  keep_order = FALSE,
  sort_groups = TRUE,
  weights = NULL,
  seed = NULL
)
data |
Data frame |
... |
See |
.by |
(Optional). A selection of columns to group by for this operation. Columns are specified using tidy-select. |
keep_order |
Should the sliced data frame be returned in its original order?
The default is |
sort_groups |
If |
n |
Number of rows. |
prop |
Proportion of rows. |
order_by |
Variables to order by. |
with_ties |
Should ties be kept together? The default is |
na_rm |
Should missing values in |
replace |
Should |
weights |
Probability weights used in |
seed |
Seed number defining the RNG state.
If supplied, the seed is applied only locally within the function
and is not retained after sampling.
In other words, whatever seed state was in place before the function call
is restored afterwards, which ensures seed continuity.
If left |
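The local-seed behaviour described for seed can be illustrated with a small base R sketch. The helper with_local_seed() below is hypothetical and is not part of timeplyr; it only shows the general pattern of setting a seed temporarily and restoring the caller's RNG state on exit, which may differ from how fslice_sample() does it internally.

```r
# Hypothetical helper (not a timeplyr function): evaluate `expr` under a
# temporary seed, then restore whatever RNG state the caller had before.
with_local_seed <- function(seed, expr) {
  has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has_seed) {
    old_seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
  }
  on.exit({
    if (has_seed) {
      # Put the caller's RNG state back
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      # There was no RNG state before the call, so remove the one we created
      rm(".Random.seed", envir = globalenv())
    }
  }, add = TRUE)
  set.seed(seed)
  expr  # lazily evaluated here, i.e. after set.seed()
}

set.seed(1)
expected <- runif(1)  # the draw the caller gets directly after set.seed(1)

set.seed(1)
a <- with_local_seed(42, sample(10))
b <- with_local_seed(42, sample(10))
after <- runif(1)     # the caller's stream continues as if nothing happened

identical(a, b)            # same local seed, same sample
identical(after, expected) # caller's RNG state was not disturbed
```

Both comparisons evaluate to TRUE: the sampling is reproducible under the local seed, and the surrounding random number stream is unaffected.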
fslice() and friends allow for more flexibility in how you order the by-group slicing.
Furthermore, you can control whether the returned data frame is sliced in
the order of the supplied row indices, or whether the
original order is retained (like dplyr::filter()).
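The two orderings can be shown with a minimal base R sketch. It does not call fslice(); it only illustrates the difference between returning rows in the order of the supplied indices and returning them in their original order, which is the choice that keep_order controls. The data frame df and index vector i are made up for illustration.

```r
# Illustrative data (not from the package)
df <- data.frame(id = 1:5, x = c("a", "b", "c", "d", "e"))
i <- c(4, 2, 5)

# Rows returned in the order the indices were supplied
in_index_order <- df[i, , drop = FALSE]

# Rows returned in their original order, as dplyr::filter() would return them
in_original_order <- df[sort(i), , drop = FALSE]

in_index_order$id     # 4 2 5
in_original_order$id  # 2 4 5
```

Per the usage above, keep_order defaults to FALSE, so the index order is what you get unless you ask otherwise.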
In fslice(), when length(n) == 1, an optimised method is implemented
that internally uses list_subset(), a fast function for extracting
single elements from single-level lists that contain vectors of the same
type, e.g. integer.
fslice_head() and fslice_tail() are very fast with large numbers of groups.
fslice_sample() is arguably more intuitive because, by default, it
resamples each entire group without replacement. There is no need to specify a
maximum group size, as there is in dplyr::slice_sample().
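As a rough illustration of what "resampling each entire group without replacement" means, the base R sketch below permutes the rows within each group. It is not the timeplyr implementation and makes no claim about the row or group order that fslice_sample() returns; the data frame df is made up for illustration.

```r
set.seed(123)

# Illustrative data (not from the package): two groups of unequal size
df <- data.frame(g = c("a", "a", "a", "b", "b"), x = 1:5)

# Row positions belonging to each group
rows_by_group <- split(seq_len(nrow(df)), df$g)

# Permute each group's rows; sample.int(length(i)) is used rather than
# sample(i) so that single-row groups are handled correctly
shuffled <- lapply(rows_by_group, function(i) i[sample.int(length(i))])

resampled <- df[unlist(shuffled, use.names = FALSE), , drop = FALSE]

# Every row appears exactly once and group sizes are unchanged
nrow(resampled) == nrow(df)
table(resampled$g)
```

Because each group is permuted in full, no group size has to be supplied up front.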
A data.frame of specified rows.
library(timeplyr)
library(dplyr)
library(nycflights13)

flights <- flights %>%
  group_by(origin, dest)

# First row repeated for each group
flights %>%
  fslice(1, 1)

# First row per group
flights %>%
  fslice_head(n = 1)

# Last row per group
flights %>%
  fslice_tail(n = 1)

# Earliest flight per group
flights %>%
  fslice_min(time_hour, with_ties = FALSE)

# Last flight per group
flights %>%
  fslice_max(time_hour, with_ties = FALSE)

# Random sample without replacement by group
# (or stratified random sampling)
flights %>%
  fslice_sample()