slice_top: Subset the rows of the data by top package

View source: R/slice.R

slice_top    R Documentation

Description

This function makes it easy to subset the full temporal data by an aggregate statistic computed over all, or a subset, of the temporal variable. For example, suppose we have the daily download count for each package from 2012 to 2020, but we want to keep only the top n packages, where "top" is determined by total downloads over 2018-2020.
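
The idea can be sketched with plain dplyr verbs. This is only an illustration of the ranking-then-subsetting pattern described above, not the package's implementation: the toy data, the package names pkgA to pkgC, and the choice to keep rows outside the ranking window are all assumptions.

```r
library(dplyr)

# Hypothetical toy data: three packages observed on three dates
downloads <- data.frame(
  date = rep(as.Date(c("2017-06-01", "2019-06-01", "2020-06-01")), times = 3),
  package = rep(c("pkgA", "pkgB", "pkgC"), each = 3),
  n_unique = c(900, 10, 20,   # pkgA: popular early, quiet later
               5, 300, 400,   # pkgB
               1, 200, 100)   # pkgC
)

# Step 1: aggregate the metric over the ranking window only (2018-2020)
top_packages <- downloads %>%
  filter(date >= as.Date("2018-01-01"), date <= as.Date("2020-12-31")) %>%
  group_by(package) %>%
  summarise(total = sum(n_unique), .groups = "drop") %>%
  slice_max(total, n = 2, with_ties = FALSE)

# Step 2: keep every row of the full data that belongs to a top package
# (assumption: rows outside the ranking window are retained)
top_history <- downloads %>%
  semi_join(top_packages, by = "package")
```

Here pkgA has the largest count overall but is excluded, because the ranking uses only the 2018-2020 totals.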

Usage

slice_top(
  .data,
  order_by = "n_unique",
  n,
  prop,
  with_ties = TRUE,
  .fun = sum,
  rank = "package",
  from = Sys.Date() - 365,
  to = Sys.Date()
)

Arguments

.data

A data frame with a date column, whose rows are used to rank a category by some metric over a specified range of dates.

order_by

The name of the column to order the ranking by.

n

The number of top packages to filter the data by.

prop

The proportion of the top packages to filter the data by. Currently not implemented.

with_ties

Whether to include ties or not. Currently not implemented.
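
Since with_ties is not yet implemented here, the following sketch uses dplyr::slice_max() to show what including or excluding ties usually means when selecting the top n. Whether slice_top will follow the same semantics is an assumption, and the data are hypothetical.

```r
library(dplyr)

# Hypothetical totals: pkgB and pkgC are tied for second place
totals <- data.frame(
  package = c("pkgA", "pkgB", "pkgC", "pkgD"),
  total = c(50, 40, 40, 10)
)

# Ties kept: more than n rows can be returned
ties_kept <- slice_max(totals, total, n = 2, with_ties = TRUE)

# Ties dropped: exactly n rows are returned
ties_dropped <- slice_max(totals, total, n = 2, with_ties = FALSE)
```

With ties kept, asking for the top 2 returns three packages, because the two tied for second place are both included.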

Examples

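library(cranscrub)  # assumed to provide slice_top() and ctvExperimentalDesign
library(ggplot2)
library(magrittr)   # provides %>%, in case cranscrub does not re-export it
# Keep the 10 top-ranked packages, then plot each one's series in its own panel
ctvExperimentalDesign %>%
  slice_top(n = 10) %>%
  ggplot(aes(date, n_unique, group = package)) +
  geom_line() +
  facet_grid(package ~ .)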

numbats/cranscrub documentation built on July 1, 2022, 4:34 p.m.