pick_top_k: Build an optree pipeline that selects up to the top k rows...
In rquery: Relational Query Generator for Data Manipulation at Scale

pick_top_k

R Documentation

Build an optree pipeline that selects up to the top k rows from each group in the given order.

Description

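This is an example of building a pre-prepared pipeline fragment from relop nodes. The fragment ranks the rows of each partitionby group using order_expression (by default row_number(), ordered by the orderby columns, with the reverse columns descending), writes that rank to order_column, and keeps only the rows whose rank is at most k.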

Usage

pick_top_k(
  source,
  ...,
  partitionby = NULL,
  orderby = NULL,
  reverse = NULL,
  k = 1L,
  order_expression = "row_number()",
  order_column = "row_number",
  keep_order_column = TRUE,
  env = parent.frame()
)

Arguments

`source`	relop tree or data.frame source.
`...`	force later arguments to bind by name.
`partitionby`	character, partitioning (window function) column names; these define the groups.
`orderby`	character, ordering (in window function) column names.
`reverse`	character, names of the orderby columns to sort in reverse (descending) order in the window function.
`k`	integer, maximum number of rows to keep from each group.
`order_expression`	character, command to compute row-order/rank.
`order_column`	character, column name to write per-group rank in (no ties).
`keep_order_column`	logical, if TRUE retain the order column in the result.
`env`	environment to look for values in.
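
To make the roles of partitionby, orderby, reverse, and k concrete, the following base-R sketch mimics the selection on an in-memory data.frame. It only illustrates the semantics and is not rquery's implementation: `top_k_by_group` is a hypothetical helper, the data values are invented, and ties are broken by input order rather than by a database.

```r
# Hypothetical helper: base-R imitation of a per-group top-k selection.
# Assumes orderby names at least one column and reverse is a subset of orderby.
top_k_by_group <- function(d, partitionby, orderby, reverse = NULL, k = 1L,
                           order_column = "row_number",
                           keep_order_column = TRUE) {
  # One numeric sort key per orderby column; negate the keys of the
  # reversed columns so that order() sorts them descending.
  keys <- lapply(orderby, function(col) {
    x <- xtfrm(d[[col]])
    if (col %in% reverse) -x else x
  })
  d <- d[do.call(order, keys), , drop = FALSE]
  # Number the sorted rows 1, 2, ... within each group.
  grp <- interaction(d[partitionby], drop = TRUE)
  d[[order_column]] <- ave(seq_len(nrow(d)), grp, FUN = seq_along)
  # Keep at most k rows per group.
  d <- d[d[[order_column]] <= k, , drop = FALSE]
  if (!keep_order_column) {
    d[[order_column]] <- NULL
  }
  rownames(d) <- NULL
  d
}

# Invented example data: two subjects, two survey categories each.
d_local <- data.frame(
  subjectID = c(1, 1, 2, 2),
  surveyCategory = c("withdrawal behavior", "positive re-framing",
                     "withdrawal behavior", "positive re-framing"),
  probability = c(0.77, 0.23, 0.26, 0.74),
  stringsAsFactors = FALSE
)

top <- top_k_by_group(d_local,
                      partitionby = "subjectID",
                      orderby = c("probability", "surveyCategory"),
                      reverse = c("probability", "surveyCategory"))
print(top)
```

With the default k = 1L, the result holds one row per subjectID: the row with the highest probability in that group.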

Examples


library(rquery)

# by-hand logistic regression example
scale <- 0.237
d <- mk_td("survey_table",
           c("subjectID", "surveyCategory", "assessmentTotal"))
optree <- d %.>%
  extend(.,
         probability %:=%
           exp(assessmentTotal * scale)) %.>%
  normalize_cols(.,
                 "probability",
                 partitionby = 'subjectID') %.>%
  pick_top_k(.,
             partitionby = 'subjectID',
             orderby = c('probability', 'surveyCategory'),
             reverse = c('probability', 'surveyCategory')) %.>%
  rename_columns(., 'diagnosis' %:=% 'surveyCategory') %.>%
  select_columns(., c('subjectID',
                      'diagnosis',
                      'probability')) %.>%
  orderby(., 'subjectID')
cat(format(optree))
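
The example above relies on the defaults for k and keep_order_column. As a further sketch, the call below sets both explicitly, using only the arguments listed in Usage. The table description reuses the column names from the example above, and the choice of assessmentTotal as the ordering column is an assumption made for illustration.

```r
library(rquery)

# Table description only; no data is attached.
d <- mk_td("survey_table",
           c("subjectID", "surveyCategory", "assessmentTotal"))

# Keep up to the two highest-assessmentTotal rows per subject and
# drop the helper rank column from the result.
top2 <- pick_top_k(d,
                   partitionby = "subjectID",
                   orderby = "assessmentTotal",
                   reverse = "assessmentTotal",
                   k = 2L,
                   keep_order_column = FALSE)
cat(format(top2))
```

Printing the pipeline shows the relop nodes that pick_top_k assembled, which is a quick way to check the ranking and filtering steps before running them against a database.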

rquery documentation built on Aug. 20, 2023, 9:06 a.m.
