Partitions the data by the values in the grouping column, applies a generic transform to each group, and then binds the groups back together. Only advised for a moderate number of groups, and works better if the grouping column is an index. This is powerful enough to implement "The Split-Apply-Combine Strategy for Data Analysis" https://www.jstatsoft.org/article/view/v040i01
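As a rough illustration of the split-apply-combine pattern described above, the following base R sketch partitions a local data frame, applies a transform to each piece, and binds the results back together. It is not the package's implementation; the data frame `d` and the helper `add_cumsum` are made up for this example.

```r
# Hypothetical local data; the column names are illustrative only.
d <- data.frame(
  group = c(1, 1, 2, 2, 2),
  values = c(10, 20, 2, 4, 8)
)

# Hypothetical per-group transform: running total of `values`.
add_cumsum <- function(di) {
  di$cv <- cumsum(di$values)
  di
}

pieces <- split(d, d$group)           # split: one data frame per group value
pieces <- lapply(pieces, add_cumsum)  # apply: transform each piece on its own
res <- do.call(rbind, pieces)         # combine: bind the rows back together
rownames(res) <- NULL                 # drop the "group.row" names rbind creates
print(res)
```

Because the transform only ever sees one group at a time, the running total restarts at each new value of `group`.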
df | remote dplyr data item
gcolumn | grouping column
f | transform function or pipeline
... | force later values to be bound by name
ocolumn | ordering column (optional)
decreasing | logical, if TRUE sort in decreasing order by ocolumn
partitionMethod | method to partition the data, one of 'group_by' (depends on f being dplyr compatible), 'split' (only works over local data frames), or 'extract'
bindrows | logical, if TRUE bind the rows back into a data item, else return split list
maxgroups | maximum number of groups to work over (intentionally not enforced if
eagerCompute | logical, if TRUE call compute on split results
restoreGroup | logical, if TRUE restore group column after apply when
tempNameGenerator | temp name generator produced by
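To make the roles of ocolumn, decreasing, and bindrows more concrete, here is a minimal base R sketch for local data frames only. The helper `gapply_sketch` and the transform `running_total` are hypothetical names invented for this illustration, and the semantics shown (rows are sorted inside each group before f is called) are an assumption based on the argument descriptions above, not the package's actual code.

```r
# Hypothetical helper for local data frames; NOT the package implementation.
gapply_sketch <- function(df, gcolumn, f, ocolumn = NULL,
                          decreasing = FALSE, bindrows = TRUE) {
  pieces <- split(df, df[[gcolumn]])
  pieces <- lapply(pieces, function(di) {
    if (!is.null(ocolumn)) {
      # Assumed behavior: order rows within the group before applying f.
      di <- di[order(di[[ocolumn]], decreasing = decreasing), , drop = FALSE]
    }
    f(di)
  })
  if (!bindrows) {
    return(pieces)  # named list with one transformed frame per group
  }
  res <- do.call(rbind, pieces)
  rownames(res) <- NULL
  res
}

d <- data.frame(
  group = c(1, 1, 2, 2, 2),
  order = c(.1, .2, .3, .4, .5),
  values = c(10, 20, 2, 4, 8)
)

# Hypothetical transform: running total in the current row order.
running_total <- function(di) {
  di$cv <- cumsum(di$values)
  di
}

# Bound result, accumulating from the largest `order` value downward.
res <- gapply_sketch(d, "group", running_total,
                     ocolumn = "order", decreasing = TRUE)
# Unbound result: a list keyed by group value.
lst <- gapply_sketch(d, "group", running_total, bindrows = FALSE)
print(res)
print(names(lst))
```

Returning the split list is useful when the per-group results do not share a common set of columns and so cannot be bound into a single frame.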
Note this is a fairly expensive operator, so it only makes sense to use in situations where f itself is fairly complicated and/or expensive.
transformed frame
library("replyr")  # provides gapply()
library("wrapr")   # provides the %.>% dot-arrow pipe

d <- data.frame(
  group = c(1, 1, 2, 2, 2),
  order = c(.1, .2, .3, .4, .5),
  values = c(10, 20, 2, 4, 8)
)

# User supplied window functions. They depend on known column names and
# the data back-end matching function names (as cumsum).
cumulative_sum <- function(d) {
  dplyr::mutate(d, cv = cumsum(values))
}
rank_in_group <- function(d) {
  d %.>%
    dplyr::mutate(., constcol = 1) %.>%
    dplyr::mutate(., rank = cumsum(constcol)) %.>%
    dplyr::select(., -constcol)
}

# Run each example once per partition method.
for (partitionMethod in c('group_by', 'split', 'extract')) {
  print(partitionMethod)
  print('cumulative sum example')
  print(
    gapply(
      d,
      'group',
      cumulative_sum,
      ocolumn = 'order',
      partitionMethod = partitionMethod
    )
  )
  print('ranking example')
  print(
    gapply(
      d,
      'group',
      rank_in_group,
      ocolumn = 'order',
      partitionMethod = partitionMethod
    )
  )
  print('ranking example (decreasing)')
  print(
    gapply(
      d,
      'group',
      rank_in_group,
      ocolumn = 'order',
      decreasing = TRUE,
      partitionMethod = partitionMethod
    )
  )
}