Takes a dataframe column you want to group by and returns a hash table. The keys are the unique values of the group-by column, and the values are the row numbers where each key is found. The work is parallelized across the cores of your CPU (by default all but one on UNIX-flavored OSs; Windows currently runs on a single core) and is intended as a direct and much faster replacement for split(df, df$group_by).
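To make the returned structure concrete, here is a minimal serial sketch in base R of the same key-to-row-numbers mapping. It is not the package's implementation: rows_by_key is a hypothetical helper, the toy df is made up for illustration, and an environment stands in for the hash table that hashcol returns.

```r
# Hypothetical helper, base R only; NOT the package's implementation.
rows_by_key <- function(x) {
  # split() groups the row indices 1..length(x) by the values of x
  idx <- split(seq_along(x), x)
  # An environment is base R's built-in hash table; store one entry per key
  list2env(idx, envir = new.env(hash = TRUE))
}

# Toy data, made up for illustration
df <- data.frame(
  group_by = c("a", "b", "a", "c", "b"),
  value = 1:5,
  stringsAsFactors = FALSE
)

h <- rows_by_key(df$group_by)
sort(ls(h))       # the unique keys
h[["a"]]          # row numbers where "a" occurs
df[h[["a"]], ]    # the matching rows, recovered by indexing
```

Storing row numbers rather than copies of the rows keeps the table small; the rows for a key are fetched only when needed by indexing the original dataframe.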
hashcol(X, n.cores = parallel::detectCores() - 1)
X
A dataframe column you want to group by. IE:
n.cores
An integer giving the number of cores to run the process on. The default is one less than the total number of available cores on the CPU for UNIX-flavored OSs; on Windows the only option (currently) is 1.
Checks the OS and chooses the correct package to use for mclapply. The package parallelsugar can be used on Windows (but currently is not), while parallel is used for everything else.
WARNING FOR WINDOWS USERS: not parallelized; only runs lapply instead of mclapply.
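The OS check described above might look roughly like the following sketch. It assumes the dispatch is based on .Platform$OS.type; apply_by_os is a hypothetical name, and the package's actual code may differ.

```r
# Hypothetical helper illustrating the OS check; not the package's actual code.
apply_by_os <- function(X, FUN, n.cores = 1L) {
  if (.Platform$OS.type == "windows") {
    # mclapply relies on forking, which Windows lacks, so run serially
    lapply(X, FUN)
  } else {
    # On UNIX-flavored OSs, fork up to n.cores worker processes
    parallel::mclapply(X, FUN, mc.cores = n.cores)
  }
}

squares <- apply_by_os(1:4, function(i) i^2, n.cores = 1L)
unlist(squares)
```

Both branches return a list in the same order as the input, so the calling code does not need to know which one ran.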
asd <- data.frame(
  id = rep(letters, times = 5)
  , service = sample(
    c('ps1', 'ps2', 'ps3', 'ps4', 'ps5', 'ps6', 'ps7')
    , size = 26 * 5
    , replace = TRUE
  )
  , stringsAsFactors = FALSE
)
h <- hashcol(asd$id, n.cores = 1)
h
hash::keys(h)
hash::values(h)
h[hash::keys(h)[26]] # key value pair
h[[hash::keys(h)[26]]] # value accessor method; same as next line
hash::values(h)[ , 26] # value accessor method; same as previous line