summarise_sub | R Documentation |
When working with data frames that have data frames nested within them (see tibble-package or nest), one will sometimes want to extract summary statistics or other key information from the nested data frames and move them to columns at the top level. This function applies summary functions to the nested data frames and returns the results as columns of the higher-level data frame.
summarise_sub(df, data_col_name, ..., handle_nulls = FALSE, scoped_in = TRUE)
df |
A data frame |
data_col_name |
The column name of the nested data frames, bare or as a string. |
... |
the name-value pairs of summary functions (see |
handle_nulls |
A boolean indicating whether rows with NULL values for the nested column should throw an error (FALSE, the default) or return NAs (TRUE). |
scoped_in |
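A boolean indicating whether the summary functions are scoped within the nested data frames alone (TRUE, the default) or can also see the higher-level data frame (FALSE), in which case a function such as n() counts the rows of the higher-level data frame. |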
A data frame / tibble
library(dplyr)  # provides %>% and as_tibble()

# Nest every column except `cyl` into a list-column called `data`.
# With tidyr >= 1.0 the preferred spelling is tidyr::nest(data = -cyl).
d <- mtcars %>%
  dplyr::mutate(Name = row.names(mtcars)) %>%
  as_tibble() %>%
  tidyr::nest(-cyl)

d %>% summarise_sub(data, mean_mpg = mean(mpg), sd_hp = sd(hp), n = n())

# If we set `scoped_in` to `FALSE`, `n()` returns the number of rows of the
# higher-level data frame instead of the nested ones. This could be useful in
# some circumstances, I just can't think of any.
d %>% summarise_sub(data, n = n(), scoped_in = FALSE)

# If there's a NULL value in the nested column, by default it will throw an error.
# If `handle_nulls` is `TRUE`, then rows with NULL values will return NAs.
d[2,]$data <- list(NULL)

## Not run: 
d %>% summarise_sub(data, mean_mpg = mean(mpg), n = n())
## End(Not run)

d %>% summarise_sub(data, mean_mpg = mean(mpg), n = n(), handle_nulls = TRUE)
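For comparison, the same kind of summary can be written by hand with dplyr and purrr. This is only a rough sketch of the idea described above, not how summarise_sub is implemented; the object names `res`, `d_null` and `res_null` are made up for illustration, and the NULL handling shown is an assumption about what `handle_nulls = TRUE` is meant to achieve.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Nest every column except `cyl` into a list-column named `data`
# (tidyr >= 1.0 syntax).
d <- mtcars %>%
  as_tibble() %>%
  nest(data = -cyl)

# Apply each summary to every nested data frame and store the
# results as ordinary columns of the top-level data frame.
res <- d %>%
  mutate(
    mean_mpg = map_dbl(data, ~ mean(.x$mpg)),
    sd_hp    = map_dbl(data, ~ sd(.x$hp)),
    n        = map_int(data, nrow)
  )

# Hand-built example with a NULL entry in the nested column:
# return NA for that row instead of failing.
d_null <- tibble(
  cyl  = c(4, 6),
  data = list(tibble(mpg = c(21, 23)), NULL)
)

res_null <- d_null %>%
  mutate(
    mean_mpg = map_dbl(data, ~ if (is.null(.x)) NA_real_ else mean(.x$mpg))
  )
```

Each summary needs its own `map_*()` call here, and the NULL check has to be repeated per summary, which is the repetition a helper like summarise_sub is meant to remove.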