ff_summ_bygroup: Summarize one variable in a dataset, by another categorical...
In FanWangEcon/REconTools: R Tools for Panel Data and Optimization

Description Usage Arguments Value Author(s) References Examples

Generate distributional and other statistics for a particular continuous variable, categorized by some discrete variables. Wage by gender for example.

ff_summ_bygroup(
  df,
  vars.group,
  var.numeric,
  str.stats.group = "main",
  ar.perc = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99),
  str.stats.specify = NULL,
  boo.overall.stats = TRUE
)

`df`	dataframe input dataframe of interest
`vars.group`	list of strings containing grouping variables, could be gender and age groups for example
`var.numeric`	string variable name of continuous quantitative variable to summarize
`str.stats.group`	string what type of statistics to consider see line 31 and below
`ar.perc`	array of percentiles to calculate, only calculated if str.stats.group = 'mainperc'

a list of various variables

df_table_grp_stats - A dataframe where each row is a combination of categories, and columns are categories and statistics
df_row_grp_stats - A single row with all statistics
df_overall_stats - A dataframe with non-grouped overall summaries
df_row_stats_all - A named list of all statistics generated

Fan Wang, http://fanwangecon.github.io

https://fanwangecon.github.io/REconTools/reference/ff_summ_bygroup.html https://fanwangecon.github.io/REconTools/articles/fv_summ_bygroup.html https://github.com/FanWangEcon/REconTools/blob/master/R/ff_summ_bygroup.R

data(mtcars)
df_mtcars <- mtcars
df <- df_mtcars
vars.group <- c('am', 'vs')
var.numeric <- 'mpg'
str.stats.group <- 'allperc'
ar.perc <- c(0.01, 0.05, 0.10, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)
ls_summ_by_group <- ff_summ_bygroup(df, vars.group, var.numeric, str.stats.group, ar.perc)
df_table_grp_stats <- ls_summ_by_group$df_table_grp_stats
df_row_grp_stats <- ls_summ_by_group$df_row_grp_stats
df_overall_stats <- ls_summ_by_group$df_overall_stats
df_row_stats_all <- ls_summ_by_group$df_row_stats_all
print(df_table_grp_stats)
print(df_row_grp_stats)
print(df_overall_stats)
print(df_row_stats_all)