aggr_neat: Aggregation, descriptives

View source: R/aggr_neat.R

aggr_neat    R Documentation

Aggregation, descriptives

Description

Returns aggregated values per group for a given variable. Calls to this function can also be passed as arguments to the table_neat function.

Usage

aggr_neat(
  dat,
  values,
  method = mean,
  group_by = NULL,
  filt = NULL,
  sep = "_",
  prefix = NULL,
  new_name = NULL,
  round_to = 2
)

Arguments

dat

A data frame (or a data.table, or the name of either as string).

values

The vector of numbers from which the statistics are to be calculated, or the name of the column in the dat data frame that contains the vector.

method

Function or string. If a function, it is applied to the values to calculate the returned statistic (e.g. means, as per default, using the mean function). Such a function may return a vector of results as well; see Examples. If a string, one of two internal functions will be used. If the string ends with "+sd", e.g., "mean+sd", the function preceding the "+" sign will be calculated along with the standard deviation, displayed in a single column, rounded as set in the round_to argument. (This is primarily for use in the table_neat function for summary tables.) If the string does not end with "+sd", a ratio for the occurrences of the given elements will be calculated. Multiple elements can be given as a vector of strings. The number of occurrences of these elements will be the numerator (dividend), while the entire column length (i.e., number of all elements) will be the denominator (divisor). For example, if a column contains elements "correct", "incorrect", "tooslow", the ratio of "correct" to all elements (i.e., including elements "correct", "incorrect", and "tooslow") can be written simply as method = "correct". The complementary ratio, of "incorrect" and "tooslow", can be written as method = "incorrect, tooslow". (Hint: use the filt argument to get ratios within subgroups, e.g. to include only "correct" and "incorrect" elements, and calculate their ratio; see Examples.)
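The ratio and "+sd" behaviors described above can be illustrated in base R. This is only a sketch of the arithmetic, not the internal code of aggr_neat: the data, the helper name ratio_of, and the layout of the combined mean and SD string are assumptions made for illustration.

```r
# Base R sketch of the arithmetic described above (not aggr_neat internals).
# 'responses' and 'rts' are hypothetical data; 'ratio_of' is a hypothetical helper.
responses <- c("correct", "incorrect", "tooslow", "correct", "correct")

# Numerator: occurrences of the target element(s);
# denominator: the full column length.
ratio_of <- function(x, targets) {
  sum(x %in% targets) / length(x)
}

ratio_correct <- ratio_of(responses, "correct")                  # 3 of 5
ratio_rest    <- ratio_of(responses, c("incorrect", "tooslow"))  # 2 of 5

# "+sd" idea: a statistic and the SD shown together in one rounded string.
# The exact display layout used by aggr_neat may differ (assumption).
rts <- c(512, 498, 530, 475, 505)
round_to <- 2
mean_sd <- paste0(round(mean(rts), round_to), " (", round(sd(rts), round_to), ")")
```

The two ratios sum to 1 here because every element belongs to exactly one of the two target sets.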

group_by

String, or vector of strings: the name(s) of the column(s) in the dat data frame, containing the vector(s) of factors by which the statistics are grouped.

filt

An expression to filter, by column values, the entire dat data frame before performing the aggregation. The expression should use column names alone; see Examples.
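As a rough base R analogue of filtering before aggregating, the sketch below subsets the rows first and then computes a per-group mean. It only mirrors the idea; aggr_neat itself returns a data.table, and its internal steps are not shown here.

```r
# Base R sketch: filter rows by column values, then aggregate per group.
# This mirrors the idea of 'filt' plus 'group_by'; it is not the package's code.
data("mtcars")

# The filter expression refers to column names directly, as with 'filt'.
filtered <- subset(mtcars, hp < 200 & cyl > 4)

# Mean weight per remaining cylinder group
group_means <- tapply(filtered$wt, filtered$cyl, mean)
print(group_means)
```

Because the filter runs before the aggregation, groups with no remaining rows (here, the 4-cylinder cars) do not appear in the result at all.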

sep

String (underscore "_" by default) for separating group names (and prefix, if given).

prefix

NULL (default) or string. String specifies a prefix for each group type under the group column.

new_name

NULL (default) or string. String specifies new name for the variable to be used as column title. If NULL, the name will be "aggr_value" (or, if used with table_neat, the input variable name is used).

round_to

Number of digits after the decimal point to round to, when using "+sd" in method.

Value

A data.table with the statistics per group, with a single column ("aggr_group") indicating the grouping.

See Also

table_neat to create full tables using multiple variables

Examples

data("mtcars") # load base R example dataset

# overall mean for wt (Weight); the default method is mean
aggr_neat(mtcars, wt)

# rename column
aggr_neat(mtcars, wt, new_name = 'weight')

# grouped by cyl (Number of cylinders)
aggr_neat(mtcars, wt, group_by = 'cyl')

# grouped by cyl and gear
aggr_neat(mtcars, wt, group_by = c('cyl', 'gear'))

# prefix for group names
aggr_neat(mtcars, wt, group_by = 'cyl', prefix = 'cyl')

# filter to only have cyl larger than 4
aggr_neat(mtcars, wt, group_by = 'cyl', filt = cyl > 4)

# filter to only have hp (Gross horsepower) smaller than 200
aggr_neat(mtcars, wt, group_by = 'cyl', filt = hp < 200)

# combine two filters above, and add prefix
aggr_neat(
    mtcars,
    wt,
    group_by = 'cyl',
    filt = (hp < 200 & cyl > 4),
    prefix = 'filtered'
)

# add SD (and round output numbers to 2 decimal places)
aggr_neat(mtcars,
          wt,
          group_by = 'cyl',
          method = 'mean+sd',
          round_to = 2)

# now medians instead of means
aggr_neat(mtcars, wt, group_by = 'cyl', method = median)

# with SD
aggr_neat(mtcars,
          wt,
          group_by = 'cyl',
          method = 'median+sd',
          round_to = 1)

# overall ratio of gear 4 (Number of gears)
aggr_neat(mtcars, gear, method = '4')

# overall ratio of gear 4 and 5
aggr_neat(mtcars, gear, method = '4, 5')

# same ratio calculated for each cyl
aggr_neat(mtcars, gear, group_by = 'cyl', method = '4, 5')

# for each combination of cyl and vs (engine type)
aggr_neat(mtcars,
          gear,
          group_by = c('cyl', 'vs'),
          method = '4, 5')

# ratio of gear 3 among cars with gear 3 or 5 only
aggr_neat(
    mtcars,
    gear,
    group_by = 'cyl',
    method = '3',
    filt = gear %in% c(3, 5)
)


neatStats documentation built on Dec. 8, 2022, 1:13 a.m.