descriptives: Compute descriptive statistics on columns of a data frame

descriptives R Documentation

Compute descriptive statistics on columns of a data frame

Description

Any number of functions can be supplied for evaluation, along with the data types that each set of functions is applied to. A default set of functions is also provided (see "Details").

Usage

descriptives(
    data,
    f_all = NULL,
    f_numeric = NULL,
    numeric_types = "numeric",
    f_categorical = NULL,
    categorical_types = "factor",
    f_other = NULL,
    useNA = c("ifany", "no", "always"),
    round = 2,
    na_string = "(missing)"
)

Arguments

data

A data.frame.

f_all

A list of functions to evaluate on all columns.

f_numeric

A list of functions to evaluate on numeric_types columns.

numeric_types

Character vector of data types that should be evaluated by f_numeric.

f_categorical

A list of functions to evaluate on categorical_types columns.

categorical_types

Character vector of data types that should be evaluated by f_categorical.

f_other

A list of functions to evaluate on the remaining columns (those matched by neither numeric_types nor categorical_types).

useNA

See the useNA argument of base::table for details. Defaults to "ifany".

round

Number of digits to which numeric values are rounded. Defaults to 2.

na_string

String used in place of NA names. Defaults to "(missing)".
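
The three useNA options mirror those of base::table. The sketch below uses only base R and two made-up vectors (x and y are illustrative, not part of the package) to show how each option treats missing values. It assumes that descriptives() forwards useNA to table() when counting categorical columns, which the argument description suggests but does not state outright.

```r
# Base R only: how table() treats NA under each useNA setting
x <- factor(c("a", "b", NA, "a"))  # contains one missing value
y <- factor(c("a", "b", "a"))      # no missing values

table(x, useNA = "no")      # NA is dropped from the counts
table(x, useNA = "ifany")   # NA gets its own cell because x contains one
table(y, useNA = "ifany")   # no NA cell, since y has no missing values
table(y, useNA = "always")  # NA cell is shown even though its count is 0
```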

Details

The following fun_key values are available by default for the specified types:

  • ALL: length, missing, available, class, unique

  • Numeric: mean, sd, min, q1, median, q3, max, iqr, range

  • Categorical: count, proportion, percent
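
To check which of these keys were actually evaluated on a given dataset, one option is to list the distinct fun_eval and fun_key pairs in the result. This is a minimal sketch that assumes the cheese and dplyr packages are installed and uses the heart_disease data from the examples; keys is an illustrative name.

```r
library(cheese)   # descriptives() and the heart_disease data
library(dplyr)    # %>% and distinct()

# One row per (column-type group, function) pair that was evaluated
keys <- heart_disease %>%
    descriptives() %>%
    distinct(fun_eval, fun_key)

keys
```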

Value

A tibble::tibble with the following columns:

  • fun_eval: Column types the function was applied to

  • fun_key: Name of function that was evaluated

  • col_ind: Index of the column in the input dataset

  • col_lab: Label of the column

  • val_ind: Index of the value within the function result

  • val_lab: Label of the value, extracted from the names of the result

  • val_dbl: Numeric result

  • val_chr: Non-numeric result

  • val_cbn: Combination of (rounded) numeric and non-numeric values
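
Because the result is in long format (one row per column, function, and value), a single statistic can be pulled out with ordinary dplyr verbs. The sketch below is one possible way to do this, assuming cheese and dplyr are installed; res and means are illustrative names.

```r
library(cheese)   # descriptives() and the heart_disease data
library(dplyr)    # %>%, filter() and select()

res <- heart_disease %>%
    descriptives()

# Keep only the rows for the default "mean" key, with the
# column label and the numeric result
means <- res %>%
    filter(fun_key == "mean") %>%
    select(col_lab, val_dbl)

means
```

The same pattern applies to any other fun_key, or to val_chr and val_cbn when a non-numeric or display-ready value is wanted.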

Author(s)

Alex Zajichek

Examples

library(cheese)    # descriptives() and the heart_disease data
library(magrittr)  # %>% pipe

# Default behavior
heart_disease %>%
    descriptives()

# Treat logical columns as categorical, in addition to factors
heart_disease %>%
    descriptives(
        categorical_types = c("logical", "factor")
    ) %>%
    
    #Extract info from the column
    dplyr::filter(
        col_lab == "BloodSugar"
    ) 

# Treat no columns as numeric
heart_disease %>%
    descriptives(
        numeric_types = NULL
    )

# Evaluate a custom function (coefficient of variation) on numeric columns
heart_disease %>%
    descriptives(
        f_numeric = 
            list(
                cv = function(x) sd(x, na.rm = TRUE)/mean(x, na.rm = TRUE)
            )
    ) %>%
    
    #Extract info from the custom function
    dplyr::filter(
        fun_key == "cv"
    ) 


cheese documentation built on Jan. 7, 2023, 1:17 a.m.