proc_freq: View and return the frequency distribution of a variable.


Description

For continuous variables, the user can optionally discretize the variable into a fixed number of equal-width bins, or into custom bins of the user's choice. This is useful for larger datasets with many unique observed values.

Usage

proc_freq(dat, var, bins = 0)

Arguments

dat

a tbl

var

character string giving the name of the desired variable, or a single number giving the position of the desired variable

bins

if 0, no discretization is performed. If a positive integer, then var is binned into bins equal-width ranges, and the frequency distribution of those ranges is computed. If a numeric vector of length > 1, then var is binned into ranges with cutpoints defined by the unique entries of bins.
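The two binning modes described for bins can be illustrated with base R's cut(). This is only a sketch of the idea, not the package's source: it assumes the binning behaves like cut(), and the variable names (x, equal_width, cutpoints, custom) are hypothetical.

```r
# Illustration only: base R cut(), assumed to mirror the two binning modes.
x <- faithful$eruptions

# A single positive integer: cut() splits the range of x into that many
# equal-width intervals.
equal_width <- cut(x, breaks = 4)

# A numeric vector of length > 1: the unique entries act as cutpoints.
cutpoints <- sort(unique(c(1, 2, 3, 4, 5, 5)))
custom <- cut(x, breaks = cutpoints)

# Frequency of each bin under the two schemes.
table(equal_width)
table(custom, useNA = "ifany")
```

In base R, values that fall outside the supplied cutpoints become NA under cut(). Whether proc_freq treats such values the same way is not stated here.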

Details

R has many one-line solutions for getting the frequency distribution of a variable. This function provides a unified approach that makes use of the efficient data types and computation provided by the dplyr package. As a bonus, it automates discretization, which makes it easy to explore the distribution of a continuous variable with many unique observations. The name is intended to make the function more approachable for SAS users who are not comfortable outside their native habitat.

Value

a tbl containing 3 columns: level gives the unique values or bins, count gives the number of observations in each level, and percent gives the percentage of total observations in each level. proc_freq also automatically sends the frequency distribution to the viewer, using utils::View.
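The shape of the returned table can be sketched with dplyr. The helper freq_sketch below is hypothetical and is not the package's implementation. It assumes var is given as a column name, that percent is on a 0 to 100 scale, and it skips the utils::View call so that it runs non-interactively.

```r
library(dplyr)

# Hypothetical helper: builds a level / count / percent table for one column.
freq_sketch <- function(dat, var) {
  dat %>%
    group_by(level = .data[[var]]) %>%       # one group per unique value
    summarise(count = n(), .groups = "drop") %>%
    mutate(percent = 100 * count / sum(count))  # assumed 0-100 scale
}

out <- freq_sketch(mtcars, "cyl")
out
```

The counts sum to the number of rows in the input, and the percent column sums to 100 under the scale assumed above.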

See Also

Other descriptive: get_top_corrs

Examples

proc_freq(faithful, "eruptions")
proc_freq(faithful, "eruptions", bins = 4)
proc_freq(faithful, "eruptions", bins = c(1, 2, 3, 4, 5))

awstringer/modellingTools documentation built on May 11, 2019, 4:11 p.m.