summary_suggestions: Summary suggestions


View source: R/summary_suggestions.R

Description

Takes a data frame and returns a nested list made up of three lists. The first element holds descriptive statistics for the numeric variables, the second holds summary data for the categorical variables, and the last holds the count and proportion of distinct values in each categorical column. The last element can be used to decide which categorical variables to drop because their proportion of unique values is high relative to the input threshold.
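The threshold check described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, the variable names (`is_categorical`, `unique_prop`, `drop_candidates`), and the choice of a strict `>` comparison are all assumptions made for illustration.

```r
# Hypothetical toy data; not taken from the package documentation.
df <- data.frame(
  id = c("a", "b", "c", "d", "e"),
  group = c("x", "x", "y", "y", "y"),
  value = c(1.5, 2.0, 3.5, 4.0, 5.5),
  stringsAsFactors = FALSE
)

threshold <- 0.8

# Assumption: character and factor columns count as categorical.
is_categorical <- vapply(
  df,
  function(col) is.character(col) || is.factor(col),
  logical(1)
)

# Proportion of distinct values in each categorical column.
unique_prop <- vapply(
  df[is_categorical],
  function(col) length(unique(col)) / length(col),
  numeric(1)
)

# Assumption: a column is flagged when its proportion exceeds the threshold.
drop_candidates <- names(unique_prop)[unique_prop > threshold]

print(unique_prop)
print(drop_candidates)
```

In this toy data, `id` is distinct in every row, so it is the kind of column the threshold is meant to flag, while `group` has only two levels and is kept. Note that `unique()` counts `NA` as a distinct value, which may or may not match how the package treats missing data.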

Usage

summary_suggestions(df, threshold = 0.8)

Arguments

df

The data frame on which the function operates.

threshold

A numeric value that sets the threshold for the proportion of unique values in a categorical column. Defaults to 0.8.

Value

A nested list, as outlined in the Description.

Examples

library(palmerpenguins)
summary_suggestions(penguins)

"summary statistics for numeric variables,
summary statistics for categorical variables,
percentage of unique values for categorical variables,
list of variables with percentage of unique values higher than the threshold"

UBC-MDS/reasyeda documentation built on Feb. 6, 2022, 7 a.m.