View source: R/profile_table.R
| profile_table | R Documentation |
Calculate pre-determined per-column stats
profile_table(tbl, pivot = TRUE)
| tbl | A data.frame or data.table. |
| pivot | Output in cross-tabular format if |
Useful to check high-level column statistics prior to e.g. creating a table schema.
By default, a data.table with the following fields:
field_name: factor; input field names.
CLASS: chr; the class of the field. If multiple classes, these are
collapsed with a semicolon delimiter.
MAYBE_NUMBER: logi; does the field contain only numbers, such that even
upon coercion of non-numeric fields, no NA values would result?
Always TRUE for numeric (or integer) fields.
FRAC_COMPLETE: numeric; the fraction of rows that are not NA
NCHAR_MAX_LEN: integer; the maximum character length of the field, after
coercing to character.
UNIQUEN: integer; the distinct count of values, excluding NA
INTEGRAL_DUPE_FCTR: integer; the fraction of duplicate values, reported only if
the result of dividing the distinct count of non-NA values by the
row count is an integral value. NA otherwise, i.e. if the
modulo of the calculation is not 0.
Factor columns are treated as character via as.character().
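The field definitions above can be approximated in base R. The following is a minimal sketch, not the package's implementation: `profile_column_sketch()` and `demo_df` are hypothetical names introduced here for illustration, and the treatment of all-NA columns is an assumption. It covers CLASS through UNIQUEN only and leaves out INTEGRAL_DUPE_FCTR.

```r
# Hypothetical helper (not part of the package): approximates the documented
# per-column statistics for a single column using base R only.
profile_column_sketch <- function(x) {
  # CLASS: multiple classes collapsed with a semicolon delimiter.
  cls <- paste(class(x), collapse = ";")

  # Factors are profiled as character, as the documentation notes.
  if (is.factor(x)) x <- as.character(x)

  n <- length(x)
  non_na <- x[!is.na(x)]

  # MAYBE_NUMBER: TRUE for numeric/integer columns; otherwise TRUE only if
  # coercing the non-NA values to numeric introduces no NA.
  # Assumption: an all-NA column yields TRUE here, since nothing fails to coerce.
  maybe_number <- is.numeric(x) ||
    !anyNA(suppressWarnings(as.numeric(as.character(non_na))))

  # NCHAR_MAX_LEN: longest value after coercion to character (NA if no values).
  nchar_max <- if (length(non_na) > 0L) {
    max(nchar(as.character(non_na)))
  } else {
    NA_integer_
  }

  data.frame(
    CLASS = cls,
    MAYBE_NUMBER = maybe_number,
    FRAC_COMPLETE = length(non_na) / n,   # fraction of rows that are not NA
    NCHAR_MAX_LEN = nchar_max,
    UNIQUEN = length(unique(non_na)),     # distinct count, excluding NA
    stringsAsFactors = FALSE
  )
}

# Small illustrative input (hypothetical data).
demo_df <- data.frame(
  id = c("1", "22", NA, "22"),
  grp = factor(c("a", "b", "b", NA)),
  stringsAsFactors = FALSE
)

# One row of statistics per input column; row names are the column names.
profile_sketch <- do.call(rbind, lapply(demo_df, profile_column_sketch))
print(profile_sketch)
```

Binding the per-column results with `rbind` gives one row per field, which is roughly the long layout described above; the real function's output may differ in column types and in how `pivot` reshapes the result.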
set.seed(10)
int_sample <- sample(1:10L, 100, replace = TRUE)
test_df <- data.frame(
  num_col = rnorm(100),
  chr_col = sample(LETTERS, 100, replace = TRUE),
  int_col = int_sample,
  int_as_factor = as.factor(int_sample),
  int_as_chr = as.character(int_sample),
  all_NA_chr = NA_character_,
  posix_ct_t = as.POSIXct(as.Date("2001-01-01")),
  stringsAsFactors = FALSE
)
profile_table(test_df, pivot = TRUE)