View source: R/profile_table.R
profile_table | R Documentation |
Calculate pre-determined per-column stats
profile_table(tbl, pivot = TRUE)
tbl | A data.frame or data.table.
pivot | Output in cross-tabular format if
Useful to check high-level column statistics prior to e.g. creating a table schema.
By default, a data.table with the following fields:
field_name: factor; input field names.
CLASS: chr; the class of the field. If the field has multiple classes, these are collapsed with a semicolon delimiter.
MAYBE_NUMBER: logi; does the field contain only numbers, such that even upon coercion of non-numeric fields, no NA values would result? Always TRUE for numeric (or integer) fields.
FRAC_COMPLETE: numeric; the fraction of rows that are not NA.
NCHAR_MAX_LEN: integer; the maximum character length of the field, after coercing to character.
UNIQUEN: integer; the distinct count of values, excluding NA.
INTEGRAL_DUPE_FCTR: integer; the fraction of duplicate values, only if the result of dividing the distinct count of non-NA values by the row count is an integral value. NA if this is not true, i.e. if the modulo of the calculation != 0.
factor columns are treated as character via as.character().
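The field definitions above can be approximated for a single column in base R. This is only a rough sketch based on the descriptions on this page, not the package's actual implementation: `sketch_column_stats` is a hypothetical helper name, and the handling of all-NA columns is an assumption.

```r
# Hypothetical helper, not part of the package: approximates four of the
# documented per-column statistics using base R only.
sketch_column_stats <- function(x) {
  # Factors are treated as character, as the documentation describes.
  if (is.factor(x)) x <- as.character(x)
  chr <- as.character(x)
  non_na <- !is.na(x)

  # MAYBE_NUMBER: coercing the non-NA values to numeric introduces no NA.
  # Assumption: a column with no non-NA values yields TRUE here.
  maybe_number <- if (is.numeric(x)) {
    TRUE
  } else {
    !anyNA(suppressWarnings(as.numeric(chr[non_na])))
  }

  # NCHAR_MAX_LEN: longest value after coercion to character.
  # Assumption: NA when the column has no non-NA values.
  nchar_max_len <- if (any(non_na)) max(nchar(chr[non_na])) else NA_integer_

  list(
    MAYBE_NUMBER  = maybe_number,
    FRAC_COMPLETE = mean(non_na),               # share of non-NA rows
    NCHAR_MAX_LEN = nchar_max_len,
    UNIQUEN       = length(unique(x[non_na]))   # distinct values, NA excluded
  )
}

# A character column holding only numbers and one NA.
str(sketch_column_stats(c("1", "2", NA, "10")))
```

Applying the helper to each column with `lapply(test_df, sketch_column_stats)` gives a rough cross-check against the corresponding fields of the `profile_table()` output.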
set.seed(10)
int_sample <- sample(1:10L, 100, replace = TRUE)
test_df <- data.frame(
  num_col = rnorm(100),
  chr_col = sample(LETTERS, 100, replace = TRUE),
  int_col = int_sample,
  int_as_factor = as.factor(int_sample),
  int_as_chr = as.character(int_sample),
  all_NA_chr = NA_character_,
  posix_ct_t = as.POSIXct(as.Date("2001-01-01")),
  stringsAsFactors = FALSE
)
profile_table(test_df, pivot = TRUE)