summarizeNumerics: Extracts numeric variables and presents an summary in a...

View source: R/summarize.R

summarizeNumericsR Documentation

Extracts numeric variables and presents an summary in a workable format.

Description

Finds the numeric variables, and ignores the others. (See summarizeFactors for a function that handles non-numeric variables). It will provide quantiles (specified probs as well as other summary statistics, specified stats. Results are returned in a data frame. The main benefits from this compared to R's default summary are 1) more summary information is returned for each variable (dispersion), 2) the results are returned in a form that is easy to use in further analysis, 3) the variables in the output may be alphabetized.

Usage

summarizeNumerics(
  dat,
  alphaSort = FALSE,
  probs = c(0, 0.5, 1),
  stats = c("mean", "sd", "skewness", "kurtosis", "nobs", "nmiss"),
  na.rm = TRUE,
  unbiased = TRUE,
  digits = 2
)

Arguments

dat

a data frame or a matrix

alphaSort

If TRUE, the columns are re-organized in alphabetical order. If FALSE, they are presented in the original order.

probs

Controls calculation of quantiles (see the R quantile function's probs argument). If FALSE or NULL, no quantile estimates are provided. Default is c("min" = 0, "med" = 0.5, "max" = 1.0), which will appear in output as c("min", "med", "max"). Other values between 0 and 1 are allowed. For example, c(0.3, 0.7) will appear in output as pctile_30% and pctile_70%.

stats

A vector including any of these: c("min", "med", "max", "mean", "sd", "var", "skewness", "kurtosis", "nobs", "nmiss"). Default includes all except var. "nobs" means number of observations with non-missing, finite scores (not NA, NaN, -Inf, or Inf). "nmiss" is the number of cases with values of NA. If FALSE or NULL, provide none of these.

na.rm

default TRUE. Should missing data be removed to calculate summaries?

unbiased

If TRUE (default), skewness and kurtosis are calculated with biased corrected (N-1) divisor in the standard deviation.

digits

Number of digits reported after decimal point. Default is 2

Value

a data.frame with one column per summary element (rows are the variables).

Author(s)

Paul E. Johnson pauljohn@ku.edu

See Also

summarize and summarizeFactors


rockchalk documentation built on Aug. 6, 2022, 5:05 p.m.