describeBy: Descriptive statistics

describeBy  R Documentation

Descriptive statistics

Description

Descriptive statistics and univariable association tests

Usage

describeBy(
  data,
  var.names,
  var.labels = var.names,
  by1,
  by2 = NULL,
  total = c("top", "bottom", "none"),
  Missing = TRUE,
  test = TRUE,
  digits = 0,
  total.digits = 0,
  p.digits = 3,
  bold_pval = FALSE,
  sig.level = 0.05,
  dispersion = c("sd", "se"),
  stats = c("parametric", "non-parametric"),
  per = "col",
  simulate.p.value = FALSE,
  B = 2000,
  bold_var = TRUE,
  fill = FALSE
)

Arguments

data

data.frame for which descriptive statistics are produced

var.names

variable names of interest in data

var.labels

variable descriptions. Uses var.names by default.

by1

factor to split other variables by in data

by2

optional second factor to split other variables by

total

position of the row showing the total counts of each by1 level: at the "top" (default) or "bottom" of the table. Setting "none" hides the total row.

Missing

logical; if TRUE, shows counts of missing values when any exist

test

logical; if TRUE, univariable tests are performed and a PValue column is added to the end of the table.

digits

number of digits to round descriptive statistics. Supply a single value to round all variables to the same number of digits, or a vector of values to supply different rounding per variable.

total.digits

number of digits to round the total count percentages

p.digits

number of digits to round univariable test p-values to

bold_pval

logical; if TRUE, p-values are bolded if statistically significant at sig.level

sig.level

significance level; default 0.05

dispersion

measure of variability, either "sd" (default) or "se".

stats

type of univariable test performed for continuous variables: either "parametric" (default), which uses the one-way test, or "non-parametric", which uses the Kruskal-Wallis test.

per

print column ("col") or row ("row") percentages. Suppress percentages with "none".

simulate.p.value

logical; passed to chisq.test, where it indicates whether p-values are computed by Monte Carlo simulation. Only relevant for categorical variables.

B

integer; passed to chisq.test, where it gives the number of replicates used in the Monte Carlo test. Only relevant for categorical variables.

bold_var

logical; if TRUE, the Variable names are wrapped in double asterisks, so that they render in bold when the table is parsed by pandoc.

fill

logical; if TRUE, the Variable and PValue columns are repeated for every row they pertain to. If FALSE, each value is only shown when it changes.
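
To make the test-related arguments more concrete, the sketch below runs the base R tests that the descriptions of stats, simulate.p.value and B refer to, on the same mtcars columns used in the Examples. It assumes that the "one-way test" is stats::oneway.test and that the Kruskal-Wallis test is stats::kruskal.test; how describeBy calls them internally is not documented here, so treat this as an illustration rather than the package's implementation.

```r
# Work on a copy so the built-in mtcars is left untouched
cars <- mtcars
cars$cyl <- as.factor(cars$cyl)
cars$vs <- as.character(cars$vs)

# Continuous variable (hp) across the levels of cyl
# Assumption: stats = "parametric" corresponds to a one-way test like this one
p_parametric <- oneway.test(hp ~ cyl, data = cars)$p.value
# Assumption: stats = "non-parametric" corresponds to the Kruskal-Wallis test
p_nonparametric <- kruskal.test(hp ~ cyl, data = cars)$p.value

# Categorical variable (vs) against cyl
# simulate.p.value = TRUE requests a Monte Carlo p-value based on B replicates
set.seed(1)
p_categorical <- chisq.test(table(cars$vs, cars$cyl),
                            simulate.p.value = TRUE, B = 2000)$p.value

# Collect the three p-values, rounded as p.digits = 3 would suggest
round(c(parametric = p_parametric,
        nonparametric = p_nonparametric,
        categorical = p_categorical), 3)
```

Simulating the p-value can be useful when expected cell counts are small, as they are in this cross-tabulation.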

Details

Takes variables from data and returns descriptive statistics split on factor by1.
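
As a rough illustration of what splitting on by1 means for a continuous variable, the following base R sketch computes the mean of hp within each level of cyl, together with both dispersion measures. It assumes that "se" denotes the standard error of the mean, sd / sqrt(n); the object names are hypothetical and this is not the package's own code.

```r
cars <- mtcars
cars$cyl <- as.factor(cars$cyl)

# Group-wise summaries of hp for each level of the splitting factor
n_by_cyl    <- tapply(cars$hp, cars$cyl, length)
mean_by_cyl <- tapply(cars$hp, cars$cyl, mean)
sd_by_cyl   <- tapply(cars$hp, cars$cyl, sd)
# Assumption: "se" is the standard error of the mean
se_by_cyl   <- sd_by_cyl / sqrt(n_by_cyl)

# One row per cyl level, rounded to 0 digits like the default digits = 0
summary_by_cyl <- data.frame(
  cyl  = names(mean_by_cyl),
  n    = as.vector(n_by_cyl),
  mean = round(as.vector(mean_by_cyl), 0),
  sd   = round(as.vector(sd_by_cyl), 0),
  se   = round(as.vector(se_by_cyl), 0)
)
summary_by_cyl
```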

Value

A table with descriptive statistics for continuous and categorical variables and relevant univariable association tests
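
For a categorical variable, the descriptive part of such a table comes down to counts and percentages. The sketch below, written in base R and independent of the package internals, contrasts column and row percentages for vs by cyl. It assumes the by1 levels form the columns, so that column percentages sum to 100 within each cyl level.

```r
cars <- mtcars
cars$cyl <- as.factor(cars$cyl)
cars$vs <- as.character(cars$vs)

# Cross-tabulate the categorical variable (rows) by the splitting factor (columns)
counts <- table(vs = cars$vs, cyl = cars$cyl)

# Column percentages: each cyl column sums to 100 before rounding
col_pct <- round(100 * prop.table(counts, margin = 2), 0)
# Row percentages: each vs row sums to 100 before rounding
row_pct <- round(100 * prop.table(counts, margin = 1), 0)

counts
col_pct
row_pct
```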

Author(s)

Aline Talhouk

Examples

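# Convert the grouping variable to a factor and treat vs as categorical
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$vs <- as.character(mtcars$vs)
# Describe vs (categorical) and hp (continuous) by number of cylinders
Amisc::describeBy(data = mtcars, var.names = c("vs", "hp"), by1 = "cyl",
                  Missing = TRUE, dispersion = "sd", stats = "parametric")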

AlineTalhouk/Amisc documentation built on May 26, 2023, 3:40 p.m.