do_summary: Do numerical summaries by groups

View source: R/do_summary.R

Description

Compute numerical summaries by groups using a formula interface. Missing values are removed by default (see na.rm).

Usage

do_summary(
  y,
  data = NULL,
  stat = c("n", "missing", "mean", "trimmed", "sd", "variance", "min", "Q1", "median",
    "Q3", "max", "mad", "IQR", "range", "cv", "se", "skewness", "kurtosis"),
  trim = 0.1,
  type = 3,
  na.rm = TRUE
)

## S3 method for class 'num_summaries'
print(x, ..., digits = NA, format = "f", digits_sk = 2)

Arguments

y

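Formula with the names of the variables to summarize. Variables to summarize go on the left-hand side (or after ~ in a one-sided formula), grouping variables on the right-hand side. See Examples.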

data

data set

stat

(character) Descriptive statistics to compute. Currently supported statistics:

  • "n" - number of non-missing observations,

  • "missing" - number of missing observations,

  • "mean" - arithmetic mean,

  • "sd" - standard deviation,

  • "variance" - variance,

  • "trimmed" - trimmed mean,

  • "min" - minimum value,

  • "Q1" - 1st quartile,

  • "median" - median,

  • "Q3" - 3rd quartile,

  • "max" - maximum value,

  • "mad" - median absolute deviation from the median (see stats::mad() for details),

  • "IQR" - interquartile range,

  • "range" - range,

  • "cv" - coefficient of variation,

  • "se" - standard error of mean,

  • "skewness" - skewness,

  • "kurtosis" - excess kurtosis.
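
As a rough illustration of what several of these statistics mean, the following base-R sketch computes them directly for a single vector. It assumes textbook definitions (for example, cv as sd / mean and se as sd / sqrt(n)); the exact formulas used inside do_summary() are not shown on this page and may differ.

```r
# Base-R sketch of a few statistics from the list above.
# Assumption: textbook definitions, not necessarily the formulas
# that do_summary() uses internally.
x <- c(2, 4, 4, 5, 7, NA)

x_ok <- x[!is.na(x)]                 # drop missing values first

n_obs     <- length(x_ok)            # "n": non-missing observations
n_missing <- sum(is.na(x))           # "missing": missing observations
se_mean   <- sd(x_ok) / sqrt(n_obs)  # "se": standard error of the mean
cv        <- sd(x_ok) / mean(x_ok)   # "cv": coefficient of variation
iqr       <- IQR(x_ok)               # "IQR": Q3 - Q1
mad_x     <- mad(x_ok)               # "mad": scaled median absolute deviation

c(n = n_obs, missing = n_missing, se = se_mean, cv = cv, IQR = iqr, mad = mad_x)
```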

trim

The fraction (0 to 0.5) of observations to be trimmed from each end of the sorted variable before the mean is computed. Values of trim outside that range are taken as the nearest endpoint.
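
The effect of trim can be seen with base R's mean(), whose trim argument is documented in the same terms. This is a sketch that assumes do_summary() computes "trimmed" in the same way; the page does not confirm that.

```r
# Trimmed mean with base R's mean(); assumed to mirror the "trimmed" statistic.
x <- c(1, 2, 3, 4, 100)

plain_mean   <- mean(x)              # ordinary mean, pulled up by the outlier
trimmed_mean <- mean(x, trim = 0.2)  # drops one value from each end
clamped_mean <- mean(x, trim = 0.9)  # trim above 0.5 is treated as 0.5

c(plain = plain_mean, trimmed = trimmed_mean, clamped = clamped_mean)
```

With trim = 0.2 and five observations, one value is dropped from each end, so the result is the mean of 2, 3 and 4. A trim of 0.5 or more reduces to the median.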

type

(integer: 1, 2, 3) The type of skewness and kurtosis estimate. See psych::describe() and psych::mardia() for details.
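
To make the three types concrete, the sketch below writes out the three common skewness estimators in base R. It assumes that type maps to the same estimators as in psych::skew(); skew_sketch() is a hypothetical helper written for illustration and is not part of biostat or psych.

```r
# Hypothetical helper: three common sample skewness estimators.
# Assumption: "type" is numbered as in the psych package.
skew_sketch <- function(x, type = 3) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  # Moment-based estimator, used directly as type 1
  g1 <- sqrt(n) * sum(d^3) / sum(d^2)^(3 / 2)
  switch(type,
    g1,                                # type 1
    g1 * sqrt(n * (n - 1)) / (n - 2),  # type 2: small-sample adjustment
    g1 * ((n - 1) / n)^(3 / 2)         # type 3: uses sd with n - 1 denominator
  )
}

x <- c(1, 1, 1, 10)  # right-skewed toy data
sapply(1:3, function(t) skew_sketch(x, type = t))
```

All three estimators agree in sign and converge as the sample size grows; they differ noticeably only for small samples.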

na.rm

(logical) Flag to remove missing values. Default is TRUE.

x

object to print

...

further arguments to methods.

digits

Number of digits for descriptive statistics.

format

(character) "f", "g", "e", "fg". Either one value or a vector of values for each column. Each value will be passed to fun separately.

  • "f" gives numbers in the usual xxx.xxx format,

  • "e" and "E" give n.ddde+nn or n.dddE+nn (scientific format),

  • "g" and "G" put the number into scientific format only if it saves space to do so,

  • "fg" uses fixed format as "f", but digits as the minimum number of significant digits. This can lead to quite long result strings.

digits_sk

Number of digits for skewness and kurtosis.
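
The codes accepted by the format argument read like those of base R's formatC(). Assuming the print method formats numbers in a comparable way (the page does not say so explicitly), this sketch shows how format and digits interact for a single value:

```r
# Format codes as interpreted by base R's formatC().
# Assumption: print() for 'num_summaries' behaves similarly.
x <- 1234.56789

fmt_f  <- formatC(x, digits = 2, format = "f")   # fixed notation, 2 decimals
fmt_e  <- formatC(x, digits = 2, format = "e")   # scientific notation
fmt_g  <- formatC(x, digits = 2, format = "g")   # 2 significant digits
fmt_fg <- formatC(x, digits = 2, format = "fg")  # fixed, digits = significant digits

c(f = fmt_f, e = fmt_e, g = fmt_g, fg = fmt_fg)
```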

Value

Data frame with summary statistics.

Examples

library(biostat)
library(magrittr) # provides the %>% pipe, in case biostat does not re-export it
data(cabbages, package = "MASS")

do_summary(~VitC, data = cabbages) %>%
  print(digits = 2)

do_summary(VitC ~ Cult, data = cabbages) %>%
  print(digits = 2)

do_summary(VitC ~ Cult + Date, data = cabbages, stat = "mean") %>%
  print(digits = 2)

do_summary(HeadWt + VitC ~ Cult + Date,
  data = cabbages,
  stat = c("n", "mean", "sd")
) %>%
  print(digits = 1)


# TODO:
# 1. First argument should be a data frame
#

GegznaV/BioStat documentation built on Aug. 14, 2020, 9:30 p.m.