tally: Tabulate categorical data


tally R Documentation

Tabulate categorical data

Description

Tabulate categorical data

Usage

tally(x, ...)

## S3 method for class 'tbl'
mosaic_tally(x, wt, sort = FALSE, ..., envir = parent.frame())

## S3 method for class 'data.frame'
mosaic_tally(x, wt, sort = FALSE, ..., envir = parent.frame())

## S3 method for class 'formula'
mosaic_tally(
  x,
  data = parent.frame(),
  format = c("count", "proportion", "percent", "data.frame", "sparse", "default"),
  margins = FALSE,
  quiet = TRUE,
  subset,
  groups = NULL,
  useNA = "ifany",
  groups.first = FALSE,
  ...
)

Arguments

x

an object

...

additional arguments passed to table()

wt

for weighted tallying; see dplyr::tally()

sort

a logical; see dplyr::tally()

envir

an environment in which to evaluate

data

a data frame or environment in which evaluation occurs. Note that the default is data = parent.frame(). This makes it convenient to use this function interactively by treating the working environment as if it were a data frame, but it may not be appropriate for programming uses. When programming, it is best to supply an explicit data argument, ideally a data frame that contains the variables mentioned in the formula.

format

a character string describing the desired format of the results. One of 'count', 'proportion', 'percent', 'data.frame', 'sparse', or 'default'. With 'default', counts are used unless there is a condition, in which case proportions are used instead. Note that prior to version 0.9.3, 'default' was the default; now it is 'count'. 'data.frame' converts the table to a data frame with one row per cell; 'sparse' additionally removes any rows with 0 counts.

margins

a logical indicating whether marginal distributions should be displayed.

quiet

a logical indicating whether messages about order in which marginal distributions are calculated should be suppressed. See stats::addmargins().

subset

an expression evaluating to a logical vector used to select a subset of data

groups

used to specify a condition as an alternative to using a formula with a condition.

useNA

as in table(), but the default here is "ifany".

groups.first

a logical indicating whether groups should be inserted ahead of the condition (else after).
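Since useNA is described above as behaving as in table(), the three settings can be sketched with base R alone. The vectors below are made-up toy data, not part of the package; the point is only to show when an NA cell appears in the result.

```r
# Toy vectors (hypothetical data): one with a missing value, one without.
link <- c("yes", "no", NA, "yes")
complete <- c("a", "b", "a")

with_na_ifany  <- table(link, useNA = "ifany")      # NA cell shown: an NA is present
no_na_ifany    <- table(complete, useNA = "ifany")  # no NA cell: nothing is missing
no_na_always   <- table(complete, useNA = "always") # NA cell forced, with count 0
with_na_no     <- table(link, useNA = "no")         # NA excluded from the tabulation

print(with_na_ifany)
print(no_na_always)
```

With "ifany" as the default, missing values are reported only when they occur, which is why the examples use "always" to force the NA cell and "no" to suppress it.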

Details

The dplyr package also exports a dplyr::tally() function. If x inherits from class "tbl" or "data.frame", then dplyr::tally() is called. This makes it easier to have the two packages coexist.

Otherwise, tally() is designed as an alternative to table() and xtabs(). The primary use case is to describe a (possibly multi-dimensional) table using a formula. For a table of counts, each component of the formula becomes one of the dimensions of the cross table. For tables of proportions or percents, conditional proportions and percents are computed, conditioned on each level of all "secondary" (i.e., conditioning) variables. These are defined as everything other than the left hand side, if the formula has a left hand side, and everything except the right hand side if the left hand side of the formula is empty. Note that groups is folded into the formula prior to this determination and becomes part of the conditioning.
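The conditioning rule can be illustrated with a base R sketch. The data frame df is made-up toy data, and prop.table() is used here only as a stand-in to show what "proportions within each level of the conditioning variable" means; it is not a claim about how tally() is implemented.

```r
# Toy data (hypothetical), loosely mirroring the variables used in the examples.
df <- data.frame(
  substance = c("alcohol", "alcohol", "cocaine", "heroin", "cocaine", "alcohol"),
  sex       = c("female", "male", "male", "female", "male", "male")
)

# Cross table of counts: one dimension per variable.
counts <- table(df$substance, df$sex, dnn = c("substance", "sex"))

# Proportions of substance within each level of sex (each column sums to 1),
# the kind of result described for a formula such as ~ substance | sex.
cond_props <- prop.table(counts, margin = 2)
print(cond_props)

# If mosaicCore is installed, the corresponding formula call would be:
if (requireNamespace("mosaicCore", quietly = TRUE)) {
  print(mosaicCore::tally(~ substance | sex, data = df, format = "proportion"))
}
```

Here sex plays the role of the conditioning variable, so the proportions are computed separately for each sex rather than over the whole table.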

When marginal totals are added, they are added for all of the conditioning dimensions, and proportions should sum to 1 for each level of the conditioning variables. This can be useful to make it clear which conditional proportions are being computed.
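The check described above, that proportions sum to 1 within each level of the conditioning variables, can be sketched in base R with stats::addmargins(). The data are made up, and this is an illustration under that assumption rather than the function's internal code.

```r
# Toy counts (hypothetical data); named arguments become the dimension names.
counts <- table(
  substance = c("alcohol", "alcohol", "cocaine", "heroin", "cocaine", "alcohol"),
  sex       = c("female", "male", "male", "female", "male", "male")
)

# Condition on sex: proportions of substance within each sex.
cond_props <- prop.table(counts, margin = 2)

# Add a total over the substance dimension only; the "Sum" row shows
# that the proportions add up to 1 within each level of sex.
with_totals <- addmargins(cond_props, margin = 1)
print(with_totals)
```

Reading the total row is a quick way to confirm which variable was conditioned on: the totals equal 1 along the conditioning variable's levels.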

See the examples for some typical use cases.

Value

An object of class "table", unless passing through to dplyr or converted to a data frame because format is "data.frame" or "sparse".

Note

When format = "sparse", the current implementation first creates the full data frame and then removes the unneeded rows. So the savings are in space, not time.
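The two-step approach described in this note can be sketched in base R as follows. The data are made up and the variable names full and sparse are illustrative only; the sketch shows the general pattern (build every cell, then filter), not the package's actual code.

```r
# Toy counts (hypothetical data) with one empty cell: (cocaine, female).
counts <- table(
  substance = c("alcohol", "alcohol", "cocaine"),
  sex       = c("female", "male", "male")
)

# Step 1: one row per cell, including cells with a count of 0.
full <- as.data.frame(counts)

# Step 2: drop the empty cells afterwards.
sparse <- full[full$Freq > 0, ]

print(full)
print(sparse)
```

Because every cell is materialized in the first step, the work done grows with the full table size even when most cells are empty; only the stored result is smaller.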

Examples

if (require(mosaicData)) {
tally( ~ substance, data = HELPrct)
tally( ~ substance + sex , data = HELPrct)
tally( sex ~ substance, data = HELPrct)   # equivalent to tally( ~ sex | substance, ... )
tally( ~ substance | sex , data = HELPrct)
tally( ~ substance | sex , data = HELPrct, format = 'count', margins = TRUE)
tally( ~ substance + sex , data = HELPrct, format = 'percent', margins = TRUE)
tally( ~ substance | sex , data = HELPrct, format = 'percent', margins = TRUE)
# force NAs to show up
tally( ~ sex, data = HELPrct, useNA = "always")
# show NAs if any are there
tally( ~ link, data = HELPrct)
# ignore the NAs
tally( ~ link, data = HELPrct, useNA = "no")
}

mosaicCore documentation built on Nov. 5, 2023, 9:06 a.m.