qtab: Fast (Weighted) Cross Tabulation

View source: R/qtab.R

qtabR Documentation

Fast (Weighted) Cross Tabulation

Description

A versatile and computationally more efficient replacement for table. Notably, it also supports tabulations with frequency weights, and computation of a statistic over combinations of variables.

Usage

qtab(..., w = NULL, wFUN = NULL, wFUN.args = NULL,
     dnn = "auto", sort = .op[["sort"]], na.exclude = TRUE,
     drop = FALSE, method = "auto")

qtable(...) # Long-form. Use set_collapse(mask = "table") to replace table()

Arguments

...

atomic vectors or factors spanning the table dimensions, (optionally) with tags for the dimension names, or a data frame / list of these. See Examples.

w

a single vector to aggregate over the table dimensions e.g. a vector of frequency weights.

wFUN

a function used to aggregate w over the table dimensions. The default NULL computes the sum of the non-missing weights via an optimized internal algorithm. Fast Statistical Functions also receive vectorized execution.

wFUN.args

a list of (optional) further arguments passed to wFUN. See Examples.

dnn

the names of the table dimensions. Either passed directly as a character vector or list (internally unlist'ed), a function applied to the ... list (e.g. names, or vlabels), or one of the following options:

  • "auto" constructs names based on the ... arguments, or calls names if a single list is passed as input.

  • "namlab" does the same as "auto", but also calls vlabels on the list and appends the names by the variable labels.

dnn = NULL will return a table without dimension names.

sort, na.exclude, drop, method

arguments passed down to qF:

  • sort = FALSE orders table dimensions in first-appearance order of items in the data (can be more efficient if vectors are not factors already). Note that for factors this option will both recast levels in first-appearance order and drop unused levels.

  • na.exclude = FALSE includes NA's in the table (equivalent to table's useNA = "ifany").

  • drop = TRUE removes any unused factor levels (= zero frequency rows or columns).

  • method %in% c("radix", "hash") provides additional control over the algorithm used to convert atomic vectors to factors.

Value

An array of class 'qtab' that inherits from 'table'. Thus all 'table' methods apply to it.

See Also

descr, Summary Statistics, Fast Statistical Functions, Collapse Overview

Examples

## Basic use
qtab(iris$Species)
with(mtcars, qtab(vs, am))
qtab(mtcars[.c(vs, am)])

library(magrittr)
iris %$% qtab(Sepal.Length > mean(Sepal.Length), Species)
iris %$% qtab(AMSL = Sepal.Length > mean(Sepal.Length), Species)

## World after 2015
wlda15 <- wlddev |> fsubset(year >= 2015) |> collap(~ iso3c)

# Regions and income levels (country frequency)
wlda15 %$% qtab(region, income)
wlda15 %$% qtab(region, income, dnn = vlabels)
wlda15 %$% qtab(region, income, dnn = "namlab")

# Population (millions)
wlda15 %$% qtab(region, income, w = POP) |> divide_by(1e6)

# Life expectancy (years)
wlda15 %$% qtab(region, income, w = LIFEEX, wFUN = fmean)

# Life expectancy (years), weighted by population
wlda15 %$% qtab(region, income, w = LIFEEX, wFUN = fmean,
                  wFUN.args = list(w = POP))

# GDP per capita (constant 2010 US$): median
wlda15 %$% qtab(region, income, w = PCGDP, wFUN = fmedian,
                  wFUN.args = list(na.rm = TRUE))

# GDP per capita (constant 2010 US$): median, weighted by population
wlda15 %$% qtab(region, income, w = PCGDP, wFUN = fmedian,
                  wFUN.args = list(w = POP))

# Including OECD membership
tab <- wlda15 %$% qtab(region, income, OECD)
tab

# Various 'table' methods
tab |> addmargins()
tab |> marginSums(margin = c("region", "income"))
tab |> proportions()
tab |> proportions(margin = "income")
as.data.frame(tab) |> head(10)
ftable(tab, row.vars = c("region", "OECD"))

# Other options
tab |> fsum(TRA = "%")    # Percentage table (on a matrix use fsum.default)
tab %/=% (sum(tab)/100)    # Another way (division by reference, preserves integers)
tab

rm(tab, wlda15)

collapse documentation built on Nov. 3, 2024, 9:08 a.m.