summary-statistics: Summary Statistics

summary-statistics    R Documentation

Summary Statistics

Description

collapse provides the following functions to efficiently summarize and examine data:

  • qsu, shorthand for quick-summary, is an extremely fast summary command inspired by the (xt)summarize command in the Stata statistical software. It computes up to 7 statistics (nobs, mean, sd, min and max, plus skewness and kurtosis when higher = TRUE) using a numerically stable one-pass method. Statistics can be computed weighted, by groups, and also within and between entities (for multilevel / panel data).

  • qtab, shorthand for quick-table, is a faster and more versatile alternative to table. Notably, it also supports tabulations with frequency weights, as well as computing a statistic over combinations of variables. Objects of class 'qtab' inherit the 'table' class, so 'table' methods apply to them directly.

  • descr computes a concise and detailed description of a data frame, including (sorted) frequency tables for categorical variables and various statistics and quantiles for numeric variables. It is inspired by Hmisc::describe, but about 10x faster.

  • pwcor, pwcov and pwnobs compute (weighted) pairwise correlations, covariances and observation counts on matrices and data frames. Pairwise correlations and covariances can be computed together with observation counts and p-values. The elaborate print method displays all of these statistics in a single correlation table.

  • varying efficiently checks whether data vary at all, optionally within groups (such as panel identifiers). A variable is considered varying if it has at least 2 distinct non-missing values.
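
The summary and tabulation functions above can be tried on data sets that ship with R. The following is a minimal sketch, assuming the collapse package is installed; the choice of mtcars and iris and the object names (s, s7, by_cyl, tab, wtab) are illustrative only and do not come from this page.

```r
library(collapse)

# Quick summary of a numeric vector: N, mean, SD, min and max
s <- qsu(mtcars$mpg)
s

# higher = TRUE adds skewness and kurtosis (7 statistics in total)
s7 <- qsu(mtcars$mpg, higher = TRUE)
s7

# Grouped summary: statistics for mpg and hp within each cylinder class
by_cyl <- qsu(mtcars, by = mpg + hp ~ cyl)
by_cyl

# Cross tabulation of two variables; the result inherits from 'table'
tab <- qtab(cyl = mtcars$cyl, gear = mtcars$gear)
tab

# Weighted tabulation: cell entries are sums of the weights (here: car weight)
wtab <- qtab(cyl = mtcars$cyl, gear = mtcars$gear, w = mtcars$wt)
wtab

# Detailed description of every column of a data frame
descr(iris)
```

Because tab inherits from 'table', functions such as prop.table(tab) or addmargins(tab) should work on it without conversion.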

Table of Functions

Function / S3 Generic   Methods                                                             Description
qsu                     default, matrix, data.frame, grouped_df, pseries, pdata.frame, sf   Fast (grouped, weighted, panel-decomposed) summary statistics
qtab                    No methods, for data frames or vectors                              Fast (weighted) cross tabulation
descr                   default, grouped_df (default method handles most objects)           Detailed statistical description of data frame
pwcor                   No methods, for matrices or data frames                             Pairwise (weighted) correlations
pwcov                   No methods, for matrices or data frames                             Pairwise (weighted) covariances
pwnobs                  No methods, for matrices or data frames                             Pairwise observation counts
varying                 default, matrix, data.frame, pseries, pdata.frame, grouped_df       Fast variation check
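
The pairwise functions and varying listed in the table can be sketched as follows, again assuming collapse is installed. The data frames num and panel, and the missing values placed in hp, are made up for illustration so that the pairwise counts differ and the within-group check has something to detect.

```r
library(collapse)

# Numeric columns with a few missing values in hp (inserted for illustration)
num <- mtcars[c("mpg", "hp", "wt")]
num$hp[1:3] <- NA

n <- pwnobs(num)   # pairwise counts of jointly non-missing observations
r <- pwcor(num)    # pairwise correlations
v <- pwcov(num)    # pairwise covariances
n
r

# Correlations printed together with observation counts and p-values
pwcor(num, N = TRUE, P = TRUE)

# A small hypothetical panel: 'fixed' is constant within id, 'moving' is not
panel <- data.frame(id     = rep(1:3, each = 2),
                    fixed  = rep(c(10, 20, 30), each = 2),
                    moving = c(1, 2, 3, 4, 5, 6))

varying(panel$fixed)                 # TRUE: varies across the whole vector
varying(panel$fixed, g = panel$id)   # FALSE: no variation within any id
varying(panel$moving, g = panel$id)  # TRUE: varies within id
```

Checking variation within a panel identifier in this way is a quick test for whether a variable is time-invariant before using it in a within (fixed effects) transformation.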

See Also

Collapse Overview, Fast Statistical Functions


SebKrantz/collapse documentation built on Dec. 16, 2024, 7:26 p.m.