knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(qacr)
Getting summary statistics for a quantitative variable is a very common task in data analysis. Unfortunately, R makes it surprisingly difficult.
The qstats function is an attempt to rectify the situation by making it simple to get any number of descriptive statistics for a numeric variable and to break these statistics down by the levels of one or more categorical variables (groups).
The general format is
qstats(data, variable, grouping variables, statistics, other options)
Note that variable names do not have to be quoted.
By default the sample size, mean, and standard deviation are provided.
Let's take a look at fuel efficiencies for 11,914 automobiles in the cardata data frame.
# simple summary statistics
qstats(cardata, highway_mpg)

# summary statistics by vehicle_size
qstats(cardata, highway_mpg, vehicle_size)

# summary statistics by vehicle_size and drive type
qstats(cardata, highway_mpg, vehicle_size, driven_wheels)
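For comparison, here is a rough base R sketch of the same kind of grouped summary (sample size, mean, and standard deviation). It uses the built-in mtcars data rather than cardata so that it runs without qacr, and it is only an illustration of what a grouped summary computes, not a description of how qstats is implemented.

```r
# Base R sketch: n, mean, and sd of mpg by number of cylinders (built-in mtcars data)
by_cyl <- aggregate(
  mpg ~ cyl,
  data = mtcars,
  FUN = function(x) c(n = length(x), mean = mean(x), sd = sd(x))
)

# aggregate() stores the three statistics in a single matrix column,
# so flatten it into ordinary columns before printing
summary_table <- cbind(by_cyl["cyl"], as.data.frame(by_cyl$mpg))
print(summary_table, digits = 3)
```

The extra flattening step is the kind of friction that a one-line call such as qstats(cardata, highway_mpg, vehicle_size) is meant to remove.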
You can choose which statistics are computed with the stats argument. Pass a single statistic name, or several as a character vector of names.
# single statistic
qstats(cardata, highway_mpg, vehicle_size, stats = "median")

# multiple statistics
qstats(cardata, highway_mpg, vehicle_size, stats = c("median", "min", "max"))
User-defined functions can also be used as statistics. The only requirement is that the function returns a single number.
# custom statistics
p25 <- function(x) quantile(x, probs = .25)
p75 <- function(x) quantile(x, probs = .75)

# calling the built-in and custom statistics
qstats(cardata, highway_mpg, vehicle_size, stats = c("min", "p25", "p75", "max"))
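Before passing a custom function to qstats, it can help to confirm that it really does return a single number. The sketch below does this in base R on a small made-up vector. The helper is_single_number and the statistic iqr_width are hypothetical names introduced here for illustration and are not part of qacr.

```r
# Hypothetical custom statistics for illustration.
# unname() drops the "25%" style label that quantile() attaches to its result.
p25 <- function(x) unname(quantile(x, probs = 0.25))
iqr_width <- function(x) diff(unname(quantile(x, probs = c(0.25, 0.75))))

# Small made-up vector to test against
x <- c(10, 20, 30, 40, 50)

# A usable statistic should return a numeric vector of length one
is_single_number <- function(f, x) {
  out <- f(x)
  is.numeric(out) && length(out) == 1
}

is_single_number(p25, x)        # TRUE
is_single_number(iqr_width, x)  # TRUE
is_single_number(range, x)      # FALSE: range() returns two values
```

A function such as range, which returns both the minimum and the maximum, would need to be split into two separate statistics.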
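Other options include na.rm, which controls whether missing values are removed before the statistics are computed, and digits, which sets the number of decimal places shown.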
qstats(cardata, highway_mpg, vehicle_size, stats=c("n", "mean","median","sd"), na.rm=FALSE, digits=2)
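To see why these two options matter, the base R sketch below shows how a missing value propagates through a mean and how rounding affects the displayed result. It assumes that the na.rm and digits options of qstats behave like their base R namesakes, which is an assumption here rather than something stated above. The vector is made up for illustration.

```r
# Made-up fuel efficiencies with one missing value
x <- c(24, NA, 31, 28)

# Without removing missing values, the mean is itself missing
mean(x)                     # NA

# Removing missing values first gives a usable result
m <- mean(x, na.rm = TRUE)

# Round to two decimal places for display
round(m, digits = 2)        # 27.67
```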