The goal of cursory is to make it easier to summarize data and look at
your variables. It builds off dplyr
and
purrr
. It is also compatible with
dbplyr
and remote data.
You can install the released version of cursory from CRAN with:
install.packages("cursory")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("halpo/cursory")
This is a basic example which shows you how to solve a common problem:
library(dplyr)
library(cursory)
data(iris)
## basic summary statistics for each variable in a data frame.
cursory_all(group_by(iris, Species), lst(mean, median)) %>% ungroup()
| Variable | Species | mean | median | | :----------- | :--------- | ----: | -----: | | Sepal.Length | setosa | 5.006 | 5.00 | | Sepal.Length | versicolor | 5.936 | 5.90 | | Sepal.Length | virginica | 6.588 | 6.50 | | Sepal.Width | setosa | 3.428 | 3.40 | | Sepal.Width | versicolor | 2.770 | 2.80 | | Sepal.Width | virginica | 2.974 | 3.00 | | Petal.Length | setosa | 1.462 | 1.50 | | Petal.Length | versicolor | 4.260 | 4.35 | | Petal.Length | virginica | 5.552 | 5.55 | | Petal.Width | setosa | 0.246 | 0.20 | | Petal.Width | versicolor | 1.326 | 1.30 | | Petal.Width | virginica | 2.026 | 2.00 |
## summary statistics for only numeric variables.
cursory_if(iris, is.numeric, lst(Mean = mean, 'Std. Deviation' = sd))
| Variable | Mean | Std. Deviation | | :----------- | -------: | -------------: | | Sepal.Length | 5.843333 | 0.8280661 | | Sepal.Width | 3.057333 | 0.4358663 | | Petal.Length | 3.758000 | 1.7652982 | | Petal.Width | 1.199333 | 0.7622377 |
## summary statistics for specific variables.
cursory_at(iris, vars(ends_with("Length")), var)
| Variable | var | | :----------- | --------: | | Sepal.Length | 0.6856935 | | Petal.Length | 3.1162779 |
table_1
The cursory
package also provides a table_1
function that allows for
describing variables of a dataset for different subsets automatically.
This is useful in creating the very common demographics “table 1”.
table_1(iris, Species)
| Variable | Level | (All) | setosa | versicolor | virginica | | :----------- | :----- | :---- | :----- | :--------- | :-------- | | Sepal.Length | Min | 4.300 | 4.300 | 4.900 | 4.900 | | | Median | 5.800 | 5.000 | 5.900 | 6.500 | | | Mean | 5.843 | 5.006 | 5.936 | 6.588 | | | Max | 7.900 | 5.800 | 7.000 | 7.900 | | | SD | 0.828 | 0.352 | 0.516 | 0.636 | | Sepal.Width | Min | 2.000 | 2.300 | 2.000 | 2.200 | | | Median | 3.000 | 3.400 | 2.800 | 3.000 | | | Mean | 3.057 | 3.428 | 2.770 | 2.974 | | | Max | 4.400 | 4.400 | 3.400 | 3.800 | | | SD | 0.436 | 0.379 | 0.314 | 0.322 | | Petal.Length | Min | 1.000 | 1.000 | 3.000 | 4.500 | | | Median | 4.300 | 1.500 | 4.300 | 5.500 | | | Mean | 3.758 | 1.462 | 4.260 | 5.552 | | | Max | 6.900 | 1.900 | 5.100 | 6.900 | | | SD | 1.765 | 0.174 | 0.470 | 0.552 | | Petal.Width | Min | 0.100 | 0.100 | 1.000 | 1.400 | | | Median | 1.300 | 0.200 | 1.300 | 2.000 | | | Mean | 1.199 | 0.246 | 1.326 | 2.026 | | | Max | 2.500 | 0.600 | 1.800 | 2.500 | | | SD | 0.762 | 0.105 | 0.198 | 0.275 |
The table_1()
function also tags the Variable column as a dontrepeat
class column which make repeating values in columns not appear when
formatted, so that tables are easier to read.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.