This is a collection of convenience functions for data handling, plotting, and analysis within the tidyverse syntax. Most of these functions are wrappers for tasks I perform regularly (e.g., describing variables, adding mean indices) and wanted to integrate into %>% pipes with just one line.
This is all very much work in progress.
Install from GitHub:
install.packages("devtools")
devtools::install_github("joon-e/jutilities")
The jutilities functions are meant to be used with the tidyverse. Most functions return a tibble and can thus be easily integrated into %>% pipes.
library(tidyverse)
library(jutilities)
describe() computes several measures of central tendency and variability for all specified variables:
diamonds %>% describe(x, y, z)
If no variables are specified, all numeric variables are described:
diamonds %>% describe()
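To give a rough idea of the kind of summary involved, here is a minimal sketch using only dplyr and tidyr. It is not how describe() is implemented; the column names (N, M, SD, Min, Mdn, Max) and the choice of statistics are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

diamonds <- ggplot2::diamonds

# Hypothetical stand-in for describe(x, y, z): one row per variable.
# Column names and the set of statistics are assumptions, not the package's output.
described <- diamonds %>%
  select(x, y, z) %>%
  pivot_longer(everything(), names_to = "Variable", values_to = "value") %>%
  group_by(Variable) %>%
  summarise(
    N   = sum(!is.na(value)),
    M   = mean(value, na.rm = TRUE),
    SD  = sd(value, na.rm = TRUE),
    Min = min(value, na.rm = TRUE),
    Mdn = median(value, na.rm = TRUE),
    Max = max(value, na.rm = TRUE)
  )

described
```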
describe_groups() outputs the same statistics for one variable, grouped by one or more grouping variables.
diamonds %>% describe_groups(price, cut, color)
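For comparison, a grouped summary of this kind can be sketched in plain dplyr as follows. This is only an approximation under assumed column names (N, M, SD), not the output format of describe_groups().

```r
library(dplyr)

diamonds <- ggplot2::diamonds

# Hypothetical equivalent: describe price within each cut x color group.
# The statistics and their names are assumptions for illustration.
grouped <- diamonds %>%
  group_by(cut, color) %>%
  summarise(
    N  = n(),
    M  = mean(price),
    SD = sd(price),
    .groups = "drop"
  )

grouped
```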
cat_ftable() outputs a frequency table including relative, valid, and cumulative frequencies for one categorical variable.
diamonds %>% cat_ftable(cut)
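The three kinds of frequencies can be illustrated with a short dplyr sketch. The column names below are hypothetical, and "valid" is assumed to mean the share among non-missing values; cat_ftable() may define or label these differently.

```r
library(dplyr)

diamonds <- ggplot2::diamonds

# Hypothetical frequency table for cut.
# percent:       share of all rows
# valid_percent: share of rows with a non-missing value (assumed definition)
# cum_percent:   running total of percent
freq <- diamonds %>%
  count(cut, name = "n") %>%
  mutate(
    percent       = n / sum(n),
    valid_percent = if_else(is.na(cut), NA_real_, n / sum(n[!is.na(cut)])),
    cum_percent   = cumsum(percent)
  )

freq
```

Since cut has no missing values in diamonds, percent and valid_percent coincide here.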
cat_xtable() outputs contingency tables for one column variable and one or more row variables:
diamonds %>% cat_xtable(cut, color, clarity)
Setting the argument percentages = TRUE changes the output to relative frequencies:
diamonds %>% cat_xtable(cut, color, percentages = TRUE)
A Chi² test can optionally be computed by setting the argument chisq = TRUE. Test results will be displayed in a console message:
diamonds %>% cat_xtable(cut, color, chisq = TRUE)
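For readers who want to check the numbers independently, the same kind of table and test can be sketched in base R. Whether cat_xtable() computes percentages column-wise, and how it reports the test, are assumptions here; only table(), prop.table(), and chisq.test() from base R are used.

```r
diamonds <- ggplot2::diamonds

# Contingency table with color in rows and cut in columns
xtab <- table(color = diamonds$color, cut = diamonds$cut)

# Relative frequencies within each column (assumed direction of percentages)
col_props <- prop.table(xtab, margin = 2)

# Pearson's Chi-squared test of independence
test <- chisq.test(xtab)

round(col_props, 3)
test
```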
add_index() adds a rowwise mean index column of the specified variables to the dataset. The second argument (first argument if used in a pipe) should be the name of the index column:
diamonds %>% add_index(meanxyz, x, y, z)
Set the argument type = "sum" to create a sum index instead:
diamonds %>% add_index(sumxyz, x, y, z, type = "sum")
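As a point of reference, rowwise mean and sum indices can be built by hand with mutate() and base R's rowMeans() and rowSums(). This sketch is an assumed equivalent, not the package's code; in particular, how add_index() treats missing values is not covered here.

```r
library(dplyr)

diamonds <- ggplot2::diamonds

# Hand-rolled rowwise indices over x, y, z.
# cbind() builds a numeric matrix so rowMeans()/rowSums() work per row.
indexed <- diamonds %>%
  mutate(
    meanxyz = rowMeans(cbind(x, y, z)),
    sumxyz  = rowSums(cbind(x, y, z))
  )

indexed %>% select(x, y, z, meanxyz, sumxyz)
```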
add_label() adds a text label column of a numeric variable to the dataset, for example to be used as labels in plots.
diamonds %>% add_label(labelz, z)
By default, add_label() rounds to two decimal places. You can change this by setting the decimal.places argument:
diamonds %>% add_label(labelz, z, decimal.places = 0)
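A comparable label column can be produced with base R's formatC(), which formats numbers as text with a fixed number of decimals. This is a sketch of the idea only; the exact formatting that add_label() produces (e.g., decimal mark, trailing zeros) is an assumption.

```r
library(dplyr)

diamonds <- ggplot2::diamonds

# Text labels for z with a fixed number of decimal places
labelled <- diamonds %>%
  mutate(
    labelz  = formatC(z, format = "f", digits = 2),  # two decimals
    labelz0 = formatC(z, format = "f", digits = 0)   # no decimals
  )

labelled %>% select(z, labelz, labelz0)
```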
correlate() computes correlation coefficients and p-values for all combinations of the specified variables. By default, Pearson correlation coefficients are computed. Set the argument type = "spearman" to compute Spearman rank correlation coefficients instead.
diamonds %>% correlate(x, y, z)
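For a single pair of variables, the coefficient and p-value can be cross-checked with base R's cor.test(). The sketch below covers only x and y and is not how correlate() is implemented; exact = FALSE is set for the Spearman test because the data contain ties.

```r
diamonds <- ggplot2::diamonds

# Pearson correlation between x and y, with p-value
pearson <- cor.test(diamonds$x, diamonds$y, method = "pearson")

# Spearman rank correlation; exact = FALSE avoids the exact-p-value
# warning that ties would otherwise trigger
spearman <- cor.test(diamonds$x, diamonds$y, method = "spearman", exact = FALSE)

c(pearson = unname(pearson$estimate), spearman = unname(spearman$estimate))
```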
ttest() computes t-tests for one grouping variable and one or more test variables. Output statistics include descriptives, t-values, degrees of freedom, p-values, and Cohen's d:
diamonds %>% filter(cut == "Ideal" | cut == "Fair") %>% droplevels() %>% ttest(cut, price, x, y, z)
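To make the reported statistics concrete, here is a sketch for a single test variable using base R's t.test() and a hand-computed Cohen's d based on the pooled standard deviation. Assuming equal variances (var.equal = TRUE) and the pooled-SD formula for d are choices made for this illustration; ttest() may use different defaults.

```r
library(dplyr)

diamonds <- ggplot2::diamonds

# Keep only two groups and drop the unused factor levels
two_cuts <- diamonds %>%
  filter(cut %in% c("Ideal", "Fair")) %>%
  droplevels()

# Student's t-test (equal variances assumed for this sketch)
test <- t.test(price ~ cut, data = two_cuts, var.equal = TRUE)

# Per-group descriptives, ordered by factor level (Fair, then Ideal)
stats <- two_cuts %>%
  group_by(cut) %>%
  summarise(n = n(), m = mean(price), v = var(price))

# Cohen's d: mean difference divided by the pooled standard deviation
pooled_sd <- sqrt(sum((stats$n - 1) * stats$v) / (sum(stats$n) - 2))
cohens_d  <- (stats$m[1] - stats$m[2]) / pooled_sd

test
cohens_d
```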