Univariate analysis of continuous and categorical variables

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

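The first step in data exploration usually consists of a univariate, descriptive analysis of all variables of interest. Tidycomm offers three basic functions to quickly output relevant statistics: describe() for continuous variables, and describe_cat() and tab_frequencies() for categorical variables.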

library(tidycomm)

For demonstration purposes, we will use sample data from the Worlds of Journalism 2012-16 study included in tidycomm.

WoJ

Describe continuous variables

describe() outputs several measures of central tendency and variability for all variables named in the function call:

WoJ %>%  
  describe(autonomy_selection, autonomy_emphasis, work_experience)
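
To make concrete what "measures of central tendency and variability" means here, the following is a minimal base R sketch that computes a few such measures by hand. It is not tidycomm's implementation: the vector `work_experience`, the helper `summarise_numeric()`, and the result names are hypothetical and may not match the columns describe() prints.

```r
# Hypothetical toy data; not taken from WoJ.
work_experience <- c(2, 5, 5, 10, 18, NA)

# Hand-rolled summary of one numeric vector (an illustration only).
summarise_numeric <- function(x) {
  valid <- x[!is.na(x)]          # drop missing values before summarising
  c(
    N       = length(valid),     # number of valid observations
    Missing = sum(is.na(x)),     # number of missing observations
    M       = mean(valid),       # arithmetic mean
    SD      = sd(valid),         # sample standard deviation
    Min     = min(valid),
    Mdn     = median(valid),
    Max     = max(valid)
  )
}

res <- summarise_numeric(work_experience)
print(res)
```

Calling describe() on a data frame spares you from writing such a helper for every variable.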

If no variables are passed to describe(), all numeric variables in the data are described:

WoJ %>% 
  describe()

Data can be grouped before describing:

WoJ %>%  
  dplyr::group_by(country) %>% 
  describe(autonomy_emphasis, autonomy_selection)

Describe categorical variables

describe_cat() outputs a short summary (number of unique values, mode, N of mode) for all categorical variables named in the function call:

WoJ %>% 
  describe_cat(reach, employment, temp_contract)
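
The three quantities named above can be illustrated with a small base R sketch. This is an assumption-laden illustration rather than tidycomm's code: the vector `employment` and the helper `summarise_categorical()` are hypothetical, and ties for the mode are resolved by simply taking the first value with the highest count, which may differ from what describe_cat() does.

```r
# Hypothetical toy data; not taken from WoJ.
employment <- c("Full-time", "Part-time", "Full-time", "Freelancer", "Full-time")

# Hand-rolled summary of one categorical vector (an illustration only).
summarise_categorical <- function(x) {
  counts <- table(x)             # absolute frequency of each unique value (NA dropped)
  top <- which.max(counts)       # position of the first value with the highest count
  list(
    unique = length(counts),           # number of unique values
    mode   = names(counts)[top],       # most frequent value
    mode_n = as.integer(counts[top])   # how often the mode occurs
  )
}

cat_summary <- summarise_categorical(employment)
str(cat_summary)
```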

If no variables are passed to describe_cat(), all categorical variables (i.e., character and factor variables) in the data are described:

WoJ %>% 
  describe_cat()

Data can be grouped before describing:

WoJ %>% 
  dplyr::group_by(reach) %>% 
  describe_cat(country, employment)

Tabulate frequencies of categorical variables

tab_frequencies() outputs absolute and relative frequencies of all unique values of one or more categorical variables:

WoJ %>%  
  tab_frequencies(employment)

Passing more than one variable will compute relative frequencies based on all combinations of unique values:

WoJ %>%  
  tab_frequencies(employment, country)
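
As a rough illustration of frequencies "based on all combinations", the base R sketch below cross-tabulates two variables and divides every cell by the grand total, so that all shares sum to 1. The data frame `toy` is hypothetical, and the output layout differs from the tibble that tab_frequencies() returns.

```r
# Hypothetical toy data; not taken from WoJ.
toy <- data.frame(
  employment = c("Full-time", "Full-time", "Part-time", "Full-time"),
  country    = c("Austria", "Austria", "Austria", "Denmark")
)

# Absolute frequency of every employment x country combination.
counts <- table(toy)

# Relative frequencies: each cell divided by the total number of rows.
shares <- prop.table(counts)

print(counts)
print(shares)
```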

You can also group your data beforehand, which yields within-group relative frequencies:

WoJ %>% 
  dplyr::group_by(country) %>%  
  tab_frequencies(employment)

(Compare the columns percent, cum_n and cum_percent with the output above.)
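
The difference between overall and within-group relative frequencies can be sketched in base R with the `margin` argument of prop.table(). Again, `toy` is hypothetical data and this is only an analogy to the grouped tab_frequencies() call, not its implementation.

```r
# Hypothetical toy data; not taken from WoJ.
toy <- data.frame(
  employment = c("Full-time", "Full-time", "Part-time", "Full-time"),
  country    = c("Austria", "Austria", "Austria", "Denmark")
)

counts <- table(toy)   # rows: employment, columns: country

# Overall shares: all cells together sum to 1.
overall <- prop.table(counts)

# Within-country shares: each country column sums to 1 on its own.
within_country <- prop.table(counts, margin = 2)

print(overall)
print(within_country)
```

In the within-group version, the shares are computed against each country's own total rather than against the total number of rows.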



tidycomm documentation built on July 6, 2021, 5:07 p.m.