# Univariate analysis of continuous and categorical variables

```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

The first step in data exploration usually consists of univariate, descriptive analysis of all variables of interest. Tidycomm offers three basic functions to quickly output relevant statistics:

- `describe()` for descriptive statistics of continuous variables
- `describe_cat()` for short summaries of categorical variables
- `tab_frequencies()` for frequency tables of categorical variables

```r
library(tidycomm)
```

For demonstration purposes, we will use sample data from the Worlds of Journalism 2012-16 study included in tidycomm.

```r
WoJ
```

## Describe continuous variables

`describe()` outputs several measures of central tendency and variability for all variables named in the function call:

```r
WoJ %>%
  describe(autonomy_selection, autonomy_emphasis, work_experience)
```
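To make this kind of output easier to read, the following base R sketch computes a few such statistics by hand. The vector `x` is made-up data, not taken from `WoJ`, and the selection and naming of statistics are assumptions for illustration; this is not how `describe()` is implemented.

```r
# Hypothetical ratings on a 1-5 scale, with one missing value
x <- c(4, 5, 3, 4, NA, 2, 5, 4)

stats <- c(
  N       = sum(!is.na(x)),          # number of valid cases
  Missing = sum(is.na(x)),           # number of missing cases
  M       = mean(x, na.rm = TRUE),   # arithmetic mean
  SD      = sd(x, na.rm = TRUE),     # sample standard deviation
  Min     = min(x, na.rm = TRUE),
  Mdn     = median(x, na.rm = TRUE), # median
  Max     = max(x, na.rm = TRUE)
)

round(stats, 2)
```

Missing values are dropped explicitly with `na.rm = TRUE` here; otherwise every statistic except the two counts would be `NA`.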

If no variables are passed to `describe()`, all numeric variables in the data are described:

```r
WoJ %>%
  describe()
```

Data can be grouped before describing:

```r
WoJ %>%
  dplyr::group_by(country) %>%
  describe(autonomy_emphasis, autonomy_selection)
```
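Conceptually, grouping means that each statistic is computed once per group rather than once for the whole data set. A minimal base R sketch of that idea, using a hypothetical data frame `toy` with invented values (not the `WoJ` data), could look like this:

```r
# Hypothetical data: two countries, one numeric variable
toy <- data.frame(
  country  = c("A", "A", "A", "B", "B"),
  autonomy = c(4, 5, 3, 2, 4)
)

# tapply() applies a function separately within each level of `country`
group_means <- tapply(toy$autonomy, toy$country, mean)
group_sds   <- tapply(toy$autonomy, toy$country, sd)

grouped <- data.frame(
  country = names(group_means),
  M       = as.vector(group_means),
  SD      = as.vector(group_sds)
)
grouped
```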

## Describe categorical variables

`describe_cat()` outputs a short summary (number of unique values, mode, N of mode) for each categorical variable named in the function call:

```r
WoJ %>%
  describe_cat(reach, employment, temp_contract)
```
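The three quantities named above can be illustrated with a small base R sketch. The vector `x` and the list element names are hypothetical and chosen for illustration only; the sketch shows what the summary means, not how `describe_cat()` computes it.

```r
# Hypothetical categorical variable
x <- c("Full-time", "Part-time", "Full-time", "Freelancer", "Full-time")

counts <- table(x)  # absolute frequency of each unique value

summary_cat <- list(
  n_unique = length(counts),                 # number of unique values
  mode     = names(counts)[which.max(counts)], # most frequent value
  n_mode   = as.integer(max(counts))         # how often the mode occurs
)
summary_cat
```

Note that `which.max()` returns only the first maximum, so in the case of ties this sketch reports a single mode.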

If no variables are passed to `describe_cat()`, all categorical variables (i.e., `character` and `factor` variables) in the data are described:

```r
WoJ %>%
  describe_cat()
```

Data can be grouped before describing:

```r
WoJ %>%
  dplyr::group_by(reach) %>%
  describe_cat(country, employment)
```

## Tabulate frequencies of categorical variables

`tab_frequencies()` outputs absolute and relative frequencies of all unique values of one or more categorical variables:

```r
WoJ %>%
  tab_frequencies(employment)
```
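As a rough illustration of what absolute, relative, and cumulative frequencies are, here is a base R sketch on made-up data. The vector `employment` is hypothetical, and expressing relative frequencies as proportions between 0 and 1 is an assumption of this sketch.

```r
# Hypothetical categorical variable (not the WoJ data)
employment <- c("Full-time", "Part-time", "Full-time", "Freelancer", "Full-time")

counts <- table(employment)  # levels are sorted alphabetically

freq <- data.frame(
  employment = names(counts),
  n          = as.vector(counts),               # absolute frequency
  percent    = as.vector(counts) / sum(counts)  # relative frequency (proportion)
)

# Cumulative versions accumulate down the rows of the table
freq$cum_n       <- cumsum(freq$n)
freq$cum_percent <- cumsum(freq$percent)
freq
```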

Passing more than one variable will compute relative frequencies based on all combinations of unique values:

```r
WoJ %>%
  tab_frequencies(employment, country)
```
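In other words, each combination's count is divided by the total number of cases, so the relative frequencies sum to 1 across the whole table. A base R sketch with a hypothetical data frame `toy` (invented values) shows the idea:

```r
# Hypothetical data: two categorical variables
toy <- data.frame(
  employment = c("Full-time", "Full-time", "Part-time", "Full-time"),
  country    = c("A", "B", "A", "A")
)

combo_counts  <- table(toy$employment, toy$country)  # counts per combination
combo_percent <- prop.table(combo_counts)            # each cell / grand total
combo_percent
```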

You can also group your data beforehand, which yields within-group relative frequencies:

```r
WoJ %>%
  dplyr::group_by(country) %>%
  tab_frequencies(employment)
```

(Compare the columns `percent`, `cum_n` and `cum_percent` with the output above.)
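The difference can be sketched in base R: with within-group relative frequencies, each count is divided by the size of its own group instead of by the grand total. The data frame `toy` below is hypothetical and only serves to illustrate the arithmetic.

```r
# Hypothetical data: two categorical variables
toy <- data.frame(
  employment = c("Full-time", "Full-time", "Part-time", "Full-time"),
  country    = c("A", "B", "A", "A")
)

counts <- table(toy$employment, toy$country)

# margin = 2 divides each cell by its column total, i.e. by the country's N
within_country <- prop.table(counts, margin = 2)
within_country

colSums(within_country)  # each country's proportions sum to 1
```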

tidycomm documentation built on July 6, 2021, 5:07 p.m.