# Univariate analysis of continuous and categorical variables

The first step in data exploration usually consists of univariate, descriptive analysis of all variables of interest. Tidycomm offers three basic functions to quickly output relevant statistics:

- `describe()` for summary statistics of continuous variables
- `describe_cat()` for short summaries of categorical variables
- `tab_frequencies()` for frequency tables of categorical variables

```r
library(tidycomm)
```

For demonstration purposes, we will use sample data from the Worlds of Journalism 2012-16 study included in tidycomm.

```r
WoJ
```

## Describe continuous variables

`describe()` outputs several measures of central tendency and variability for all variables named in the function call:

```r
WoJ %>%
  describe(autonomy_selection, autonomy_emphasis, work_experience)
```
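
To make "measures of central tendency and variability" concrete, the following base R sketch computes a few such statistics by hand on a made-up vector. The vector `scores` and the helper `summarise_numeric()` are hypothetical illustrations rather than part of tidycomm, and the statistics actually reported by `describe()` may differ from the ones chosen here.

```r
# Hypothetical toy data, not taken from WoJ
scores <- c(2, 4, 4, 5, 3, NA, 4)

# Hypothetical helper: a few common univariate statistics,
# computed on the non-missing values only
summarise_numeric <- function(x) {
  valid <- x[!is.na(x)]
  data.frame(
    N       = length(valid),   # number of valid cases
    Missing = sum(is.na(x)),   # number of missing cases
    M       = mean(valid),     # central tendency: mean
    Mdn     = median(valid),   # central tendency: median
    SD      = sd(valid),       # variability: standard deviation
    Min     = min(valid),
    Max     = max(valid)
  )
}

numeric_summary <- summarise_numeric(scores)
print(numeric_summary)
```

Missing values are counted separately here and excluded from the statistics. That is a choice made for this sketch, not a statement about how `describe()` handles them.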

If no variables are passed to `describe()`, all numeric variables in the data are described:

```r
WoJ %>%
  describe()
```

Data can be grouped before describing:

```r
WoJ %>%
  dplyr::group_by(country) %>%
  describe(autonomy_emphasis, autonomy_selection)
```

## Describe categorical variables

`describe_cat()` outputs a short summary (number of unique values, mode, N of mode) for all categorical variables named in the function call:

```r
WoJ %>%
  describe_cat(reach, employment, temp_contract)
```
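
The three quantities named above (number of unique values, mode, N of mode) can be illustrated with a minimal base R sketch. The vector `employment_toy` and the column names in `categorical_summary` are assumptions made for this example and do not reproduce the output format of `describe_cat()`.

```r
# Hypothetical toy data, not taken from WoJ
employment_toy <- c("Full-time", "Part-time", "Full-time",
                    "Freelancer", "Full-time")

# Count how often each unique value occurs, most frequent first
counts <- sort(table(employment_toy), decreasing = TRUE)

categorical_summary <- data.frame(
  N      = length(employment_toy),  # number of cases
  Unique = length(counts),          # number of unique values
  Mode   = names(counts)[1],        # most frequent value
  N_mode = as.integer(counts[1])    # how often the mode occurs
)
print(categorical_summary)
```

If several values tie for the highest count, this sketch simply reports whichever comes first after sorting.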

If no variables are passed to `describe_cat()`, all categorical variables (i.e., `character` and `factor` variables) in the data are described:

```r
WoJ %>%
  describe_cat()
```

Data can be grouped before describing:

```r
WoJ %>%
  dplyr::group_by(reach) %>%
  describe_cat(country, employment)
```

## Tabulate frequencies of categorical variables

`tab_frequencies()` outputs absolute and relative frequencies of all unique values of one or more categorical variables:

```r
WoJ %>%
  tab_frequencies(employment)
```
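
As a rough illustration of how absolute, relative, and cumulative frequencies relate to each other, here is a base R sketch on made-up data. The vector `employment_toy` is hypothetical, the column name `n` is an assumption, and relative frequencies are expressed as proportions between 0 and 1, which may not match the scale used by `tab_frequencies()`.

```r
# Hypothetical toy data, not taken from WoJ
employment_toy <- c("Full-time", "Part-time", "Full-time",
                    "Freelancer", "Full-time")

# Absolute frequencies of each unique value
counts <- table(employment_toy)

freq_table <- data.frame(
  value   = names(counts),
  n       = as.integer(counts),               # absolute frequency
  percent = as.numeric(counts) / sum(counts), # relative frequency
  stringsAsFactors = FALSE
)

# Running totals, in the order the values appear in the table
freq_table$cum_n       <- cumsum(freq_table$n)
freq_table$cum_percent <- cumsum(freq_table$percent)
print(freq_table)
```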

Passing more than one variable will compute relative frequencies based on all combinations of unique values:

```r
WoJ %>%
  tab_frequencies(employment, country)
```

You can also group your data before tabulating, which yields within-group relative frequencies:

```r
WoJ %>%
  dplyr::group_by(country) %>%
  tab_frequencies(employment)
```

(Compare the columns `percent`, `cum_n` and `cum_percent` with the output above.)
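
The difference between relative frequencies over all combinations and within-group relative frequencies can be sketched in base R with `prop.table()`. The data frame `toy` and its values are hypothetical. The sketch only illustrates the two ways of computing shares and does not reproduce the `tab_frequencies()` output.

```r
# Hypothetical toy data, not taken from WoJ
toy <- data.frame(
  country    = c("A", "A", "A", "B", "B"),
  employment = c("Full-time", "Full-time", "Part-time",
                 "Full-time", "Part-time")
)

# Cross-tabulate: rows are countries, columns are employment values
counts <- table(toy$country, toy$employment)

# Share of each country/employment combination among all cases
joint_share <- prop.table(counts)

# Share of each employment value within its country (rows sum to 1)
within_country_share <- prop.table(counts, margin = 1)

print(round(joint_share, 2))
print(round(within_country_share, 2))
```

In the first table all cells together sum to 1, whereas in the second each country's row sums to 1 on its own.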

