recode_missings: Recode pre-defined missing values as NA

View source: R/recode_missings.R

Description

This function is no longer needed; see the Details section.

Usage

recode_missings(ess_data, missing_codes)

recode_numeric_missing(x, missing_codes)

recode_strings_missing(y, missing_codes)

Arguments

ess_data

data frame or tibble with data from the European Social Survey. This data frame should come from import_rounds or import_country, or be read with read_dta or read_spss, because the function identifies missing values using the labelled class.

missing_codes

a character vector with values 'Not applicable', 'Refusal', 'Don't Know', 'No answer' or 'Not available'. By default all values are chosen. Note that the wording is case sensitive.

x

a labelled numeric

y

a character vector

Details

Data from the European Social Survey is always accompanied by a script that recodes the categories 'Not applicable', 'Refusal', 'Don't Know', 'No answer' and 'Not available' to missing. This function recodes these categories to NA.

The European Social Survey now provides these values already recoded in its Stata data files, and read_dta reads these missing categories as missing values. For an example of how these values are coded, see here.
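
As a rough illustration of what "read as missing values" can look like in R, the sketch below uses haven's tagged missing values. It assumes the haven package is installed and that the Stata file stores the categories as extended missing values; the tags "b" and "c" are made up for the example and are not taken from the ESS files.

```r
library(haven)

# Toy numeric vector: two valid answers and two tagged missing values.
# The tags "b" and "c" are illustrative assumptions, not ESS codes.
x <- c(1, 5, tagged_na("b"), tagged_na("c"))

# Tagged values behave like ordinary NA in base R
is.na(x)

# The tag still distinguishes one missing category from another
na_tag(x)

# Summaries skip them in the usual way
mean(x, na.rm = TRUE)
```

Because the tagged values already count as NA, no extra recoding step is needed before computing summaries.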

Old details:

When downloading data directly from the European Social Survey's website, the downloaded .zip file contains a script that recodes some categories as missings in Stata and SPSS formats.

To recode numeric variables, recode_numeric_missing uses the labels provided by the labelled class to delete the labels matched in missing_codes. For character variables, matching is done with the underlying number assigned to each category, namely 6, 7, 8, 9 and 9 for 'Not applicable', 'Refusal', 'Don't Know', 'No answer' and 'Not available'.
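
The label-based recoding described above can be sketched in base R. This is a minimal illustration, not the package's implementation: recode_labelled_sketch is a hypothetical helper, and the toy variable with codes 77, 88 and 99 is invented for the example.

```r
# Hypothetical helper (not part of essurvey): set values whose label is
# listed in `missing_codes` to NA and drop those labels.
recode_labelled_sketch <- function(x, missing_codes) {
  labels <- attr(x, "labels")
  # Labels whose names match the requested categories (case sensitive)
  is_missing_label <- names(labels) %in% missing_codes
  # Replace the matching underlying codes with NA
  x[x %in% labels[is_missing_label]] <- NA
  # Keep only the labels that still describe valid values
  attr(x, "labels") <- labels[!is_missing_label]
  x
}

# Toy variable: 0 to 7 are valid answers, 77/88/99 stand for missing categories
tvtot <- c(0, 3, 7, 77, 88, 2, 99)
attr(tvtot, "labels") <- c("No time at all" = 0, "More than 3 hours" = 7,
                           "Refusal" = 77, "Don't Know" = 88, "No answer" = 99)

recoded <- recode_labelled_sketch(tvtot, c("Refusal", "Don't Know", "No answer"))

attr(recoded, "labels")
mean(recoded, na.rm = TRUE)
```

Matching on label names rather than on fixed numbers means the same call works for variables whose missing codes differ in width, as long as the label wording is identical.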

The functions are a direct translation of the Stata script that comes along when downloading one of the rounds. The Stata script is the same for all rounds and all countries, meaning that these functions work for all rounds.

Value

The same data frame or tibble but with values 'Not applicable', 'Refusal', 'Don't Know', 'No answer' and 'Not available' recoded as NA.

Examples

## Not run: 
seven <- import_rounds(7, your_email)

attr(seven$tvtot, "labels")
mean(seven$tvtot, na.rm = TRUE)

names(table(seven$lnghom1))
# First three are actually missing values

seven_recoded <- recode_missings(seven)

attr(seven_recoded$tvtot, "labels")
# All missings have been removed
mean(seven_recoded$tvtot, na.rm = TRUE)

names(table(seven_recoded$lnghom1))
# All missings have been removed

# If you want to operate on specific variables,
# you can use the other recode_*_missing functions

seven$tvtot <- recode_numeric_missing(seven$tvtot)

# Recode only 'Don't Know' and 'No answer' to missing
# (the wording of missing_codes is case sensitive)
seven$tvpol <- recode_numeric_missing(seven$tvpol, c("Don't Know", "No answer"))


# The same can be done with recode_strings_missing

## End(Not run)

ropensci/essurvey documentation built on Jan. 10, 2022, 3:20 p.m.