removeFaultyAndUncodableAnswers_And_PrepareForAnalysis: Data Preparation

View source: R/removeFaultyAndUncodableAnswers_And_PrepareForAnalysis.R

removeFaultyAndUncodableAnswers_And_PrepareForAnalysis    R Documentation

Data Preparation

Description

Prepares the data for analysis: the columns 'id', 'ans', and 'code' are appended to the dataset (only these columns are used later), and answers that cannot be used are removed, i.e. answers that still contain non-ASCII characters after preprocessing and answers that are at most one character long. Check during data preparation that nothing important is lost in this step.
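The two removal rules can be illustrated with a minimal base R sketch. It is not the package's implementation: it skips the preprocessing step and applies the rules to raw strings, and the vector answers and the helper names has_non_ascii and too_short are made up for illustration.

```r
# Hypothetical raw answers; "\u00e9" is a non-ASCII character (e with acute accent).
answers <- c("LEITER VERTRIEB", "Caf\u00e9 Mitarbeiter", "x", "", "Aushilfe im Hotel")

# TRUE where any code point of the answer lies outside the ASCII range (0-127).
has_non_ascii <- vapply(
  answers,
  function(s) any(utf8ToInt(s) > 127L),
  logical(1),
  USE.NAMES = FALSE
)

# TRUE where the answer is at most one character long (including empty strings).
too_short <- nchar(answers) <= 1L

# Answers that would survive both rules, and the ones that would be dropped.
kept <- answers[!has_non_ascii & !too_short]
dropped <- answers[has_non_ascii | too_short]

print(kept)
print(dropped)
```

Inspecting the dropped answers in this way before calling the function is one way to check that nothing important is lost.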

Usage

removeFaultyAndUncodableAnswers_And_PrepareForAnalysis(
  data,
  colNames = c("answer", "code"),
  allowed.codes,
  allowed.codes.titles = 1:length(allowed.codes)
)

Arguments

allowed.codes

a vector of allowed codes from the classification.

allowed.codes.titles

Labels for allowed.codes. Should have the same length as allowed.codes.

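data

a data.table (as in the examples) containing the answers and their codes.

colNames

a character vector of length two naming the answer column and the code column in data, in that order. Defaults to c("answer", "code"). The answer column should be a character vector of answers; the code column holds the classification codes, one per answer, and will be transformed to character.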
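Because allowed.codes.titles labels allowed.codes position by position, it can help to check the pairing before calling the function. The following is a minimal sketch in base R using three code/title pairs taken from the examples below; the objects code_lookup, observed_codes, and unknown_codes are hypothetical, and the sketch makes no claim about how the function itself treats codes outside allowed.codes.

```r
# Three codes and their labels, in matching order.
allowed.codes <- c("71402", "71403", "-0004")
allowed.codes.titles <- c(
  "Office clerks and secretaries (without specialisation)-skilled tasks",
  "Office clerks and secretaries (without specialisation)-complex tasks",
  "Not precise enough for coding"
)

# The two vectors should have the same length.
stopifnot(length(allowed.codes) == length(allowed.codes.titles))

# A side-by-side lookup table makes misaligned labels easy to spot.
code_lookup <- data.frame(
  code = allowed.codes,
  title = allowed.codes.titles,
  stringsAsFactors = FALSE
)
print(code_lookup)

# If no titles are supplied, the default in Usage is simply 1:length(allowed.codes).
default_titles <- 1:length(allowed.codes)

# Hypothetical codes observed in the data; list those missing from allowed.codes.
observed_codes <- c("71402", "99999")
unknown_codes <- setdiff(observed_codes, allowed.codes)
print(unknown_codes)
```

Codes are kept as character strings here, which preserves leading zeros and signs such as in "-0004".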

Details

The 2010 German classification is available at https://www.klassifikationsserver.de/.

Value

a data.table with attributes classification and overview_tally

See Also

createDescriptives

Examples

library(data.table)
## toy data: verbatim answers with their codes
occupations <- data.table(answers = c("LEITER VERTRIEB", "Kfz-Schlossermeister", "Aushilfe im Hotel(Bereich Housekeeping)"),
                   codes = c("61194", "25213", "63221"))
(allowed.codes <- c("11101", "61194", "25213", "63221", "..."))
## titles are listed in the same order as allowed.codes
(allowed.codes.titles <- c("Berufe in der Landwirtschaft (ohne Spezialisierung) - Helfer-/Anlerntätigkeiten", "Führungskräfte - Einkauf und Vertrieb", "Berufe in der Kraftfahrzeugtechnik - komplexe Spezialistentätigkeiten", "Berufe im Hotelservice - Helfer-/Anlerntätigkeiten", "many more category labels from the classification"))
removeFaultyAndUncodableAnswers_And_PrepareForAnalysis(occupations, colNames = c("answers", "codes"), allowed.codes, allowed.codes.titles)

data(occupations)
allowed.codes <- c("71402", "71403", "63302", "83112", "83124", "83131", "83132", "83193", "83194", "-0004", "-0030")
allowed.codes.titles <- c("Office clerks and secretaries (without specialisation)-skilled tasks", "Office clerks and secretaries (without specialisation)-complex tasks", "Gastronomy occupations (without specialisation)-skilled tasks",
 "Occupations in child care and child-rearing-skilled tasks", "Occupations in social work and social pedagogics-highly complex tasks", "Pedagogic specialists in social care work and special needs education-unskilled/semiskilled tasks", "Pedagogic specialists in social care work and special needs education-skilled tasks", "Supervisors in education and social work, and of pedagogic specialists in social care work", "Managers in education and social work, and of pedagogic specialists in social care work",
 "Not precise enough for coding", "Student assistants")
removeFaultyAndUncodableAnswers_And_PrepareForAnalysis(occupations, colNames = c("orig_answer", "orig_code"), allowed.codes, allowed.codes.titles)

## we could also paste both answers together
occupations[, answer_combined := paste(orig_answer, orig_answer2)]
removeFaultyAndUncodableAnswers_And_PrepareForAnalysis(occupations, colNames = c("answer_combined", "orig_code"), allowed.codes, allowed.codes.titles)

malsch/occupationCoding documentation built on March 14, 2024, 8:09 a.m.