trainCreecysMemoryBasedReasoning: Train Creecy's memory-based reasoning model

View source: R/trainCreecysMemoryBasedReasoning.R

trainCreecysMemoryBasedReasoning R Documentation

Train Creecy's memory-based reasoning model

Description

The function preprocesses the text answers as configured in preprocessing and calculates the importance of the features derived from them, following Creecy et al. (1992).
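To make "importance of features" more concrete, here is a minimal base-R sketch of the two weights described in Creecy et al. (1992): per-category feature importance, P(category | feature), and cross-category feature importance, the sum over categories of P(category | feature)^2. It is an illustration on made-up toy data, not the package's implementation; the object names obs, counts, pcf and ccf are assumptions introduced here.

```r
# Toy training data: each row is one (feature, category) observation.
# The words and their pairing with codes are illustrative only.
obs <- data.frame(
  feature  = c("clerk", "clerk", "clerk", "childcare", "childcare", "office"),
  category = c("71402", "71402", "71403", "83112", "83112", "71402"),
  stringsAsFactors = FALSE
)

# Contingency table: rows are features, columns are categories.
counts <- table(obs$feature, obs$category)

# Per-category feature importance: P(category | feature),
# i.e. each row of the table divided by its row total.
pcf <- prop.table(counts, margin = 1)

# Cross-category feature importance: sum over categories of
# P(category | feature)^2. It equals 1 when a feature always
# points to the same category and shrinks as the feature spreads out.
ccf <- rowSums(pcf^2)

print(round(pcf, 3))
print(round(ccf, 3))
```

In this toy data "clerk" occurs with two different codes and therefore receives a lower cross-category importance than "childcare" or "office", which each occur with a single code.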

Usage

trainCreecysMemoryBasedReasoning(
  data,
  preprocessing = list(stopwords = character(0), stemming = NULL,
    strPreprocessing = TRUE, removePunct = TRUE)
)

Arguments

data

a data.table created with removeFaultyAndUncodableAnswers_And_PrepareForAnalysis

preprocessing

a list with elements

stopwords

a character vector of stopwords to remove. Use tm::stopwords("de") for German stopwords.

stemming

NULL for no stemming, or "de" for stemming with the German Porter stemmer.

strPreprocessing

TRUE if stringPreprocessing shall be used.

removePunct

TRUE if removePunctuation shall be used.
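As a sketch of how these elements combine, the following builds a preprocessing list for German text with stopword removal and stemming switched on. The short stopword vector and the names german_stopwords, preproc_de and memModelDe are assumptions made for illustration. The training call is guarded so that it only runs when the occupationCoding package is installed and proc.occupations has been prepared as in the Examples section.

```r
# Illustrative stopword list; with the tm package installed,
# tm::stopwords("de") supplies a full German list instead.
german_stopwords <- c("der", "die", "das", "und", "in")

# All four documented elements, set for German text.
preproc_de <- list(
  stopwords = german_stopwords,
  stemming = "de",
  strPreprocessing = TRUE,
  removePunct = TRUE
)

# Train only when the package and the prepared data are available.
if (requireNamespace("occupationCoding", quietly = TRUE) &&
    exists("proc.occupations")) {
  memModelDe <- occupationCoding::trainCreecysMemoryBasedReasoning(
    proc.occupations,
    preprocessing = preproc_de
  )
}
```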

Value

a processed feature matrix to be used in predictCreecysMemoryBasedReasoning

See Also

predictCreecysMemoryBasedReasoning

Creecy, R. H., Masand, B. M., Smith, S. J., Waltz, D. L. (1992). Trading MIPS and Memory for Knowledge Engineering. Comm. ACM 35(8). pp. 48–65.

Examples

# set up data
data(occupations)
allowed.codes <- c("71402", "71403", "63302", "83112", "83124", "83131", "83132", "83193", "83194", "-0004", "-0030")
allowed.codes.titles <- c("Office clerks and secretaries (without specialisation)-skilled tasks", "Office clerks and secretaries (without specialisation)-complex tasks", "Gastronomy occupations (without specialisation)-skilled tasks",
 "Occupations in child care and child-rearing-skilled tasks", "Occupations in social work and social pedagogics-highly complex tasks", "Pedagogic specialists in social care work and special needs education-unskilled/semiskilled tasks", "Pedagogic specialists in social care work and special needs education-skilled tasks", "Supervisors in education and social work, and of pedagogic specialists in social care work", "Managers in education and social work, and of pedagogic specialists in social care work",
 "Not precise enough for coding", "Student assistants")
proc.occupations <- removeFaultyAndUncodableAnswers_And_PrepareForAnalysis(occupations, colNames = c("orig_answer", "orig_code"), allowed.codes, allowed.codes.titles)

# Recommended configuration (and commonly used in this package)
memModel <- trainCreecysMemoryBasedReasoning(proc.occupations,
                 preprocessing = list(stopwords = character(0), stemming = NULL,
                                      strPreprocessing = TRUE, removePunct = FALSE))

malsch/occupationCoding documentation built on March 14, 2024, 8:09 a.m.