knitr::opts_chunk$set(echo = TRUE)

Purpose

The following notebook documents the look-up table used to replace text encoding errors.

Import

Libraries

  if (!require(pacman)) {install.packages('pacman')}
  p_load(
    dplyr
  )

Define Table

The values for the encoding error and pattern replacement vectors must be entered in the same order.

# Define known encoding errors
error_encoding <- c(
  "’", "鈥檚", "鈥�", "â¼", "录", "
  perf01’1)lailc’p", "perf01鈥�1)lailc鈥橮"
)

print(error_encoding)

# Define encoding error replacement patters
pattern_replacement <- c(
  "'",  "'s", "'", "=", "=",
  "performance", "performance"
)

print(pattern_replacement)

Export to Package System Data

error_file_name <- "encoding_errors.RDS"

pattern_file_name <- "replacement_patterns.RDS"


canfielder/CausalityExtraction documentation built on Jan. 5, 2022, 10:55 a.m.