cleanData: Rename categories of one or more variables in the dataset...

Description

Rename categories of one or more variables in the dataset using an external match file

Usage

cleanData(Data, varsClean, matchFileName, ...)

Arguments

Data

data frame to be processed.

varsClean

a character vector naming the variables to be regrouped.

matchFileName

a character string naming a match file. The match file is an Excel file with one or more sheets. Each sheet name must correspond to a variable name in varsClean. Each sheet must contain the original values of the variable in the first column and the new values in the following columns. If the sheet name is "varNames", the variables of the dataset are renamed as specified in that sheet.

...

other arguments passed on to read.xlsx for reading the match file.
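
The match-file layout described under matchFileName can be pictured with a small base R sketch. This is only an illustration of the lookup idea, not the package's implementation: the data frames below stand in for Excel sheets, and every name and value in them (matchSheet, raw, the output column gender, the new name genderOriginal) is hypothetical.

```r
# Hypothetical stand-in for one sheet of a match file:
# original values in column 1, new values in column 2.
matchSheet <- data.frame(
  original = c("M", "F", "male", "female"),
  new      = c("Male", "Female", "Male", "Female"),
  stringsAsFactors = FALSE
)

# Hypothetical raw data with a column named like the sheet.
raw <- data.frame(
  genderRaw = c("M", "female", "F", "male", "other"),
  stringsAsFactors = FALSE
)

# Look up each raw value in column 1 and take the value from column 2.
# Values absent from the sheet become NA in this sketch; how cleanData
# treats them is not stated in the documentation.
idx <- match(raw$genderRaw, matchSheet$original)
raw$gender <- matchSheet$new[idx]

# Hypothetical stand-in for a "varNames" sheet: old name, new name.
varNames <- data.frame(
  old = "genderRaw",
  new = "genderOriginal",
  stringsAsFactors = FALSE
)

# Rename only the columns that appear in the sheet; leave the rest alone.
pos <- match(names(raw), varNames$old)
names(raw)[!is.na(pos)] <- varNames$new[pos[!is.na(pos)]]
```

Using match() keeps the lookup vectorised and makes unmatched values easy to spot, since they surface as NA instead of being silently carried over.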

Value

a data frame of regrouped variables.

Examples

data(QueenslandRaw)

matchFileName <- system.file("extdata","QueenslandMatch.xlsx", package = "pplpredict")
# View the first six lines of the sheet varNames in "QueenslandMatch.xlsx"
head(xlsx::read.xlsx(matchFileName, sheetName = "varNames", stringsAsFactors = FALSE, header = FALSE))

cleanData(QueenslandRaw, c("genderRaw","studentRaw","educationRaw"), matchFileName)

# Rename variables in a dataset
cleanData(QueenslandRaw, "varNames", matchFileName)

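# Rename the variables and regroup several of them in one call
varsClean <- c("varNames", "genderRaw", "educationRaw", "studentRaw", "birthYearRaw", "industryRaw",
               "polInterestRaw", "polConsumptionRaw", "religionRaw", "selfPlacementRaw", "birthplaceRaw",
               "incomeRaw", "voteChoiceRaw", "voteChoiceLeaningRaw")
# Use the full path from system.file() so the match file is found regardless of the working directory
cleanData(QueenslandRaw, varsClean, matchFileName)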

uyenhoang/pplpredict documentation built on May 3, 2019, 2:41 p.m.