hierarchyMerge: Merge UNSPSC hierarchies based on frequency in data

View source: R/hierarchyMerge.R

Description

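Uses a UNSPSC codebook and hierarchical UNSPSC codes to classify data into mutually exclusive categories. A category whose share of the data falls below cutoff is merged into its parent level, and rare top-level categories are pooled into "Oth".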

Usage

hierarchyMerge(level.vars, cutoff = 0.005, codebook)

Arguments

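level.vars
    Data frame or matrix of UNSPSC codes with one column per hierarchy level, ordered from the broadest level (first column) to the most specific (last column).
cutoff
    Minimum proportion of rows a category must reach to be kept as its own category. Defaults to 0.005.
codebook
    Data frame with a code column (codes right-padded with zeros to 8 characters) and a label column giving the description for each code.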
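
The inputs might be prepared along the following lines. This is only a sketch: it assumes level.vars holds code prefixes of 2, 4, 6 and 8 digits, which is what the right-padding to 8 characters in the function body suggests. The codes vector, the column names and the labels are made up for illustration and are not official UNSPSC titles.

```r
# Hypothetical 8-digit UNSPSC-style codes, one per record (toy data).
codes <- c("43211503", "43211507", "43221701", "44121701")

# One column per hierarchy level, broadest first (column names are illustrative).
level.vars <- data.frame(
  segment   = substr(codes, 1, 2),
  family    = substr(codes, 1, 4),
  class     = substr(codes, 1, 6),
  commodity = codes,
  stringsAsFactors = FALSE
)

# Codebook with the two columns the function reads: `code` and `label`.
codebook <- data.frame(
  code  = c("43000000", "43210000", "44120000"),
  label = c("Toy label A", "Toy label B", "Toy label C"),
  stringsAsFactors = FALSE
)

# Share of records per family; shares below `cutoff` are candidates for merging.
prop.table(table(level.vars$family))
```

Keeping the columns as character vectors rather than factors avoids surprises when codes from different levels are mixed in one result vector.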

Examples

## The function is currently defined as
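function (level.vars, cutoff = 0.005, codebook) 
{
    # Start with every row unassigned.
    merged.cat <- rep(NA, length(level.vars[, 1]))
    # Top-level categories rarer than `cutoff` are pooled into "Oth".
    oth.top <- names(which(prop.table(table(level.vars[, 1])) < 
        cutoff))
    merged.cat[level.vars[, 1] %in% oth.top] <- "Oth"
    # At each deeper level, unassigned rows in a rare category take the parent code.
    for (i in 2:ncol(level.vars)) {
        at.level <- names(which(prop.table(table(level.vars[, 
            i])) < cutoff))
        rare <- level.vars[, i] %in% at.level & is.na(merged.cat)
        merged.cat[rare] <- level.vars[, (i - 1)][rare]
    }
    # Rows that were never merged keep their most specific code.
    merged.cat[is.na(merged.cat)] <- level.vars[is.na(merged.cat), 
        ncol(level.vars)]
    # Replace each code with "<8-character code>: <codebook label>".
    all.labs <- unique(merged.cat)
    for (i in all.labs) {
        if (i != "Oth") {
            # Namespace-qualified call, so stringr need not be attached first.
            current.code <- stringr::str_pad(i, width = 8, side = "right", 
                pad = "0")
            current.lab <- codebook[codebook$code == current.code, 
                "label"]
            current.lab <- paste0(current.code, ": ", current.lab)
            # Mark a parent code "(other)" when finer codes under it also survive.
            # The pattern is anchored so only codes starting with `i` count.
            if (sum(grepl(paste0("^", i), all.labs)) > 1) {
                current.lab <- paste(current.lab, "(other)")
            }
            merged.cat[merged.cat == i] <- current.lab
        }
    }
    return(merged.cat)
}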
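
A small worked example may make the merging rule easier to follow. The sketch below is not the package's code: merge_sketch is a hypothetical name for a compact base-R restatement of the definition above, which swaps stringr::str_pad() for strrep() and uses startsWith() for the prefix check. The data and labels are toy values, not official UNSPSC titles.

```r
# Illustrative restatement of the merging logic (hypothetical helper name).
merge_sketch <- function(level.vars, cutoff = 0.005, codebook) {
  # Categories whose share of rows is below the cutoff.
  rare <- function(x) names(which(prop.table(table(x)) < cutoff))
  merged <- rep(NA_character_, nrow(level.vars))
  merged[level.vars[, 1] %in% rare(level.vars[, 1])] <- "Oth"
  for (i in seq_len(ncol(level.vars))[-1]) {
    hit <- level.vars[, i] %in% rare(level.vars[, i]) & is.na(merged)
    merged[hit] <- level.vars[hit, i - 1]   # fall back to the parent code
  }
  merged[is.na(merged)] <- level.vars[is.na(merged), ncol(level.vars)]
  labs <- unique(merged)
  for (lab in setdiff(labs, "Oth")) {
    code <- paste0(lab, strrep("0", 8 - nchar(lab)))  # right-pad with zeros
    text <- paste0(code, ": ", codebook$label[codebook$code == code])
    if (sum(startsWith(labs, lab)) > 1) text <- paste(text, "(other)")
    merged[merged == lab] <- text
  }
  merged
}

# Toy data: 10 records, two hierarchy levels (segment, family).
level.vars <- data.frame(
  segment = c(rep("43", 6), rep("44", 3), "50"),
  family  = c(rep("4321", 5), "4322", rep("4412", 3), "5010"),
  stringsAsFactors = FALSE
)
codebook <- data.frame(
  code  = c("43000000", "43210000", "44120000"),
  label = c("Toy segment label", "Toy family label", "Another toy family label"),
  stringsAsFactors = FALSE
)

result <- merge_sketch(level.vars, cutoff = 0.15, codebook = codebook)
table(result)
```

With a cutoff of 0.15, segment 50 (one row in ten) becomes "Oth", family 4322 (also one row in ten) falls back to its parent segment 43 and is tagged "(other)" because family 4321 survives alongside it, and the remaining rows keep their family codes.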

jon-mellon/procureClassify documentation built on May 19, 2019, 7:26 p.m.