best_categories_approximate: best_categories_approximate


View source: R/best_categories_approximate.R

Description

Find the best pairing between values and categories based on a set of probabilities.
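The source does not say how the approximate pairing is computed. As an illustration only, one plausible reading is a greedy assignment within a single row: take the most probable (value, category) pair, then repeat with the values and categories that remain. The helper `greedy_pairing` and the toy `probs` matrix below are hypothetical and are not part of recolumnize.

```r
# Hypothetical helper, NOT the package's implementation: greedily pair the
# values of one row with categories, most probable pair first, never reusing
# a value or a category.
greedy_pairing <- function(values, probs) {
  # Rows of `scores` follow `values`; columns follow the categories.
  scores <- probs[values, , drop = FALSE]
  result <- setNames(rep(NA_character_, ncol(probs)), colnames(probs))
  for (step in seq_len(min(length(values), ncol(probs)))) {
    # Position of the largest remaining probability (first one on ties)
    best <- which(scores == max(scores, na.rm = TRUE), arr.ind = TRUE)[1, ]
    result[best[["col"]]] <- values[best[["row"]]]
    scores[best[["row"]], ] <- NA  # this value has been placed
    scores[, best[["col"]]] <- NA  # this category has been filled
  }
  result
}

# Toy probabilities (assumed for this sketch): rows are keys, columns categories
probs <- matrix(c(0.999, 0.001, 0.25,
                  0.001, 0.999, 0.75),
                nrow = 3,
                dimnames = list(c("a", "x", "y"), c("vowel", "consonant")))

greedy_pairing(c("x", "y"), probs)
# "x" takes consonant first (0.999), which leaves "y" for vowel
```

A greedy pass is fast but can miss the pairing with the highest overall probability, which is one reason a function might be described as approximate.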

Usage

best_categories_approximate(df, category_probabilities,
  encode_cols = NULL)

Arguments

df

data frame (or matrix) to be encoded

category_probabilities

matrix or data frame whose row names are the keys to be looked up; the ith column holds the probability that a key belongs to category i

encode_cols

which columns should be encoded (others are left alone)

Value

A data frame with encode_cols replaced by data encoded into categories from category_probabilities

Examples

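# Label every letter as a consonant, then mark the five vowels
dict2 <- rep("consonant", 26)
names(dict2) <- letters
dict2[c("a", "e", "i", "o", "u")] <- "vowel"
# One row per letter, one column per category
probs <- matrix(0, nrow = 26, ncol = 2)
colnames(probs) <- c("vowel", "consonant")
rownames(probs) <- letters
# P(vowel) is 0.999 for vowels and 0.001 for consonants
probs[, 1] <- abs((dict2 == "vowel") - 0.001)
probs[25, 1] <- 0.25  # "y"
probs[23, 1] <- 0.05  # "w"
probs[, 2] <- 1 - probs[, 1]
# Character matrix to encode, filled row by row
mat <- matrix(c("a", "w", "x", "y", "c", "w", "r", "r"), nrow = 4, ncol = 2, byrow = TRUE)
mat
best_categories_approximate(mat, probs)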

jveech/recolumnize documentation built on Dec. 11, 2019, 2:05 a.m.