nominalCoding: Coding Schemes for Nominal Variables
In rettopnivek/utilityf: Useful Functions for Modeling and Plotting

Creates a numeric coding scheme for a nominal variable with two or more categories.

nominalCoding(x, type = "Effects", levels = NULL, label = NULL, weights = NULL)

`x`	a vector giving the level (category) of the nominal variable for each observation.
`type`	the type of coding to use, one of 'Dummy', 'Simple', 'Effects', 'Intercept', or 'Coefficient'.
`levels`	an optional value specifying the reference group for the dummy, simple, and effects coding schemes or, more generally, a vector of the unique levels giving the desired order when creating the design matrix.
`label`	an optional character string giving the label for the nominal variable.
`weights`	an optional vector of weights to be assigned to each level of the nominal variable when applying the coefficient coding scheme.

With dummy coding, each additional category in a nominal variable is compared against a reference category. For K categories, K - 1 dummy variables are created, coded as 1 for the presence of a category and 0 otherwise. The intercept term is interpreted as the cell mean for the reference category.
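As a minimal base-R sketch of the dummy scheme described above, the columns can be built by hand. This does not call nominalCoding, whose exact output layout is not assumed here. The response `y` is made-up illustration data, and 'low' is taken as the reference category.

```r
# Hypothetical data: 3 categories, 2 observations each
x <- rep(c("low", "med", "high"), each = 2)
y <- c(1, 3, 4, 6, 9, 11)        # made-up response; cell means are 2, 5, 10
lev <- c("low", "med", "high")   # first entry is treated as the reference

# K - 1 columns: 1 when the category is present, 0 otherwise
D <- sapply(lev[-1], function(l) as.numeric(x == l))

fit <- lm(y ~ D)
b <- unname(coef(fit))
# b[1] is the cell mean of the reference category ('low');
# b[2] and b[3] are the 'med' and 'high' differences from that cell
print(round(b, 6))
```

With these illustration values the intercept is 2 (the 'low' cell mean) and the two slopes are 3 and 8.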

With simple coding, each additional category in a nominal variable is compared against a reference category, as in dummy coding. For K categories, K - 1 dummy variables are created, coded as (K - 1)/K for the presence of a category and -1/K otherwise. Unlike dummy coding, the intercept term is interpreted as the grand mean (the mean of the cell means).
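The simple scheme can be sketched the same way in base R. Again this is a hand-built illustration under assumed data (the response `y` is made up and 'low' is the assumed reference), not the output of nominalCoding.

```r
# Hypothetical data: 3 categories, 2 observations each
x <- rep(c("low", "med", "high"), each = 2)
y <- c(1, 3, 4, 6, 9, 11)        # made-up response; cell means are 2, 5, 10
lev <- c("low", "med", "high")   # first entry is treated as the reference
K <- length(lev)

# K - 1 columns: (K - 1)/K when the category is present, -1/K otherwise
S <- sapply(lev[-1], function(l) ifelse(x == l, (K - 1) / K, -1 / K))

fit <- lm(y ~ S)
b <- unname(coef(fit))
# b[1] is the grand mean (mean of the cell means);
# b[2] and b[3] are still differences from the reference cell
print(round(b, 6))
```

Here the slopes match the dummy-coded differences from the reference cell (3 and 8), while the intercept moves to the grand mean, 17/3.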

With effects coding (also known as deviation coding), each additional category in a nominal variable is compared against the grand mean. For K categories, K - 1 dummy variables are created, coded as 1 for the presence of a category, -1 for the presence of the reference category, and 0 otherwise. The intercept term is interpreted as the grand mean.
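A small base-R sketch of effects coding as described above, with the columns constructed by hand. The data are made up for illustration, 'low' is assumed to be the reference category, and nominalCoding itself is not called.

```r
# Hypothetical data: 3 categories, 2 observations each
x <- rep(c("low", "med", "high"), each = 2)
y <- c(1, 3, 4, 6, 9, 11)        # made-up response; cell means are 2, 5, 10
lev <- c("low", "med", "high")   # first entry is treated as the reference

# K - 1 columns: 1 for the category, -1 for the reference, 0 otherwise
E <- sapply(lev[-1], function(l) {
  ifelse(x == l, 1, ifelse(x == lev[1], -1, 0))
})

fit <- lm(y ~ E)
b <- unname(coef(fit))
# b[1] is the grand mean; b[2] and b[3] are the deviations of the
# 'med' and 'high' cell means from the grand mean
print(round(b, 6))
```

The deviation for the reference category is not estimated directly. It can be recovered as the negative sum of the other deviations, here -(-2/3 + 13/3) = -11/3.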

With intercept coding, a separate dummy variable is specified for each category in the nominal variable, coded as 1 when the category is present and 0 otherwise. This coding scheme requires that the model have no intercept term. Instead, predictions for the dependent variable for each category are estimated separately. Therefore, for K categories, K dummy variables are created.
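The intercept scheme can be illustrated with a hand-built indicator matrix and a model fitted without an intercept. This is a sketch in base R under assumed, made-up data, not a call to nominalCoding.

```r
# Hypothetical data: 3 categories, 2 observations each
x <- rep(c("low", "med", "high"), each = 2)
y <- c(1, 3, 4, 6, 9, 11)        # made-up response; cell means are 2, 5, 10
lev <- c("low", "med", "high")

# K columns: one 0/1 indicator per category
M <- sapply(lev, function(l) as.numeric(x == l))

# '0 +' drops the intercept, so each coefficient is a cell mean
fit <- lm(y ~ 0 + M)
b <- unname(coef(fit))
print(round(b, 6))
```

Keeping the intercept alongside all K indicator columns would make the design matrix rank deficient, which is why this scheme needs a model with no intercept term.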

With coefficient coding, a single dummy variable is specified, and a desired weight is assigned for each level of the nominal variable. This is useful for testing hypothesized ordered relationships.
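As a rough base-R sketch of coefficient coding, a named weight vector can be mapped onto the observations to give a single predictor. The weights -1, 0, 1 (a linear trend across low, med, high) and the response `y` are assumptions chosen for illustration. nominalCoding is not called here.

```r
# Hypothetical data: 3 categories, 2 observations each
x <- rep(c("low", "med", "high"), each = 2)
y <- c(1, 3, 4, 6, 9, 11)        # made-up response

# Assumed weights encoding a hypothesized ordering low < med < high
w <- c(low = -1, med = 0, high = 1)

# Single column: each observation receives the weight of its level
z <- unname(w[x])

fit <- lm(y ~ z)
b <- unname(coef(fit))
# b[2] is the change in y per unit of weight
print(round(b, 6))
```

With these values the slope is 4, consistent with the response increasing across the hypothesized ordering.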

Given K levels for the input variable, returns a matrix containing the new numerical coding scheme, with K - 1 columns (dummy, simple, and effects coding schemes), K columns (intercept coding scheme), or 1 column (coefficient coding scheme).

x = rep( c('low','med','high'), each = 2 ) # 3 levels
data.frame( x, nominalCoding( x, type = 'Dummy' ) )
data.frame( x, nominalCoding( x, type = 'Simple' ) )
data.frame( x, nominalCoding( x, type = 'Effects' ) )
data.frame( x, nominalCoding( x, type = 'Intercept' ) )
data.frame( x, nominalCoding( x, type = 'Coefficient', weights = rnorm(3) ) )