View source: R/utility_functions.R
Creates a numeric coding scheme for a nominal variable with two or more categories.
x | a vector of levels.
type | the type of coding to use: 'Dummy', 'Simple', 'Effects', 'Intercept', or 'Coefficient'.
levels | an optional value specifying the reference group for the dummy, simple, and effects coding schemes or, more generally, a vector of the unique levels giving the desired order when creating the design matrix.
label | an optional character string giving the label for the nominal variable.
weights | an optional vector of weights to be assigned to each level of the nominal variable when applying the coefficient coding scheme.
With dummy coding, each additional category in a nominal variable is compared against a reference category. For K categories, K - 1 dummy variables are created, coded as 1 for the presence of a category and 0 otherwise. The intercept term is interpreted as the cell mean for the reference category.
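As a rough illustration of the dummy coding just described, the sketch below builds the K - 1 indicator columns with base R's `contr.treatment()` rather than with `nominalCoding()` itself. The data, the outcome `y`, and the choice of 'low' as the reference category are assumptions made for the example.

```r
# Hypothetical data: 3 categories, 2 observations each.
x <- factor(rep(c('low', 'med', 'high'), each = 2),
            levels = c('low', 'med', 'high'))  # first level = reference
y <- c(1, 1, 2, 2, 6, 6)                       # cell means: 1, 2, 6

# K x (K - 1) coding matrix: one row per category, one column per
# non-reference category (1 = category present, 0 = otherwise).
codes <- contr.treatment(levels(x))

# Expand to one row per observation by looking up each category's row.
dummy <- codes[as.integer(x), , drop = FALSE]
rownames(dummy) <- NULL  # drop the repeated category labels

# With an intercept in the model, the intercept estimates the mean of
# the reference cell ('low').
fit <- lm(y ~ dummy)
coef(fit)
```

Here the reference category receives a row of zeros, so its fitted value is the intercept alone.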
With simple coding, each additional category in a nominal variable is compared against a reference category. For K categories, K - 1 dummy variables are created, coded as (K-1)/K for the presence of a category, -1/K otherwise. The intercept term is interpreted as the grand mean (the mean of the cell means).
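One way to see the simple coding values is to shift a dummy-coding matrix by -1/K, which turns the 1s into (K-1)/K and the 0s into -1/K. The sketch below does this with base R only; it is not the package's implementation, and the data and outcome `y` are made up for the example.

```r
# Hypothetical data: 3 categories, 2 observations each.
x <- factor(rep(c('low', 'med', 'high'), each = 2),
            levels = c('low', 'med', 'high'))  # first level = reference
y <- c(1, 1, 2, 2, 6, 6)                       # cell means: 1, 2, 6
K <- nlevels(x)

# Dummy coding shifted by -1/K: (K-1)/K where the category is present,
# -1/K everywhere else (including the whole reference row).
codes <- contr.treatment(levels(x)) - 1 / K

# One row per observation.
simple <- codes[as.integer(x), , drop = FALSE]
rownames(simple) <- NULL

# The slopes are still differences from the reference cell, but the
# intercept now estimates the mean of the cell means: (1 + 2 + 6) / 3.
fit <- lm(y ~ simple)
coef(fit)
```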
With effects coding (also known as deviation coding), each additional category in a nominal variable is compared against the grand mean. For K categories, K - 1 dummy variables are created, coded as 1 for the presence of a category, -1 for the presence of the reference category, and 0 otherwise. The intercept term is interpreted as the grand mean.
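Effects coding can be sketched with base R's `contr.sum()`. Note one assumption that differs from the dummy-coding default: `contr.sum()` always assigns the -1 row to the last level, so in this example 'high' plays the role of the reference category. How `nominalCoding()` selects its reference is controlled by its `levels` argument and is not reproduced here; the data and outcome `y` are hypothetical.

```r
# Hypothetical data: 3 categories, 2 observations each.
x <- factor(rep(c('low', 'med', 'high'), each = 2),
            levels = c('low', 'med', 'high'))  # last level gets the -1 row
y <- c(1, 1, 2, 2, 6, 6)                       # cell means: 1, 2, 6

# K x (K - 1) matrix: 1 for the category, -1 for the reference
# category, 0 otherwise. Each column sums to zero.
codes <- contr.sum(levels(x))

# One row per observation.
eff <- codes[as.integer(x), , drop = FALSE]
rownames(eff) <- NULL

# The intercept estimates the grand mean (mean of the cell means), and
# each slope is that category's deviation from the grand mean.
fit <- lm(y ~ eff)
coef(fit)
```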
With intercept coding, a separate dummy variable is specified for each category in the nominal variable, coded as 1 when the category is present and 0 otherwise. This coding scheme requires that the model have no intercept term. Instead, predictions for the dependent variable for each category are estimated separately. Therefore, for K categories, K dummy variables are created.
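A minimal base R sketch of intercept coding follows, assuming made-up data and outcome values. It builds the K indicator columns by hand and fits a model without an intercept (the `0 +` in the formula), so each coefficient is the estimate for one category.

```r
# Hypothetical data: 3 categories, 2 observations each.
x <- factor(rep(c('low', 'med', 'high'), each = 2),
            levels = c('low', 'med', 'high'))
y <- c(1, 1, 2, 2, 6, 6)                       # cell means: 1, 2, 6

# K indicator columns: row i of the identity matrix marks category i.
ind <- diag(nlevels(x))[as.integer(x), , drop = FALSE]
colnames(ind) <- levels(x)

# No intercept term: each coefficient is the estimate for one category.
fit <- lm(y ~ 0 + ind)
coef(fit)
```

Keeping an intercept alongside all K indicators would make the design matrix rank deficient, which is why this scheme requires a no-intercept model.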
With coefficient coding, a single dummy variable is specified, and a desired weight is assigned for each level of the nominal variable. This is useful for testing hypothesized ordered relationships.
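The idea can be sketched in base R as a lookup from level to weight. The weights below (-1, 0, 1, a linear trend across the ordered levels) and the outcome `y` are assumptions chosen for illustration, and the code does not call `nominalCoding()`.

```r
# Hypothetical data: 3 categories, 2 observations each.
x <- rep(c('low', 'med', 'high'), each = 2)
y <- c(1, 1, 2, 2, 6, 6)

# Hypothetical weights, one per level, encoding a hypothesized
# ordering low < med < high.
w <- c(low = -1, med = 0, high = 1)

# A single column: each observation receives the weight of its level.
coef_code <- matrix(w[x], ncol = 1, dimnames = list(NULL, 'x'))

# The single slope tests the hypothesized ordered relationship.
fit <- lm(y ~ coef_code)
coef(fit)
```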
Given K levels for the input variable, returns a matrix containing the new numeric coding scheme, with K - 1 columns (dummy, simple, and effects coding), K columns (intercept coding), or 1 column (coefficient coding).
x = rep( c('low','med','high'), each = 2 ) # 3 levels
data.frame( x, nominalCoding( x, type = 'Dummy' ) )
data.frame( x, nominalCoding( x, type = 'Simple' ) )
data.frame( x, nominalCoding( x, type = 'Effects' ) )
data.frame( x, nominalCoding( x, type = 'Intercept' ) )
data.frame( x, nominalCoding( x, type = 'Coefficient', weights = rnorm(3) ) )