A convenience function for implementing common coding schemes used with categorical variables (e.g., dummy/treatment coding, effect/sum coding, etc.).
dm |
An object of class |
type |
A keyword (can be capitalized) indicating the type of coding scheme to apply. Currently the function implements 5 types:
|
variables |
The subset of grouping variables to consider when implementing the coding scheme. |
columns |
An optional vector indicating which columns of the design matrix to update. |
index |
An optional value/vector for changing the reference group/order when implementing dummy/effect coding or linear trends. Cannot exceed the number of levels for the subset of grouping variables. |
start |
An optional value indicating the starting column in the design matrix to begin updating. |
In dummy coding, for K levels, K - 1 dichotomous variables are created, each contrasting one level of the subset of grouping variables against a reference level. The intercept has a specific interpretation: the mean of the reference level. Coefficients associated with the K - 1 dichotomous variables indicate the difference in means between the given level and the reference level. As an example, consider the data set PlantGrowth, which has a single grouping variable 'group' with three levels: 'ctrl', 'trt1', and 'trt2'. Dummy coding is useful here, setting the 'ctrl' level as the reference and creating 2 dichotomous variables to estimate the differences between 'ctrl' and 'trt1' and between 'ctrl' and 'trt2', respectively.
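The interpretation of dummy-coded coefficients described above can be checked with a minimal sketch in base R. This does not use the coding function documented here; it assumes the base R function contr.treatment as a stand-in for dummy coding, and the names pg, fit, and group_means are chosen only for illustration.

```r
# Base R sketch of dummy (treatment) coding; not this package's coding() function
pg = PlantGrowth

# K = 3 levels, so K - 1 = 2 dichotomous variables; 'ctrl' (level 1) is the reference
contrasts( pg$group ) = contr.treatment( levels( pg$group ), base = 1 )

# Fit a linear model using the dummy-coded grouping variable
fit = lm( weight ~ group, data = pg )

# Observed mean for each level of 'group'
group_means = tapply( pg$weight, pg$group, mean )

# Intercept: mean of the reference level 'ctrl'
intercept = unname( coef( fit )[ 1 ] )

# Remaining coefficients: difference of 'trt1' and 'trt2' from 'ctrl'
differences = unname( coef( fit )[ -1 ] )

print( group_means )
print( coef( fit ) )
```

Comparing the printed coefficients against the group means shows the intercept matching the 'ctrl' mean and each remaining coefficient matching a treatment mean minus the 'ctrl' mean.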
In effect coding, for K levels, K - 1 dichotomous variables are created, each contrasting one level of the subset of grouping variables against the grand mean. One level must be specified as a reference, with its value fixed to -1 across all of the dichotomous variables. The intercept has a specific interpretation: the grand mean of the sample. Coefficients associated with the K - 1 dichotomous variables indicate the difference between the mean of the given level and the grand mean. The difference between the mean of the reference level and the grand mean is the negative of the sum of the coefficients. This is the typical coding scheme used in the linear model underlying analysis of variance (i.e., ANOVA).
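As a rough illustration of the effect-coding interpretation above, the following base R sketch uses contr.sum, which is assumed here as a stand-in for the package's own effect coding. Because contr.sum assigns the -1 row to the last level, the factor is reordered so that 'ctrl' comes last; the variable names are illustrative only. The PlantGrowth design is balanced (10 observations per level), so the mean of the level means equals the overall sample mean.

```r
# Base R sketch of effect (sum) coding; not this package's coding() function
pg = PlantGrowth

# contr.sum() fixes the LAST level at -1, so place the reference 'ctrl' last
pg$group = factor( pg$group, levels = c( 'trt1', 'trt2', 'ctrl' ) )
contrasts( pg$group ) = contr.sum( 3 )

# Fit a linear model using the effect-coded grouping variable
fit = lm( weight ~ group, data = pg )

# Observed level means and the grand mean (balanced design)
group_means = tapply( pg$weight, pg$group, mean )
grand_mean = mean( group_means )

# Intercept: the grand mean
intercept = unname( coef( fit )[ 1 ] )

# Coefficients: deviation of 'trt1' and 'trt2' from the grand mean
deviations = unname( coef( fit )[ -1 ] )

# Reference level: deviation recovered as the negative sum of the coefficients
reference_deviation = -sum( deviations )

print( contrasts( pg$group ) )
print( c( grand_mean = grand_mean, intercept = intercept ) )
print( reference_deviation )
```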
If one can assume the levels of the predictor are evenly spaced (i.e., an interval variable), a linear trend can be specified. There are several ways to specify a linear trend; here, the trend is made orthogonal to the intercept by setting the values to 1 through K and then centering them (i.e., subtracting their mean). This means that the intercept can be interpreted as the grand mean.
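The centering step described above can be sketched in base R as follows. This is an illustrative reconstruction rather than the package's implementation: the names tg, trend_values, and trend are hypothetical, and the three dose levels in ToothGrowth are simply treated as evenly spaced ranks 1 to K.

```r
# Base R sketch of a centered linear trend; not this package's coding() function
tg = ToothGrowth

# Number of levels of the grouping variable 'dose'
K = length( unique( tg$dose ) )

# Values 1 to K, centered by subtracting their mean
trend_values = ( 1:K ) - mean( 1:K )

# Assign each observation the centered value for its dose level
tg$trend = trend_values[ as.integer( factor( tg$dose ) ) ]

# Fit a linear model with the centered trend as the only predictor
fit = lm( len ~ trend, data = tg )

# With equal group sizes the trend column sums to zero, so it is
# orthogonal to the intercept and the intercept is the grand mean
intercept = unname( coef( fit )[ 1 ] )

print( trend_values )
print( c( intercept = intercept, grand_mean = mean( tg$len ) ) )
```

The intercept matches the grand mean here because each dose level has the same number of observations; with unequal group sizes the centered values would no longer sum to zero across observations.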
A matrix, the subset of columns in the summary matrix in the designmatrix object. Note the subset<- method can infer which columns should be updated based on the output of the coding function.
# Identity coding
dm = designmatrix( PlantGrowth, list( 'weight', 'group' ) )
# Update summary matrix
subset( dm ) = coding( dm, type = 'I' )
# Update full design matrix
dm = designmatrix( dm ); print( dm )
# Dummy coding
dm = designmatrix( PlantGrowth, list( 'weight', 'group' ) )
subset( dm ) = coding( dm, type = 'DC' )
# Update full design matrix
dm = designmatrix( dm ); print( dm )
# Effect coding
dm = designmatrix( ToothGrowth, list( 'len', c( 'supp', 'dose' ) ) )
# Specify coding separately for each variable
subset( dm ) = coding( dm, type = 'EC', variables = 'supp' )
# Second row already has coding for variable 'supp'
subset( dm ) = coding( dm, type = 'EC', variables = 'dose', start = 3 )
# Update full design matrix
dm = designmatrix( dm ); print( dm )
# Linear trend
dm = designmatrix( ToothGrowth, list( 'len', c( 'supp', 'dose' ) ) )
# Implement linear trend only for 'dose' variable
subset( dm ) = coding( dm, type = 'L', variables = 'dose' )
# Different coding schemes can be mixed and matched
subset( dm ) = coding( dm, type = 'EC', variables = 'supp', start = 3 )
# Update full design matrix
dm = designmatrix( dm ); print( dm )