dummyVars_MSqRob: Create A Full Set of Dummy Variables

Description

dummyVars_MSqRob creates a full set of dummy variables (i.e. a less than full rank parameterization).

Usage

dummyVars_MSqRob(formula, ...)

## Default S3 method:
dummyVars_MSqRob(formula, data, sep = ".",
  levelsOnly = FALSE, fullRank = FALSE, ...)

## S3 method for class 'dummyVars_MSqRob'
predict(object, newdata, na.action = na.pass, ...)

contr.ltfr_MSqRob(n, contrasts = TRUE, sparse = FALSE)

Arguments

formula

An appropriate R model formula, see References

...

additional arguments to be passed to other methods

data

A data frame with the predictors of interest

sep

An optional separator between factor variable names and their levels. Use sep = NULL for no separator (i.e. normal behavior of model.matrix as shown in the Details section)

levelsOnly

A logical; TRUE means to completely remove the variable names from the column names

fullRank

A logical; should a full rank or less than full rank parameterization be used? If TRUE, factors are encoded to be consistent with model.matrix and no linear dependencies are induced between the resulting columns.

object

An object of class dummyVars_MSqRob

newdata

A data frame with the required columns

na.action

A function determining what should be done with missing values in newdata. The default is to predict NA.

n

A vector of levels for a factor, or the number of levels.

contrasts

A logical indicating whether contrasts should be computed.

sparse

A logical indicating if the result should be sparse.

x

A factor vector.
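The naming options described for sep and levelsOnly can be pictured with a few lines of base R. This is only a sketch: dummy_names is a hypothetical helper written for this illustration, not a function of MSqRob, and it mimics the documented column naming rather than reproducing the package's internals.

```r
# Hypothetical helper (not part of MSqRob) mimicking the documented naming options
dummy_names <- function(var, lvls, sep = ".", levelsOnly = FALSE) {
  # levelsOnly = TRUE drops the variable name entirely
  if (levelsOnly) return(lvls)
  # sep = NULL means plain concatenation, as model.matrix does
  paste(var, lvls, sep = if (is.null(sep)) "" else sep)
}

n_default <- dummy_names("day", c("Mon", "Tue"))                     # "day.Mon" "day.Tue"
n_nosep   <- dummy_names("day", c("Mon", "Tue"), sep = NULL)         # "dayMon"  "dayTue"
n_levels  <- dummy_names("day", c("Mon", "Tue"), levelsOnly = TRUE)  # "Mon"     "Tue"
```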

Details

Most of the contrasts functions in R produce full rank parameterizations of the predictor data. For example, contr.treatment creates a reference cell in the data and defines dummy variables for all factor levels except those in the reference cell. If a factor with 5 levels is used alone in a model formula, contr.treatment creates columns for the intercept and for every factor level except the first. For the data in the Examples section below, this would produce:

  (Intercept) dayTue dayWed dayThu dayFri daySat daySun
1           1      1      0      0      0      0      0
2           1      1      0      0      0      0      0
3           1      1      0      0      0      0      0
4           1      0      0      1      0      0      0
5           1      0      0      1      0      0      0
6           1      0      0      0      0      0      0
7           1      0      1      0      0      0      0
8           1      0      1      0      0      0      0
9           1      0      0      0      0      0      0

In some situations, there may be a need for dummy variables for all the levels of the factor. For the same example:

  dayMon dayTue dayWed dayThu dayFri daySat daySun
1      0      1      0      0      0      0      0
2      0      1      0      0      0      0      0
3      0      1      0      0      0      0      0
4      0      0      0      1      0      0      0
5      0      0      0      1      0      0      0
6      1      0      0      0      0      0      0
7      0      0      1      0      0      0      0
8      0      0      1      0      0      0      0
9      1      0      0      0      0      0      0
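The difference between the two encodings above can be reproduced with base R alone. The sketch below does not call dummyVars_MSqRob; the toy factor day and the use of contrasts.arg are assumptions made for this illustration.

```r
# Toy factor with three levels (hypothetical data, not the 'when' data set)
d <- data.frame(day = factor(c("Mon", "Wed", "Fri", "Mon"),
                             levels = c("Mon", "Wed", "Fri")))

# Default treatment contrasts: intercept plus one column per non-reference level
full_rank <- model.matrix(~ day, data = d)

# Disable contrasts for 'day' so that every level gets its own column
all_levels <- model.matrix(~ day, data = d,
                           contrasts.arg = list(day = contrasts(d$day, contrasts = FALSE)))

colnames(full_rank)   # "(Intercept)" "dayWed" "dayFri"
colnames(all_levels)  # "(Intercept)" "dayMon" "dayWed" "dayFri"

# Four columns but rank 3: the level columns sum to the intercept column
qr(all_levels)$rank
```

The rank check shows why the second form is called "less than full rank": it has one more column than it has linearly independent directions.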

Given a formula and initial data set, the class dummyVars_MSqRob gathers all the information needed to produce a full set of dummy variables for any data set. It uses contr.ltfr_MSqRob as the base function to do this.
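The idea of gathering the level information once and re-using it for any data set can be sketched in base R. The helpers fit_levels and apply_levels are hypothetical names introduced here; they illustrate the principle and are not the implementation used by dummyVars_MSqRob.

```r
# Hypothetical helpers (not part of MSqRob): remember factor levels from the
# initial data, then re-apply them so new data yields the same columns.
fit_levels <- function(data) {
  is_fac <- vapply(data, is.factor, logical(1))
  lapply(data[is_fac], levels)
}

apply_levels <- function(lvls, newdata) {
  for (v in names(lvls)) {
    newdata[[v]] <- factor(newdata[[v]], levels = lvls[[v]])
  }
  # contrasts = FALSE keeps one indicator column per stored level
  ctr <- lapply(newdata[names(lvls)], contrasts, contrasts = FALSE)
  model.matrix(~ . - 1, data = newdata[names(lvls)], contrasts.arg = ctr)
}

train <- data.frame(day = factor(c("Mon", "Wed", "Fri")))
lvls  <- fit_levels(train)

# The new data contains only one of the three levels but still gets three columns
new_x <- apply_levels(lvls, data.frame(day = "Wed"))
colnames(new_x)
```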

class2ind is most useful for converting a factor outcome vector to a matrix of dummy variables.

Value

The output of dummyVars_MSqRob is a list of class 'dummyVars_MSqRob' with elements

call

the function call

form

the model formula

vars

names of all the variables in the model

facVars

names of all the factor variables in the model

lvls

levels of any factor variables

sep

NULL or a character separator

terms

the terms.formula object

levelsOnly

a logical

The predict function produces a data frame.

contr.ltfr_MSqRob generates a design matrix.
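Because contr.ltfr_MSqRob is described as a small modification of contr.treatment, the base R function gives a feel for what a contrast function returns. The sketch below uses contr.treatment only; that contr.ltfr_MSqRob returns the same matrices is an assumption that is not verified here.

```r
lv <- c("Mon", "Wed", "Fri")

# Reference-cell coding: 3 levels give 2 columns, the first level is the baseline
ref_cell <- contr.treatment(lv)

# With contrasts = FALSE: an identity matrix with one column per level
one_per_level <- contr.treatment(lv, contrasts = FALSE)

dim(ref_cell)       # 3 2
dim(one_per_level)  # 3 3
```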

Author(s)

contr.ltfr_MSqRob is a small modification of contr.treatment by Max Kuhn

References

https://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models

See Also

model.matrix, contrasts, formula

Examples



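## stringsAsFactors = TRUE is needed in R >= 4.0.0, where data.frame()
## keeps strings as character vectors and levels<- would not yield factors
when <- data.frame(time = c("afternoon", "night", "afternoon",
                            "morning", "morning", "morning",
                            "morning", "afternoon", "afternoon"),
                   day = c("Mon", "Mon", "Mon",
                           "Wed", "Wed", "Fri",
                           "Sat", "Sat", "Fri"),
                   stringsAsFactors = TRUE)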

levels(when$time) <- list(morning="morning",
                          afternoon="afternoon",
                          night="night")
levels(when$day) <- list(Mon="Mon", Tue="Tue", Wed="Wed", Thu="Thu",
                         Fri="Fri", Sat="Sat", Sun="Sun")

## Default behavior:
model.matrix(~day, when)

mainEffects <- dummyVars_MSqRob(~ day + time, data = when)
mainEffects
predict(mainEffects, when[1:3,])

when2 <- when
when2[1, 1] <- NA
predict(mainEffects, when2[1:3,])
predict(mainEffects, when2[1:3,], na.action = na.omit)


interactionModel <- dummyVars_MSqRob(~ day + time + day:time,
                              data = when,
                              sep = ".")
predict(interactionModel, when[1:3,])

noNames <- dummyVars_MSqRob(~ day + time + day:time,
                     data = when,
                     levelsOnly = TRUE)
predict(noNames, when)
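
The effect of the na.action argument in the predict calls above can be pictured with base R's model.frame. This sketch uses a toy data frame and base functions only; it does not show the internals of the predict method.

```r
# Toy data with one missing factor value (hypothetical, not the 'when' data)
d <- data.frame(day = factor(c(NA, "Mon", "Wed")))

# na.pass keeps the incomplete row, na.omit drops it
mf_pass <- model.frame(~ day, data = d, na.action = na.pass)
mf_omit <- model.frame(~ day, data = d, na.action = na.omit)

nrow(mf_pass)  # 3
nrow(mf_omit)  # 2

# A model frame can be handed to model.matrix directly; the row count follows
mm_pass <- model.matrix(~ day, data = mf_pass)
mm_omit <- model.matrix(~ day, data = mf_omit)
```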


statOmics/MSqRob documentation built on Dec. 8, 2022, 6 a.m.