make.dummies: Create a Dummy Matrix

View source: R/tool_misc.R

make.dummiesR Documentation

Create a Dummy Matrix

Description

Contrast-coded dummy matrix (treatment coding) created from a factor

Usage

make.dummies(x, ...)

## Default S3 method:
make.dummies(x, base = 1L, base.add = TRUE, ...)

## S3 method for class 'data.frame'
make.dummies(x, col, base = 1L, base.add = TRUE, ...)

## S3 method for class 'pdata.frame'
make.dummies(x, col, base = 1L, base.add = TRUE, ...)

Arguments

x

a factor from which the dummies are created (x is coerced to factor if not yet a factor) for the default method or a data data frame/pdata.frame for the respective method.

...

further arguments.

base

integer or character, specifies the reference level (base), if integer it refers to position in levels(x), if character the name of a level,

base.add

logical, if TRUE the reference level (base) is added to the return value as first column, if FALSE the reference level is not included.

col

character (only for the data frame and pdata.frame methods), to specify the column which is used to derive the dummies from,

Details

This function creates a matrix of dummies from the levels of a factor in treatment coding. In model estimations, it is usually preferable to not create the dummy matrix prior to estimation but to simply specify a factor in the formula and let the estimation function handle the creation of the dummies.

This function is merely a convenience wrapper around stats::contr.treatment to ease the dummy matrix creation process shall the dummy matrix be explicitly required. See Examples for a use case in LSDV (least squares dummy variable) model estimation.

The default method uses a factor as main input (or something coercible to a factor) to derive the dummy matrix from. Methods for data frame and pdata.frame are available as well and have the additional argument col to specify the the column from which the dummies are created; both methods merge the dummy matrix to the data frame/pdata.frame yielding a ready-to-use data set. See also Examples for use cases.

Value

For the default method, a matrix containing the contrast-coded dummies (treatment coding), dimensions are n x n where n = length(levels(x)) if argument base.add = TRUE or n = length(levels(x)-1) if base.add = FALSE; for the data frame and pdata.frame method, a data frame or pdata.frame, respectively, with the dummies appropriately merged to the input as last columns (column names are derived from the name of the column used to create the dummies and its levels).

Author(s)

Kevin Tappe

See Also

stats::contr.treatment(), stats::contrasts()

Examples

library(plm)
data("Grunfeld", package = "plm")
Grunfeld <- Grunfeld[1:100, ] # reduce data set (down to 5 firms)

## default method
make.dummies(Grunfeld$firm) # gives 5 x 5 matrix (5 firms, base level incl.)
make.dummies(Grunfeld$firm, base = 2L, base.add = FALSE) # gives 5 x 4 matrix

## data frame method
Grun.dummies <- make.dummies(Grunfeld, col = "firm")

## pdata.frame method
pGrun <- pdata.frame(Grunfeld)
pGrun.dummies <- make.dummies(pGrun, col = "firm")

## Model estimation:
## estimate within model (individual/firm effects) and LSDV models (firm dummies)
# within model:
plm(inv ~ value + capital, data = pGrun, model = "within")

## LSDV with user-created dummies by make.dummies:
form_dummies <- paste0("firm", c(1:5), collapse = "+")
form_dummies <- formula(paste0("inv ~ value + capital + ", form_dummies))
plm(form_dummies, data = pGrun.dummies, model = "pooling") # last dummy is dropped

# LSDV via factor(year) -> let estimation function generate dummies:
plm(inv ~ value + capital + factor(firm), data = pGrun, model = "pooling")

ycroissant/plm documentation built on July 8, 2024, 3:59 a.m.