| expandCategorical | R Documentation |
Expands the rows of a data frame by re-expressing observations of a
categorical variable specified by catvar, such that the
column(s) corresponding to catvar are replaced by a factor
specifying the possible categories for each observation and a vector
of 0/1 counts over these categories.
expandCategorical(data, catvar, sep = ".", countvar = "count",
idvar = "id", as.ordered = FALSE, group = TRUE)
data |
a data frame. |
catvar |
a character vector specifying factors in data. |
sep |
a character string used to separate the concatenated
values of catvar. |
countvar |
(optional) a character string to be used for the name of the new count variable. |
idvar |
(optional) a character string to be used for the name of the new factor identifying the original rows (cases). |
as.ordered |
logical - whether the new interaction factor should
be of class "ordered". |
group |
logical: whether or not to group individuals with common values over all covariates. |
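As a rough illustration of how catvar, sep and as.ordered interact, the sketch below uses only base R's interaction() and as.ordered() on two hypothetical factors, f1 and f2. It is an assumption that the level names produced by expandCategorical are formed in the same way; the sketch only shows the naming convention the arguments describe.

```r
# Hypothetical factors standing in for two columns named in catvar.
f1 <- factor(c("lo", "hi", "lo"), levels = c("lo", "hi"))
f2 <- factor(c("x", "y", "y"))

# Level names are the concatenated values, joined by the separator.
combined <- interaction(f1, f2, sep = "_")
levels(combined)

# Analogue of as.ordered = TRUE: same levels, but class "ordered".
combinedOrdered <- as.ordered(combined)
class(combinedOrdered)
```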
Each row of the data frame is replicated c times, where c
is the number of levels of the interaction of the factors specified by
catvar. In the expanded data frame, the columns specified by
catvar are replaced by a factor specifying the c possible
categories for each case, named by the concatenated values of
catvar separated by sep. The ordering of factor levels
will be preserved in the creation of the new factor, but this factor
will not be of class "ordered" unless the argument
as.ordered = TRUE. A variable with name countvar is added
to the data frame which is equal to 1 for the observed category in each
case and 0 elsewhere. Finally, a factor with name idvar is added
to index the cases.
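The expansion described above can be sketched by hand in base R. This is not the package's implementation: the data frame toy and its columns are hypothetical, the column order of the real output may differ, and the sketch keeps every case separate (roughly the group = FALSE situation).

```r
# Hypothetical toy data: one categorical variable and one covariate.
toy <- data.frame(
  pain = factor(c("same", "better", "worse"),
                levels = c("worse", "same", "better")),
  x1 = c(1, 0, 1)
)

lev <- levels(toy$pain)
nlev <- length(lev)                              # c, the number of categories
rows <- rep(seq_len(nrow(toy)), each = nlev)     # replicate each row c times

long <- data.frame(
  x1 = toy$x1[rows],
  # the categorical column now lists every possible category per case
  pain = factor(rep(lev, times = nrow(toy)), levels = lev),
  id = factor(rows)                              # indexes the original cases
)
# 1 for the observed category in each case, 0 elsewhere
long$count <- as.integer(as.character(long$pain) ==
                         as.character(toy$pain)[rows])
long
```

Each case contributes exactly one row with count equal to 1, so the counts sum to 1 within every level of id.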
The expanded data frame as described in Details.
Re-expressing categorical data in this way allows a multinomial response to be modelled as a Poisson response; see Examples.
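A minimal base R check of this equivalence, for the two-category case, is sketched below. The data frame wide is hypothetical, the expansion is done by hand rather than with expandCategorical, and glm stands in for gnm, so the case parameters are estimated explicitly instead of being eliminated.

```r
# Hypothetical toy data: binary response y, numeric covariate x.
wide <- data.frame(y = factor(c(0, 0, 1, 0, 0, 1, 1, 0, 1, 1)),
                   x = 1:10)

# Direct fit: logistic regression (two-category multinomial model).
logit <- glm(y ~ x, family = binomial, data = wide)

# Hand-rolled expansion: one row per case and category, with 0/1 counts.
lev <- levels(wide$y)
rows <- rep(seq_len(nrow(wide)), each = length(lev))
long <- data.frame(
  x = wide$x[rows],
  y = factor(rep(lev, times = nrow(wide)), levels = lev),
  id = factor(rows)
)
long$count <- as.integer(as.character(long$y) == as.character(wide$y)[rows])

# Poisson fit with one nuisance parameter per case. The main effect of x
# is aliased with id (its coefficient is NA), which leaves y1:x as the
# contrast with the reference category y = 0.
pois <- glm(count ~ id + y * x, family = poisson, data = long)

# The slope and the deviance should agree between the two fits.
coef(logit)[["x"]]
coef(pois)[["y1:x"]]
deviance(logit)
deviance(pois)
```

Including the inestimable main effect of x mirrors the approach taken in the birthwt example, where the main effects are kept so that interactions with the reference category are set to zero automatically.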
Heather Turner
Anderson, J. A. (1984) Regression and Ordered Categorical Variables. J. R. Statist. Soc. B, 46(1), 1-30.
gnm, multinom,
reshape
### Example from help(multinom, package = "nnet")
library(gnm)  # provides expandCategorical(), gnm() and the backPain data
library(MASS)
example(birthwt)
library(nnet)
bwt.mu <- multinom(low ~ ., data = bwt)
## Equivalent using gnm - include unestimable main effects in model so
## that interactions with low0 automatically set to zero, else could use
## 'constrain' argument.
bwtLong <- expandCategorical(bwt, "low", group = FALSE)
bwt.po <- gnm(count ~ low*(. - id), eliminate = id, data = bwtLong,
              family = "poisson")
summary(bwt.po) # same deviance; df reflect extra id parameters
### Example from ?backPain
set.seed(1)
summary(backPain)
backPainLong <- expandCategorical(backPain, "pain")
## Fit models described in Table 5 of Anderson (1984)
noRelationship <- gnm(count ~ pain, eliminate = id,
                      family = "poisson", data = backPainLong)
oneDimensional <- update(noRelationship,
                         ~ . + Mult(pain, x1 + x2 + x3))