View source: R/makeDesignMatrix.R
makeDesignMatrix | R Documentation |
Create a sparse model matrix which respects the basis functions of the original data on which it was created. Primarily for internal use but may be of some independent interest.
makeDesignMatrix(formula, origData, newData, test = TRUE, sparse = TRUE)
formula |
the formula describing the design matrix. Any responses will be deleted |
origData |
the original dataset as a dataframe |
newData |
a dataframe containing any of the variables in the formula. This will provide the data in the returned model matrix. |
test |
when set to TRUE, runs a test that the matrix was constructed correctly; see Details for more. |
sparse |
by default returns a sparse matrix using |
This function is designed for settings where we need to make a prediction using a model matrix. The practical challenge is ensuring that the representation of the new data lines up with the original representation. This is difficult for functions that produce a different representation depending on their inputs. A simple conceptual example is factor variables. If we fit our original model using a factor with levels c("A","B","C"), then when we make predictions for data having only levels c("A","C") we need to adjust for the missing level. Base R functions like predict.lm in stats handle this gracefully, and this function is essentially a version of predict.lm that only constructs the model matrix.
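The factor-level problem can be reproduced with base R alone. The following is a minimal sketch, not the package's implementation: the object names (orig, new, naive, aligned) are hypothetical, and the alignment uses the same stats tools that predict.lm relies on (model.frame with xlev and .getXlevels).

```r
# Illustrative data; all object names here are hypothetical.
orig <- data.frame(g = factor(c("A", "B", "C", "A", "B", "C")))
new  <- data.frame(g = factor(c("A", "C", "C")))

# Naive: build the matrix from the new data alone. Level "B" is unknown
# there, so the column layout no longer matches the original.
naive <- model.matrix(~ g, data = new)
colnames(naive)    # "(Intercept)" "gC"

# Aligned: carry the original factor levels over through xlev.
mf_orig <- model.frame(~ g, data = orig)
tt      <- attr(mf_orig, "terms")
mf_new  <- model.frame(tt, data = new, xlev = .getXlevels(tt, mf_orig))
aligned <- model.matrix(tt, mf_new)
colnames(aligned)  # "(Intercept)" "gB" "gC"
```

In the aligned matrix the gB column is present but all zero, so coefficients estimated on the original data still line up column by column.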
Beyond factors, the key use case is basis functions like splines. For a function like this to work it must either depend only on the observation it is transforming (e.g. log) or it must have methods for the predict and makepredictcall generics. The spline wrapper s has both and so should work.
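To illustrate why makepredictcall matters, here is a hedged sketch using ns from the splines package, which ships with R and has a makepredictcall method. It stands in for the package's own s wrapper, whose internals are not shown here; the data and object names are made up for the example.

```r
library(splines)  # ns() is a stand-in for the package's s() wrapper

set.seed(1)
orig <- data.frame(x = rnorm(30))
new  <- data.frame(x = orig$x[1:10])

# Naive: knots are recomputed from the ten new observations alone.
naive <- model.matrix(~ ns(x, df = 3), data = new)

# Aligned: the terms object stores a "predvars" attribute, filled in by
# makepredictcall, that records the knots chosen on the original data.
mf_orig <- model.frame(~ ns(x, df = 3), data = orig)
tt      <- attr(mf_orig, "terms")
aligned <- model.matrix(tt, model.frame(tt, data = new))

# Rows 1 to 10 of the original design matrix are the reference values.
reference <- model.matrix(tt, mf_orig)[1:10, ]
isTRUE(all.equal(aligned, reference, check.attributes = FALSE))  # TRUE
isTRUE(all.equal(naive,   reference, check.attributes = FALSE))  # FALSE
```

The naive matrix has the right shape but different values, which is the silent failure the test argument is meant to catch.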
When a function lacks these methods it will still produce a design matrix, but the values will be wrong. To catch these cases we run a quick test when test=TRUE, which is the default. The test splits the original data in half and checks that processing each half separately produces the same values as processing the complete original data.
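The split-half idea can be sketched in a few lines of base R. This is an assumption-laden illustration, not the test the function actually runs: buildAligned, splitHalfCheck and center are hypothetical names introduced only for this example.

```r
library(splines)

# Hypothetical helper: rebuild the design matrix for `new` from the terms
# learned on `orig`, in the style of predict.lm.
buildAligned <- function(formula, orig, new) {
  mf_orig <- model.frame(formula, data = orig)
  tt      <- attr(mf_orig, "terms")
  mf_new  <- model.frame(tt, data = new, xlev = .getXlevels(tt, mf_orig))
  model.matrix(tt, mf_new)
}

# Hypothetical check: each half, processed separately, must reproduce the
# corresponding rows of the matrix built from the complete data.
splitHalfCheck <- function(formula, orig) {
  half   <- seq_len(floor(nrow(orig) / 2))
  full   <- buildAligned(formula, orig, orig)
  first  <- buildAligned(formula, orig, orig[half, , drop = FALSE])
  second <- buildAligned(formula, orig, orig[-half, , drop = FALSE])
  isTRUE(all.equal(rbind(first, second), full, check.attributes = FALSE))
}

# A transformation that depends on the whole sample and has no
# predict or makepredictcall method (hypothetical).
center <- function(x) x - mean(x)

set.seed(2)
orig <- data.frame(x = runif(40, 1, 10))
splitHalfCheck(~ log(x), orig)          # TRUE: depends only on each observation
splitHalfCheck(~ ns(x, df = 3), orig)   # TRUE: ns has a makepredictcall method
splitHalfCheck(~ center(x), orig)       # FALSE: the mean is recomputed per half
```

A failed check does not say which term is at fault, only that some transformation in the formula is being recomputed from the data it is handed.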
fitNewDocuments
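library(stm)  # package providing makeDesignMatrix and the spline wrapper s
# original data: a three-level factor and a numeric predictor
foo <- data.frame(response=rnorm(30),
                  predictor=as.factor(rep(c("A","B","C"),10)),
                  predictor2=rnorm(30))
# new data: level "B" is absent from the factor
foo.new <- data.frame(predictor=as.factor(c("A","C","C")),
                      predictor2=foo$predictor2[1:3])
makeDesignMatrix(~predictor + s(predictor2), foo, foo.new)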