predict.brokenstick | R Documentation |
brokenstick model
The predictions from a broken stick model coincide with the
group-conditional means of the random effects. This function takes
an object of class brokenstick and returns predictions in one of
several formats. The user can calculate predictions for new persons,
i.e., for persons who are not part of the fitted model, through the
x and y arguments.
## S3 method for class 'brokenstick'
predict(
  object,
  newdata = NULL,
  ...,
  x = NULL,
  y = NULL,
  group = NULL,
  hide = c("right", "left", "boundary", "internal", "none"),
  shape = c("long", "wide", "vector"),
  include_data = TRUE,
  strip_data = TRUE,
  whatknots = "all"
)
object |
A |
newdata |
Optional. A data frame in which to look for variables with
which to predict. The training data are used if omitted and
if |
... |
Not used, but required for extensibility. |
x |
Optional. A numeric vector with values of the predictor. It could
also be the special keyword |
y |
Optional. A numeric vector with measurements. |
group |
A vector with group identifications |
hide |
Should output for knots be hidden in get, print, summary and plot
functions? Can be |
shape |
A string: |
include_data |
A logical indicating whether the observed data
from |
strip_data |
Deprecated. Use |
whatknots |
Deprecated. Use |
The function predict() calculates predictions for every row in
newdata. If the user specifies no newdata argument, then the
function sets newdata equal to the training data (object$data if
object$light is FALSE). For a light object without a newdata
argument, the function throws the warning
"Argument 'newdata' is required for a light brokenstick object." and
returns NULL.
It is possible to tailor the behaviour of predict() through the
x, y and group arguments. What exactly happens depends on
which of these arguments is specified:
1. If the user specifies x, but no y and group, the function
returns, for every group in newdata, predictions at the
specified x values. This method will use the data from newdata.
2. If the user specifies x and y but no group, the function
forms a hypothetical new group with the x and y values. This
method uses no information from newdata, and also works for
a light brokenstick object.
3. If the user specifies group, but no x or y, the function
searches for the relevant data in newdata and limits its
predictions to those groups. This is useful if the user needs
a prediction for only one or a few groups. This does not work for
a light brokenstick object.
4. If the user specifies x and group, but no y, the function
will create new values for x in each group, search for the relevant
data in newdata and provide predictions at values of x in those
groups.
5. If the user specifies x, y and group, the function
assumes that these vectors contain additional data on top of what is
already available in newdata. The lengths of x, y and group
must match. For a light brokenstick object, case 5 effectively
becomes case 6. See below.
6. As case 5, but now without newdata available. All data are
specified through x, y and group and form a data frame.
Matching to newdata is attempted, but as long as the group ids
differ from those in the training sample, effectively new cases
will be made.
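The case numbers above depend only on which of x, y and group are supplied, and on whether earlier data are available (through newdata or the stored training data). The mapping can be sketched in base R as follows. The helper which_case() and its has_data argument are hypothetical names introduced for illustration; they are not part of the brokenstick package and say nothing about how predict() is implemented internally. The value 0 for "no x, y or group" is likewise an arbitrary label.

```r
# Hypothetical helper (NOT part of brokenstick): maps the combination of
# supplied arguments to the case numbers used on this page.
# has_data: TRUE if newdata or stored training data are available.
which_case <- function(x = NULL, y = NULL, group = NULL, has_data = TRUE) {
  has_x <- !is.null(x)
  has_y <- !is.null(y)
  has_g <- !is.null(group)

  if (has_x && has_y && has_g) {
    # Cases 5 and 6: lengths of x, y and group must match
    stopifnot(length(x) == length(y), length(y) == length(group))
    return(if (has_data) 5L else 6L)
  }
  if (has_x && has_g) return(4L)   # x and group, no y
  if (has_g && !has_y) return(3L)  # group only
  if (has_x && has_y) return(2L)   # x and y, no group
  if (has_x) return(1L)            # x only
  if (!has_y && !has_g) return(0L) # nothing supplied: standard prediction
  NA_integer_                      # combinations not described on this page
}

which_case(x = c(0.5, 1, 1.5))
which_case(x = 0.9, y = 1, group = 999, has_data = FALSE)
```

Combinations that the list does not describe, such as y without x, are returned as NA in this sketch rather than guessed at.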
If shape == "long", a long data.frame of predictions. If x, y and group
are not specified, the number of rows in the data frame is guaranteed to
be the same as the number of rows in newdata.
If shape == "wide", a wide data.frame of predictions, one record per group.
Note that this format could be inefficient if observation times vary
between subjects.
If shape == "vector", a vector of predicted values, of all x-values and groups.
If the function finds no data, it throws a warning and returns NULL.
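The remark about the wide format can be illustrated with base R alone. The sketch below uses made-up toy values (not output from a brokenstick model) and stats::reshape(); it only shows the layout problem and is not a description of how predict() builds its wide output.

```r
# Toy long-format table: two groups with non-overlapping observation ages.
# The values are invented for illustration only.
long <- data.frame(
  id   = c(1, 1, 2, 2),
  age  = c(0.1, 0.5, 0.2, 0.7),
  pred = c(0.3, 0.4, -0.1, 0.0)
)

# One row per group, one column per distinct age
wide <- reshape(long, idvar = "id", timevar = "age", direction = "wide")

dim(wide)         # 2 rows, 5 columns (id plus one column per distinct age)
sum(is.na(wide))  # 4 cells are empty because the ages do not overlap
```

When every subject is evaluated at the same x values, for example at the knots, each column is fully filled and the wide shape stays compact.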
library("dplyr")
# -- Data
train <- smocc_200[1:1198, ]
test <- smocc_200[1199:1940, ]
## Not run:
# -- Fit model
fit <- brokenstick(hgt_z ~ age | id, data = train, knots = 0:2, seed = 1)
fit_light <- brokenstick(hgt_z ~ age | id,
  data = train, knots = 0:2,
  light = TRUE, seed = 1
)
# -- Predict, standard cases
# Use train data, return column with predictions
pred <- predict(fit)
identical(nrow(train), nrow(pred))
# Predict without newdata, not possible for light object
predict(fit_light)
# Use test data
pred <- predict(fit, newdata = test)
identical(nrow(test), nrow(pred))
# Predict, same but using newdata with the light object
pred_light <- predict(fit_light, newdata = test)
identical(pred, pred_light)
# -- Predict, special cases
# -- Case 1: x, -y, -group
# Case 1: x as "knots", standard estimates, train sample (n = 124)
z <- predict(fit, x = "knots", shape = "wide")
head(z, 3)
# Case 1: x as values, linearly interpolated, train sample (n = 124)
z <- predict(fit, x = c(0.5, 1, 1.5), shape = "wide", include_data = FALSE)
head(z, 3)
# Case 1: x as values, linearly interpolated, test sample (n = 76)
z <- predict(fit, test, x = c(0.5, 1, 1.5), shape = "wide", include_data = FALSE)
head(z, 3)
# Case 1: x, not possible for light object
z <- predict(fit_light, x = "knots")
# -- Case 2: x, y, -group
# Case 2: form one new group with id = 0
predict(fit, x = "knots", y = c(1, 1, 0.5, 0), shape = "wide")
# Case 2: works also for a light object
predict(fit_light, x = "knots", y = c(1, 1, 0.5, 0), shape = "wide")
# -- Case 3: -x, -y, group
# Case 3: Predict at observed age for subset of groups, training sample
pred <- predict(fit, group = c(10001, 10005, 10022))
head(pred, 3)
# Case 3: Of course, we cannot do this for light objects
pred_light <- predict(fit_light, group = c(10001, 10005, 10022))
# Case 3: We can use another sample. Note there is no child 999
pred <- predict(fit, test, group = c(11045, 11120, 999))
tail(pred, 3)
# Case 3: Works also for a light object
pred_light <- predict(fit_light, test, group = c(11045, 11120, 999))
identical(pred, pred_light)
# -- Case 4: x, -y, group
# Case 4: Predict at specified x, only in selected groups, train sample
pred <- predict(fit, x = c(0.5, 1, 1.25), group = c(10001, 10005, 10022),
  include_data = FALSE)
pred
# Case 4: Same, but include observed data and sort
pred_all <- predict(fit,
  x = c(0.5, 1, 1.25), group = c(10001, 10005, 10022)) %>%
  dplyr::arrange(id, age)
# Case 4: Applies also to test sample
pred <- predict(fit, test, x = c(0.5, 1, 1.25), group = c(11045, 11120, 999),
  include_data = FALSE)
pred
# Case 4: Works also with light object
pred_light <- predict(fit_light, test, x = c(0.5, 1, 1.25),
  group = c(11045, 11120, 999), include_data = FALSE)
identical(pred_light, pred)
# -- Case 5: x, y, group
# Case 5: Add new data to the training sample, and refresh the broken
# stick estimate at age x.
# Note that novel child 999 (not in train) has one data point
predict(fit,
  x = c(0.9, 0.9, 0.9), y = c(1, 1, 1),
  group = c(10001, 10005, 999), include_data = FALSE)
# Case 5: Same, but now for test sample. Novel child 899 has two data points
predict(fit, test,
  x = c(0.5, 0.9, 0.6, 0.9),
  y = c(0, 0.5, 0.5, 0.6), group = c(11045, 11120, 899, 899),
  include_data = FALSE)
# Case 5: Also works for light object
predict(fit_light, test,
  x = c(0.5, 0.9, 0.6, 0.9),
  y = c(0, 0.5, 0.5, 0.6), group = c(11045, 11120, 899, 899),
  include_data = FALSE)
# -- Case 6: As Case 5, but without previous data
# Case 6: Same call as last, but now without newdata = test
# All children are de facto novel as they do not occur in the training
# or test samples.
# Note: Predictions for 11045 and 11120 differ from prediction in Case 5.
predict(fit,
  x = c(0.5, 0.9, 0.6, 0.9),
  y = c(0, 0.5, 0.5, 0.6), group = c(11045, 11120, 899, 899))
# This also works for the light brokenstick object
predict(fit_light,
  x = c(0.5, 0.9, 0.6, 0.9),
  y = c(0, 0.5, 0.5, 0.6), group = c(11045, 11120, 899, 899))
## End(Not run)