Base-learners for fitting base-models in the generic implementation of
component-wise gradient boosting in function mboost.
## linear base-learner
bols(..., by = NULL, index = NULL, intercept = TRUE, df = NULL,
     lambda = 0, contrasts.arg = "contr.treatment")
## smooth P-spline base-learner
bbs(..., by = NULL, index = NULL, knots = 20, boundary.knots = NULL,
    degree = 3, differences = 2, df = 4, lambda = NULL, center = FALSE,
    cyclic = FALSE, constraint = c("none", "increasing", "decreasing"),
    deriv = 0)
## bivariate P-spline base-learner
bspatial(..., df = 6)
## radial basis functions base-learner
brad(..., by = NULL, index = NULL, knots = 100, df = 4, lambda = NULL,
     covFun = fields::stationary.cov,
     args = list(Covariance = "Matern", smoothness = 1.5, theta = NULL))
## (genetic) pathway-based kernel base-learner
bkernel(..., df = 4, lambda = NULL, kernel = c("lin", "sia", "net"),
        pathway = NULL, knots = NULL, args = list())
## random effects base-learner
brandom(..., by = NULL, index = NULL, df = 4, lambda = NULL,
        contrasts.arg = "contr.dummy")
## tree based base-learner
btree(..., by = NULL, nmax = Inf, tree_controls = partykit::ctree_control(
    teststat = "quad", testtype = "Teststatistic",
    mincriterion = 0, minsplit = 10, minbucket = 4,
    maxdepth = 1, saveinfo = FALSE))
## constrained effects base-learner
bmono(...,
      constraint = c("increasing", "decreasing", "convex", "concave",
                     "none", "positive", "negative"),
      type = c("quad.prog", "iterative"),
      by = NULL, index = NULL, knots = 20, boundary.knots = NULL,
      degree = 3, differences = 2, df = 4, lambda = NULL,
      lambda2 = 1e6, niter = 10, intercept = TRUE,
      contrasts.arg = "contr.treatment",
      boundary.constraints = FALSE,
      cons.arg = list(lambda = 1e+06, n = NULL, diff_order = NULL))
## Markov random field base-learner
bmrf(..., by = NULL, index = NULL, bnd = NULL, df = 4, lambda = NULL,
     center = FALSE)
## user-specified base-learner
buser(X, K = NULL, by = NULL, index = NULL, df = 4, lambda = NULL)
## combining single base-learners to form new,
## more complex base-learners
bl1 %+% bl2
bl1 %X% bl2
bl1 %O% bl2

... 
one or more predictor variables or one matrix or data
frame of predictor variables. For smooth base-learners,
the number of predictor variables and the number of
columns in the data frame / matrix must be less than or
equal to 2. If a matrix (with at least 2 columns) is
given to 
by 
an optional variable defining varying coefficients,
either a factor or numeric variable.
If 
index 
a vector of integers for expanding the variables in

df 
trace of the hat matrix for the base-learner defining the
base-learner complexity. Low values of 
lambda 
smoothing penalty, computed from 
knots 
either the number of knots or a vector of the positions
of the interior knots (for more details see below). For multiple
predictor variables, 
boundary.knots 
boundary points at which to anchor the B-spline basis
(default the range of the data). A vector (of length 2)
for the lower and the upper boundary knot can be specified. This is
only advised for 
degree 
degree of the regression spline. 
differences 
a non-negative integer, typically 1, 2 or 3. If 
intercept 
if 
center 
if 
cyclic 
if 
covFun 
the covariance function (i.e. radial basis)
needed to compute the basis functions. Per
default 
args 
a named list of arguments to be passed to

kernel 
one of 
pathway 
name of pathway; the pathway needs to be contained in the GWAS data set. 
contrasts.arg 
a named list of characters suitable for input to
the 
nmax 
integer, maximal number of
bins in the predictor variables. Use 
tree_controls 
an object of class 
constraint 
type of constraint to be used. For 
type 
determines how the constrained least squares problem should be
solved. If 
lambda2 
penalty parameter for the (monotonicity) constraint. 
niter 
maximum number of iterations used to compute constraint estimates. Increase this number if a warning is displayed. 
boundary.constraints 
a logical indicating whether additional constraints on the boundaries of the spline should be applied (default: FALSE). This is still experimental. 
cons.arg 
a named list with additional arguments for boundary
constraints. The element 
bnd 
Object of class 
X 
design matrix as it should be used in the penalized least
squares estimation. Effect modifiers do not need to be included here
( 
K 
penalty matrix as it should be used in the penalized least
squares estimation. If 
deriv 
an integer; the derivative of the spline of the given order at the data is computed, defaults to zero. Note that this argument is only used to set up the design matrix and cannot be used in the fitting process. 
bl1 
a linear base-learner or a list of linear base-learners. 
bl2 
a linear base-learner or a list of linear base-learners. 
bols refers to linear base-learners (potentially estimated with a ridge
penalty), while bbs provides penalized regression splines. bspatial fits
bivariate surfaces and brandom defines random effects base-learners. In
combination with option by, these base-learners can be turned into
varying coefficient terms. The linear base-learners are fitted using
ridge regression where the penalty parameter lambda is either computed
from df (default for bbs, bspatial, and brandom) or specified directly
(lambda = 0 means no penalization as default for bols).
In bols(x), x may be a numeric vector or factor. Alternatively, x can be
a data frame containing numeric or factor variables. In this case, or
when multiple predictor variables are specified, e.g., using
bols(x1, x2), the model is equivalent to lm(y ~ ., data = x) or
lm(y ~ x1 + x2), respectively. By default, an intercept term is added to
the corresponding design matrix (which can be omitted using
intercept = FALSE). It is strongly advised to (mean) center continuous
covariates if no intercept is used in bols (see Hofner et al., 2011a).
If x is a matrix, it is directly used as the design matrix and no
further preprocessing (such as addition of an intercept) is conducted.
When df (or lambda) is given, a ridge estimator with df degrees of
freedom (see section 'Global Options') is used as base-learner. Note
that all variables are treated as a group, i.e., they enter the model
together if the corresponding base-learner is selected. For ordinal
variables, a ridge penalty for the differences of the adjacent
categories (Gertheiss and Tutz 2009, Hofner et al. 2011a) is applied.
With bbs, the P-spline approach of Eilers and Marx (1996) is used.
P-splines use a squared k-th-order difference penalty which can be
interpreted as an approximation of the integrated squared k-th
derivative of the spline. In bbs the argument knots specifies either the
number of (equidistant) interior knots to be used for the regression
spline fit or a vector including the positions of the interior knots.
Additionally, boundary.knots can be specified. However, this is only
advised if one uses cyclic constraints, where the boundary.knots specify
the points where the function is joined (e.g.,
boundary.knots = c(0, 2 * pi) for angles as in a sine function or
boundary.knots = c(0, 24) for hours during the day). For details on
cyclic splines in the context of boosting see Hofner et al. (2016).
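The difference penalty described above can be illustrated with a minimal
base R sketch. It does not use mboost internals: the number of
coefficients p, the two coefficient vectors and the helper pen() are
hypothetical and chosen only for illustration.

```r
## Sketch of a k-th order difference penalty K = D'D (base R only;
## an illustration of the idea, not the code used inside bbs).
p <- 8                                  # hypothetical number of basis coefficients
k <- 2                                  # order of the difference penalty
D <- diff(diag(p), differences = k)     # (p - k) x p difference matrix
K <- crossprod(D)                       # penalty matrix K = D'D

beta_linear <- 1 + 0.5 * seq_len(p)     # coefficients lying on a straight line
beta_wiggly <- c(0, 2, -1, 3, 0, 2, -2, 1)

## quadratic penalty beta' K beta
pen <- function(beta) drop(t(beta) %*% K %*% beta)

pen(beta_linear)   # (numerically) zero: second differences of a line vanish
pen(beta_wiggly)   # strictly positive
```

With differences = 2, coefficient sequences that are linear in their
index are not penalized, which is the unpenalized part referred to as
the null space further below.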
bspatial implements bivariate tensor product P-splines for the
estimation of either spatial effects or interaction surfaces. Note that
bspatial(x, y) is equivalent to bbs(x, y, df = 6). For possible
arguments and defaults see there. The penalty term is constructed based
on bivariate extensions of the univariate penalties in x and y
directions, see Kneib, Hothorn and Tutz (2009) for details. Note that
the dimensions of the penalty matrix increase (quickly) with the number
of knots, with strong impact on computational time. Thus, both should
not be chosen too large. Different knots for x and y can be specified by
a named list.
brandom(x) specifies a random effects base-learner based on a factor
variable x that defines the grouping structure of the data set. For each
level of x, a separate random intercept is fitted, where the random
effects variance is governed by the specification of the degrees of
freedom df or lambda (see section 'Global Options'). Note that
brandom(...) is essentially a wrapper to
bols(..., df = 4, contrasts.arg = "contr.dummy"), i.e., a wrapper that
utilizes ridge-penalized categorical effects. For possible arguments and
defaults see bols.
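How a ridge penalty on dummy-coded categorical effects shrinks group
effects can be sketched in base R. The sketch assumes a design with one
indicator column per level and an identity penalty matrix; the data, the
group sizes and the value of lambda are hypothetical, and the code is
not taken from brandom itself.

```r
## Ridge-penalized dummy-coded effects (base R sketch under the
## assumptions stated above; all data are simulated for illustration).
set.seed(1)
id <- factor(rep(1:4, times = c(2, 5, 10, 20)))       # unbalanced groups
y  <- rnorm(length(id), mean = c(-2, -1, 1, 2)[id])

X      <- model.matrix(~ id - 1)     # one indicator column per level
K      <- diag(ncol(X))              # identity penalty (assumption)
lambda <- 3                          # hypothetical smoothing parameter

## penalized least squares estimate (X'X + lambda K)^(-1) X'y
beta_ridge <- drop(solve(crossprod(X) + lambda * K, crossprod(X, y)))

## closed form: each group mean is shrunken by n_j / (n_j + lambda)
n_j        <- as.vector(table(id))
group_mean <- as.vector(tapply(y, id, mean))
shrink     <- n_j / (n_j + lambda)

all.equal(unname(beta_ridge), shrink * group_mean)
```

Small groups are shrunken more strongly towards zero than large ones,
which mirrors the usual behaviour of random intercepts.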
For all linear base-learners the amount of smoothing is determined by
the trace of the hat matrix, as indicated by df.
If by is specified as an additional argument, a varying coefficients
term is estimated, where by is the interaction variable and the effect
modifier is given by either x or x and y (specified via ...). If bbs is
used, this corresponds to the classical situation of varying
coefficients, where the effect of by varies over the codomain of x. In
case of bspatial as base-learner, the effect of by varies with respect
to both x and y, i.e., an interaction surface between x and y is
specified as effect modifier. For brandom, specification of by leads to
the estimation of random slopes for covariate by with grouping structure
defined by factor x instead of a simple random intercept. In bbs,
bspatial and brandom the computation of the smoothing parameter lambda
for given df, or vice versa, might become (numerically) unstable if the
values of the interaction variable by become too large. In this case, we
recommend rescaling the interaction covariate, e.g., by dividing by
max(abs(by)). If bbs or bspatial is specified with a factor variable by
with more than two levels, the degrees of freedom are shared for the
complete base-learner (i.e., spread over all factor levels). Note that
the null space (see next paragraph) increases, as a separate null space
for each factor level is present. Thus, the minimum degrees of freedom
increase with increasing number of levels of by (if center = FALSE).
For bbs and bspatial, option center != FALSE requests that the fitted
effect is centered around its parametric, unpenalized part (the
so-called null space). For example, with second order difference
penalty, a linear effect of x remains unpenalized by bbs and therefore
the degrees of freedom for the base-learner have to be larger than two.
To avoid this restriction, option center = TRUE subtracts the
unpenalized linear effect from the fit, allowing any positive number to
be specified as df. Note that in this case the linear effect x should
generally be specified as an additional base-learner bols(x). For
bspatial and, for example, second order differences, a linear effect of
x (bols(x)), a linear effect of y (bols(y)), and their interaction
(bols(x*y)) are subtracted from the effect and have to be added
separately to the model equation. More details on centering can be found
in Kneib, Hothorn and Tutz (2009) and Fahrmeir, Kneib and Lang (2004).
We strongly recommend consulting the latter reference before using this
option.
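Why the degrees of freedom cannot drop to two or below without centering
can be seen in a small base R sketch. It uses splines::bs (with its
default quantile-based knots, not the equidistant knots used by bbs) and
a second order difference penalty; the data, the basis size and the grid
of lambda values are hypothetical.

```r
## Degrees of freedom approach the dimension of the null space (two for
## a second order difference penalty) as lambda grows. Illustration only.
library(splines)

set.seed(3)
x <- sort(runif(100))
X <- bs(x, df = 12, intercept = TRUE)        # B-spline design, 12 columns
D <- diff(diag(ncol(X)), differences = 2)    # second order differences
K <- crossprod(D)                            # penalty matrix

## df(lambda) = trace(2S - S'S) with S = X (X'X + lambda K)^(-1) X'
df_lambda <- function(lambda) {
  S <- X %*% solve(crossprod(X) + lambda * K, t(X))
  sum(diag(2 * S - crossprod(S)))
}

lambdas   <- c(0, 1, 100, 1e8)
df_values <- sapply(lambdas, df_lambda)
round(df_values, 3)   # starts at 12 and decreases towards 2
```

However large lambda is chosen, the unpenalized linear part keeps
contributing two degrees of freedom in this sketch.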
brad(x) specifies penalized radial basis functions as used in Kriging.
If knots is used to specify the number of knots, the function
cover.design is used to specify the location of the knots such that they
minimize a geometric space-filling criterion. Furthermore, knots can be
specified directly via a matrix. The argument covFun allows the radial
basis functions to be specified. Per default, the flexible Matern
correlation function is used. This is specified using
covFun = stationary.cov with Covariance = "Matern" specified via args.
If an effective range theta is applicable for the correlation function
(e.g., the Matern family) the user can specify this value. Per default
(if theta = NULL) the effective range is chosen as
θ = max(x_i - x_j)/c such that the correlation function
ρ(c; theta = 1) = epsilon, where epsilon = 0.001.
bmrf builds a base of a Markov random field consisting of several
regions with a neighborhood structure. The input variable is the
observed region. The penalty matrix is either constructed from a
boundary object or must be given directly via the option bnd. In that
case the dimnames of the matrix have to be the region names, on the
diagonal the number of neighbors has to be given for each region, and
for each neighborhood relation the value in the matrix has to be -1,
else 0. With a boundary object at hand, the fitted or predicted values
can be directly plotted into the map using drawmap.
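A penalty matrix of the form described above can be assembled by hand in
base R. The sketch assumes the usual Markov random field coding (number
of neighbors on the diagonal, -1 for each pair of neighboring regions);
the four regions and their neighborhood list are hypothetical.

```r
## Hand-built neighborhood penalty matrix for four hypothetical regions.
neighbours <- list(A = c("B", "C"),
                   B = c("A", "C"),
                   C = c("A", "B", "D"),
                   D = "C")
regions <- names(neighbours)

K <- matrix(0, nrow = length(regions), ncol = length(regions),
            dimnames = list(regions, regions))   # dimnames = region names
for (r in regions) {
  K[r, neighbours[[r]]] <- -1                    # -1 for each neighbor
  K[r, r] <- length(neighbours[[r]])             # number of neighbors
}
K
```

Each row of such a matrix sums to zero, and the matrix is symmetric as
long as the neighborhood relation itself is symmetric.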
bkernel can be used to fit linear (kernel = "lin"), size-adjusted
(kernel = "sia") or network (kernel = "net") kernels based on genetic
pathways for genome-wide association studies. For details see
Friedrichs et al. (2017) and check the associated package kangar00.
buser(X, K) specifies a base-learner with user-specified design matrix X
and penalty matrix K, where X and K are used to minimize a (penalized)
least squares criterion with quadratic penalty. This can be used to
easily specify base-learners that are not implemented (yet). See
examples below for details how buser can be used to mimic existing
base-learners. Note that for predictions you need to set up the design
matrix for the new data manually.
For a categorical covariate with non-observed categories bols(x) and
brandom(x) both assign a zero effect to these categories. However, the
non-observed categories must be listed in levels(x). Thus, predictions
are possible for new observations if they correspond to this category.
By default, all linear base-learners include an intercept term (which
can be removed using intercept = FALSE for bols). In this case, the
respective covariate should be mean centered (if continuous) and an
explicit global intercept term should be added to gamboost via bols (see
example below). With bols(x, intercept = FALSE) with categorical
covariate x a separate effect for each group (mean effect) is estimated
(see examples for resulting design matrices).
Smooth estimates with constraints can be computed using the base-learner
bmono() which specifies P-spline base-learners with an additional
asymmetric penalty enforcing monotonicity or convexity/concavity (see
Eilers, 2005). For more details in the boosting context and monotonic
effects of ordinal factors see Hofner, Mueller and Hothorn (2011b). The
quadratic-programming based algorithm is described in Hofner et al.
(2016). Alternative monotonicity constraints are implemented via
T-splines in bbs() (Beliakov, 2000). In general it is advisable to use
bmono to fit monotonic splines as T-splines show undesirable behaviour
if the observed data deviates from monotonicity.
Two or more linear base-learners can be joined using %+%. A tensor
product of two or more linear base-learners is returned by %X%. When the
design matrix can be written as the Kronecker product of two matrices
X = kronecker(X2, X1), then bl1 %O% bl2 with design matrices X1 and X2,
respectively, can be used to efficiently compute ridge estimates
following Currie, Durban, Eilers (2006). In all cases the overall
degrees of freedom of the combined base-learner increase (additive or
multiplicative, respectively). These three features are experimental and
for expert use only.
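The computational idea behind a Kronecker-structured design can be
sketched in base R: fitted values can be obtained from the two small
matrices without ever forming the large product. The matrices X1, X2 and
the coefficient matrix B are hypothetical and simulated; this is an
illustration of the underlying identity, not the code used by %O%.

```r
## (X2 %x% X1) vec(B) equals vec(X1 B X2'), so the large design matrix
## kronecker(X2, X1) is not needed to compute fitted values.
set.seed(42)
X1 <- matrix(rnorm(5 * 3), nrow = 5, ncol = 3)   # first marginal design
X2 <- matrix(rnorm(4 * 2), nrow = 4, ncol = 2)   # second marginal design
B  <- matrix(rnorm(3 * 2), nrow = 3, ncol = 2)   # coefficients as a matrix

X <- kronecker(X2, X1)                           # 20 x 6 full design matrix

fit_full  <- drop(X %*% as.vector(B))            # uses the large matrix
fit_array <- as.vector(X1 %*% B %*% t(X2))       # uses the small matrices only

all.equal(fit_full, fit_array)
```

For responses observed on a grid, such as the matrix-valued example
later on this page, this kind of identity is what saves memory and
computing time.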
btree fits a stump to one or more variables. Note that blackboost is
more efficient for boosting stumps. For further references see Hothorn,
Hornik, Zeileis (2006) and Hothorn et al. (2010).
Note that the base-learners bns and bss are deprecated (and no longer
available). Please use bbs instead, which results in qualitatively the
same models but is computationally much more attractive.
An object of class blg (base-learner generator) with a dpp function.
The call of dpp returns an object of class bl (base-learner) with a fit
function. The call to fit finally returns an object of class bm
(base-model).
Three global options affect the base-learners:
options("mboost_useMatrix")
defaulting to TRUE, indicates that the base-learner may use sparse
matrix techniques for its computations. This reduces the memory
consumption but might (for smaller sample sizes) require more computing
time.
options("mboost_indexmin")
is an integer that specifies the minimum sample size needed to optimize
model fitting by automatically taking ties into account
(default = 10000).
options("mboost_dftraceS")
FALSE by default, indicating how the degrees of freedom should be
computed. Per default df(λ) = trace(2S - S'S), with smoother matrix
S = X(X'X + λK)^(-1) X', is used (see Hofner et al., 2011a). If TRUE,
the trace of the smoother matrix, df(λ) = trace(S), is used as degrees
of freedom.
Note that these formulae specify the relation of df and lambda as the
smoother matrix S depends only on λ (and the (fixed) design matrix X,
the (fixed) penalty matrix K).
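The two definitions of the degrees of freedom can be compared in a short
base R sketch. The design matrix, the identity penalty and the values of
lambda are hypothetical; the helper df_both() is introduced here only
for illustration and is not part of mboost.

```r
## Compare df(lambda) = trace(S) with df(lambda) = trace(2S - S'S) for a
## ridge-type smoother S = X (X'X + lambda K)^(-1) X'.
set.seed(7)
n <- 50
p <- 6
X <- cbind(1, matrix(rnorm(n * (p - 1)), nrow = n))  # simulated design
K <- diag(p)                                         # identity penalty (assumption)

df_both <- function(lambda) {
  S <- X %*% solve(crossprod(X) + lambda * K, t(X))
  c(trace_S     = sum(diag(S)),
    trace_2S_SS = sum(diag(2 * S - crossprod(S))))
}

d0   <- df_both(0)     # no penalty: both definitions give p = 6
d5   <- df_both(5)
d500 <- df_both(500)
rbind(d0, d5, d500)
```

Without penalization both definitions coincide with the number of
columns of X. For lambda > 0 the value of trace(2S - S'S) is at least as
large as trace(S), because the eigenvalues s of S lie between 0 and 1
and 2s - s^2 >= s there.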
Iain D. Currie, Maria Durban, and Paul H. C. Eilers (2006), Generalized linear array models with applications to multidimensional smoothing. Journal of the Royal Statistical Society, Series B–Statistical Methodology, 68(2), 259–280.
Paul H. C. Eilers (2005), Unimodal smoothing. Journal of Chemometrics, 19, 317–328.
Paul H. C. Eilers and Brian D. Marx (1996), Flexible smoothing with B-splines and penalties. Statistical Science, 11(2), 89–121.
Ludwig Fahrmeir, Thomas Kneib and Stefan Lang (2004), Penalized structured additive regression for space-time data: a Bayesian perspective. Statistica Sinica, 14, 731–761.
Jan Gertheiss and Gerhard Tutz (2009), Penalized regression with ordinal predictors. International Statistical Review, 77(3), 345–365.
D. Goldfarb and A. Idnani (1982), Dual and Primal-Dual Methods for Solving Strictly Convex Quadratic Programs. In J. P. Hennart (ed.), Numerical Analysis, Springer-Verlag, Berlin, pp. 226–239.
D. Goldfarb and A. Idnani (1983), A numerically stable dual method for solving strictly convex quadratic programs. Mathematical Programming, 27, 1–33.
S. Friedrichs, J. Manitz, P. Burger, C.I. Amos, A. Risch, J.C. Chang-Claude, H.E. Wichmann, T. Kneib, H. Bickeboeller, and B. Hofner (2017), Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies. Computational and Mathematical Methods in Medicine, 2017(6742763), 1–17. doi: 10.1155/2017/6742763.
Benjamin Hofner, Torsten Hothorn, Thomas Kneib, and Matthias Schmid (2011a), A framework for unbiased model selection based on boosting. Journal of Computational and Graphical Statistics, 20, 956–971.
Benjamin Hofner, Joerg Mueller, and Torsten Hothorn (2011b), Monotonicity-Constrained Species Distribution Models. Ecology, 92, 1895–1901.
Benjamin Hofner, Thomas Kneib and Torsten Hothorn (2016), A Unified Framework of Constrained Regression. Statistics & Computing, 26, 1–14.
Thomas Kneib, Torsten Hothorn and Gerhard Tutz (2009), Variable selection and model choice in geoadditive regression models. Biometrics, 65(2), 626–634.
Torsten Hothorn, Kurt Hornik, Achim Zeileis (2006), Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15, 651–674.
Torsten Hothorn, Peter Buehlmann, Thomas Kneib, Matthias Schmid and Benjamin Hofner (2010), Model-based Boosting 2.0. Journal of Machine Learning Research, 11, 2109–2113.
G. M. Beliakov (2000), Shape Preserving Approximation using Least Squares Splines. Approximation Theory and its Applications, 16(4), 80–98.
set.seed(290875)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n) + 0.25 * x1
x3 <- as.factor(sample(0:1, 100, replace = TRUE))
x4 <- gl(4, 25)
y <- 3 * sin(x1) + x2^2 + rnorm(n)
weights <- drop(rmultinom(1, n, rep.int(1, n) / n))
### set up base-learners
spline1 <- bbs(x1, knots = 20, df = 4)
extract(spline1, "design")[1:10, 1:10]
extract(spline1, "penalty")
knots.x2 <- quantile(x2, c(0.25, 0.5, 0.75))
spline2 <- bbs(x2, knots = knots.x2, df = 5)
ols3 <- bols(x3)
extract(ols3)
ols4 <- bols(x4)
### compute base-models
drop(ols3$dpp(weights)$fit(y)$model) ## same as:
coef(lm(y ~ x3, weights = weights))
drop(ols4$dpp(weights)$fit(y)$model) ## same as:
coef(lm(y ~ x4, weights = weights))
### fit model, component-wise
mod1 <- mboost_fit(list(spline1, spline2, ols3, ols4), y, weights)
### more convenient formula interface
mod2 <- mboost(y ~ bbs(x1, knots = 20, df = 4) +
               bbs(x2, knots = knots.x2, df = 5) +
               bols(x3) + bols(x4), weights = weights)
all.equal(coef(mod1), coef(mod2))
### grouped linear effects
# center x1 and x2 first
x1 <- scale(x1, center = TRUE, scale = FALSE)
x2 <- scale(x2, center = TRUE, scale = FALSE)
model <- gamboost(y ~ bols(x1, x2, intercept = FALSE) +
                  bols(x1, intercept = FALSE) +
                  bols(x2, intercept = FALSE),
                  control = boost_control(mstop = 50))
coef(model, which = 1) # one base-learner for x1 and x2
coef(model, which = 2:3) # two separate base-learners for x1 and x2
# zero because they were (not yet) selected.
### example for bspatial
x1 <- runif(250, -pi, pi)
x2 <- runif(250, -pi, pi)
y <- sin(x1) * sin(x2) + rnorm(250, sd = 0.4)
spline3 <- bspatial(x1, x2, knots = 12)
Xmat <- extract(spline3, "design")
## 12 inner knots + 4 boundary knots = 16 knots per direction
## THUS: 16 * 16 = 256 columns
dim(Xmat)
extract(spline3, "penalty")[1:10, 1:10]
## specify number of knots separately
form1 <- y ~ bspatial(x1, x2, knots = list(x1 = 12, x2 = 14))
## decompose spatial effect into parametric part and
## deviation with one df
form2 <- y ~ bols(x1) + bols(x2) + bols(x1, by = x2, intercept = FALSE) +
    bspatial(x1, x2, knots = 12, center = TRUE, df = 1)
mod1 <- gamboost(form1)
plot(mod1)
mod2 <- gamboost(form2)
## automated plot function:
plot(mod2)
## plot sum of linear and smooth effects:
library(lattice)
df <- expand.grid(x1 = unique(x1), x2 = unique(x2))
df$pred <- predict(mod2, newdata = df)
levelplot(pred ~ x1 * x2, data = df)
## specify radial basis function base-learner for spatial effect
## and use data-adaptive effective range (theta = NULL, see 'args')
form3 <- y ~ brad(x1, x2)
## Now use different settings, e.g. 50 knots and theta fixed to 0.4
## (not really a good setting)
form4 <- y ~ brad(x1, x2, knots = 50, args = list(theta = 0.4))
mod3 <- gamboost(form3)
plot(mod3)
dim(extract(mod3, what = "design", which = "brad")[[1]])
knots <- attr(extract(mod3, what = "design", which = "brad")[[1]], "knots")
mod4 <- gamboost(form4)
dim(extract(mod4, what = "design", which = "brad")[[1]])
plot(mod4)
### random intercept
id <- factor(rep(1:10, each = 5))
raneff <- brandom(id)
extract(raneff, "design")
extract(raneff, "penalty")
## random intercept with non-observed category
set.seed(1907)
y <- rnorm(50, mean = rep(rnorm(10), each = 5), sd = 0.1)
plot(y ~ id)
# category 10 not observed
obs <- c(rep(1, 45), rep(0, 5))
model <- gamboost(y ~ brandom(id), weights = obs)
coef(model)
fitted(model)[46:50] # just the grand mean as usual for
                     # random effects models
### random slope
z <- runif(50)
raneff <- brandom(id, by = z)
extract(raneff, "design")
extract(raneff, "penalty")
### specify simple interaction model (with main effect)
n <- 210
x <- rnorm(n)
X <- model.matrix(~ x)
z <- gl(3, n/3)
Z <- model.matrix(~z)
beta <- list(c(0,1), c(3,4), c(2, 4))
y <- rnorm(length(x), mean = (X * Z[,1]) %*% beta[[1]] +
                             (X * Z[,2]) %*% beta[[2]] +
                             (X * Z[,3]) %*% beta[[3]])
plot(y ~ x, col = z)
## specify main effect and interaction
mod_glm <- gamboost(y ~ bols(x) + bols(x, by = z),
                    control = boost_control(mstop = 100))
nd <- data.frame(x, z)
nd <- nd[order(x),]
nd$pred_glm <- predict(mod_glm, newdata = nd)
for (i in seq(along = levels(z)))
    with(nd[nd$z == i,], lines(x, pred_glm, col = z))
mod_gam <- gamboost(y ~ bbs(x) + bbs(x, by = z, df = 8),
                    control = boost_control(mstop = 100))
nd$pred_gam <- predict(mod_gam, newdata = nd)
for (i in seq(along = levels(z)))
    with(nd[nd$z == i,], lines(x, pred_gam, col = z, lty = "dashed"))
### convenience function for plotting
par(mfrow = c(1,3))
plot(mod_gam)
### remove intercept from base-learner
### and add explicit intercept to the model
tmpdata <- data.frame(x = 1:100, y = rnorm(1:100), int = rep(1, 100))
mod <- gamboost(y ~ bols(int, intercept = FALSE) +
                bols(x, intercept = FALSE),
                data = tmpdata,
                control = boost_control(mstop = 1000))
cf <- unlist(coef(mod))
## add offset
cf[1] <- cf[1] + mod$offset
signif(cf, 3)
signif(coef(lm(y ~ x, data = tmpdata)), 3)
### much quicker and better with (mean) centering
tmpdata$x_center <- tmpdata$x - mean(tmpdata$x)
mod_center <- gamboost(y ~ bols(int, intercept = FALSE) +
                       bols(x_center, intercept = FALSE),
                       data = tmpdata,
                       control = boost_control(mstop = 100))
cf_center <- unlist(coef(mod_center, which = 1:2))
## due to the shift in x direction we need to subtract
## beta_1 * mean(x) to get the correct intercept
cf_center[1] <- cf_center[1] + mod_center$offset -
    cf_center[2] * mean(tmpdata$x)
signif(cf_center, 3)
signif(coef(lm(y ~ x, data = tmpdata)), 3)
## Not run: ############################################################
## Do not run and check these examples automatically as
## they take some time
### large data set with ties
nunique <- 100
xindex <- sample(1:nunique, 1000000, replace = TRUE)
x <- runif(nunique)
y <- rnorm(length(xindex))
w <- rep.int(1, length(xindex))
### brute force computations
op <- options()
options(mboost_indexmin = Inf, mboost_useMatrix = FALSE)
## data pre-processing
b1 <- bbs(x[xindex])$dpp(w)
## model fitting
c1 <- b1$fit(y)$model
options(op)
### automatic search for ties, faster
b2 <- bbs(x[xindex])$dpp(w)
c2 <- b2$fit(y)$model
### manual specification of ties, even faster
b3 <- bbs(x, index = xindex)$dpp(w)
c3 <- b3$fit(y)$model
all.equal(c1, c2)
all.equal(c1, c3)
## End(Not run and test)
## End(Not run)
### cyclic P-splines
set.seed(781)
x <- runif(200, 0, (2*pi))
y <- rnorm(200, mean = sin(x), sd = 0.2)
newX <- seq(0, 2*pi, length = 100)
### model without cyclic constraints
mod <- gamboost(y ~ bbs(x, knots = 20))
### model with cyclic constraints
mod_cyclic <- gamboost(y ~ bbs(x, cyclic = TRUE, knots = 20,
                               boundary.knots = c(0, 2*pi)))
par(mfrow = c(1,2))
plot(x, y, main = "bbs (non-cyclic)", cex = 0.5)
lines(newX, sin(newX), lty = "dotted")
lines(newX + 2 * pi, sin(newX), lty = "dashed")
lines(newX, predict(mod, data.frame(x = newX)),
      col = "red", lwd = 1.5)
lines(newX + 2 * pi, predict(mod, data.frame(x = newX)),
      col = "blue", lwd = 1.5)
plot(x, y, main = "bbs (cyclic)", cex = 0.5)
lines(newX, sin(newX), lty = "dotted")
lines(newX + 2 * pi, sin(newX), lty = "dashed")
lines(newX, predict(mod_cyclic, data.frame(x = newX)),
      col = "red", lwd = 1.5)
lines(newX + 2 * pi, predict(mod_cyclic, data.frame(x = newX)),
      col = "blue", lwd = 1.5)
### use buser() to mimic P-spline base-learner:
set.seed(1907)
x <- rnorm(100)
y <- rnorm(100, mean = x^2, sd = 0.1)
mod1 <- gamboost(y ~ bbs(x))
## now extract design and penalty matrix
X <- extract(bbs(x), "design")
K <- extract(bbs(x), "penalty")
## use X and K in buser()
mod2 <- gamboost(y ~ buser(X, K))
max(abs(predict(mod1) - predict(mod2))) # same results
### use buser() to mimic penalized ordinal base-learner:
z <- as.ordered(sample(1:3, 100, replace = TRUE))
y <- rnorm(100, mean = as.numeric(z), sd = 0.1)
X <- extract(bols(z))
K <- extract(bols(z), "penalty")
index <- extract(bols(z), "index")
mod1 <- gamboost(y ~ buser(X, K, df = 1, index = index))
mod2 <- gamboost(y ~ bols(z, df = 1))
max(abs(predict(mod1) - predict(mod2))) # same results
### kronecker product for matrix-valued responses
data("volcano", package = "datasets")
layout(matrix(1:2, ncol = 2))
## estimate mean of image treating image as matrix
image(volcano, main = "data")
x1 <- 1:nrow(volcano)
x2 <- 1:ncol(volcano)
vol <- as.vector(volcano)
mod <- mboost(vol ~ bbs(x1, df = 3, knots = 10) %O%
                    bbs(x2, df = 3, knots = 10),
              control = boost_control(nu = 0.25))
mod[250]
volf <- matrix(fitted(mod), nrow = nrow(volcano))
image(volf, main = "fitted")
## Not run: ############################################################
## Do not run and check these examples automatically as
## they take some time
## the old-fashioned way, a waste of space and time
x <- expand.grid(x1, x2)
modx <- mboost(vol ~ bbs(Var2, df = 3, knots = 10) %X%
                     bbs(Var1, df = 3, knots = 10), data = x,
               control = boost_control(nu = 0.25))
modx[250]
max(abs(fitted(mod) - fitted(modx)))
## End(Not run and test)
## End(Not run)
### setting contrasts via contrasts.arg
x <- as.factor(sample(1:4, 100, replace = TRUE))
## compute base-learners with different reference categories
BL1 <- bols(x, contrasts.arg = contr.treatment(4, base = 1)) # default
BL2 <- bols(x, contrasts.arg = contr.treatment(4, base = 2))
## compute 'sum to zero contrasts' using character string
BL3 <- bols(x, contrasts.arg = "contr.sum")
## extract model matrices to check if it works
extract(BL1)
extract(BL2)
extract(BL3)
### setting contrasts using named lists in contrasts.arg
x2 <- as.factor(sample(1:4, 100, replace = TRUE))
BL4 <- bols(x, x2,
            contrasts.arg = list(x = contr.treatment(4, base = 2),
                                 x2 = "contr.helmert"))
extract(BL4)
### using special contrast: "contr.dummy":
BL5 <- bols(x, contrasts.arg = "contr.dummy")
extract(BL5)
