stepmented: stepmented relationships in regression models
In segmented: Regression Models with Break-Points / Change-Points Estimation (with Possibly Random Effects)

stepmented

R Documentation

stepmented relationships in regression models

Description

Fits regression models with stepmented (i.e. piecewise-constant) relationships between the response and one or more explanatory variables. Break-point estimates are provided.

Usage

stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), 
   keep.class=FALSE, var.psi=FALSE, ...)

## S3 method for class 'lm'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
   keep.class=FALSE, var.psi=FALSE, ...)

## S3 method for class 'glm'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
    keep.class=FALSE, var.psi=FALSE, ...)

## S3 method for class 'numeric'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
    keep.class=FALSE, var.psi=FALSE, ..., 
    pertV=0, centerX=FALSE, adjX=NULL, weights=NULL)

## S3 method for class 'ts'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
    keep.class=FALSE, var.psi=FALSE, ..., 
    pertV=0, centerX=FALSE, adjX=NULL)

Arguments

`obj`	A standard ‘linear’ regression model of class "lm" or "glm". Alternatively, a simple "ts" object or a simple data vector may be supplied.
`seg.Z`	the stepmented variables(s), i.e. the numeric covariate(s) understood to have a piecewise-constant relationship with response. It is a formula with no response variable, such as `seg.Z=~x` or `seg.Z=~x1+x2`. Currently, formulas involving functions, such as `seg.Z=~log(x1)`, or selection operators, such as `seg.Z=~d[,"x1"]` or `seg.Z=~d$x1`, are not allowed. Also, variable names formed by `⁠U⁠` or `⁠V⁠` only (with or without numbers ) are not permitted. If missing, the index variable `id=1,2,..,n` is used. For `stepmented.ts`, `seg.Z` is usually unspecified as the (time) covariate is obtained by the `ts` object itself.
`psi`	starting values for the breakpoints to be estimated. If there is a single stepmented variable specified in `seg.Z`, `psi` can be a numeric vector, and it can be missing when 1 breakpoint has to be estimated (and the median of the stepmented variable is used as a starting value). If `seg.Z` includes several covariates, `psi` has to be specified as a named list of vectors whose names have to match the variables in the `seg.Z` argument. Each vector of such list includes starting values for the break-point(s) for the corresponding variable in `seg.Z`. A `NA` value means that '`K`' quantiles (or equally spaced values) are used as starting values; `K` is fixed via the `seg.control` auxiliary function.
`npsi`	A named vector or list meaning the number (and not locations) of breakpoints to be estimated. The starting values will be internally computed via the quantiles or equally spaced values, as specified in argument `quant` in `seg.control`. `npsi` can be missing and `npsi=1` is assumed for all variables specified in `seg.Z`. If `psi` is provided, `npsi` is ignored.
`fixed.psi`	An optional named list including the breakpoint values to be kept fixed during the estimation procedure. The names should be a subset of (or even the same) variables specified in `seg.Z`. If there is a single variable in `seg.Z`, a simple numeric vector can be specified. Note that, in addition to the values specified here, `stepmented` will estimate additional breakpoints. To keep fixed all breakpoints (to be specified in `psi`) use `it.max=0` in `seg.control`
`control`	a list of parameters for controlling the fitting process. See the documentation for `seg.control` for details.
`keep.class`	logical value indicating if the final fit returned by `stepmented.default` should keep the class '`stepmented`' (along with the class of the original fit `obj`). Ignored by the stepmented methods.
`...`	optional arguments (to be ignored safely). Notice specific arguments relevant to the original call (via `lm` or `glm` for instance), such as `weights` or `offet`, have to be included in the starting model `obj`.
`pertV`	Only for `stepmented.ts` and `stepmented.numeric`.
`centerX`	Only for `stepmented.ts` and `stepmented.numeric`. If `TRUE`, the covariate is centered before fitting.
`adjX`	Only for `stepmented.ts` and `stepmented.numeric`. If the response vector leads to covariate with large values (such as years for ts objects), `adjX=TRUE` will shift the covariate to have a zero origin. Default is `NULL` which means `TRUE` if the minimum of covariate is 1000 or larger.
`var.psi`	logical. If `TRUE`, the estimate covariance matrix is also computed via `vcov.stepmented`, thus the breakpoint standard errors are also included in the `psi` component of the returned object. Default is `FALSE`, as computing the estimate covariance matrix is somewhat time-consuming when the sample size is large.
`weights`	possible weights to include in the estimation process (only for `stepmented.numeric`).

Details

Given a linear regression model (usually of class "lm" or "glm"), stepmented tries to estimate a new regression model having piecewise-constant (i.e. step-function like) relationships with the variables specified in seg.Z. A stepmented relationship is defined by the mean level parameters and the break-points where the mean level changes. The number of breakpoints of each stepmented relationship depends on the psi argument, where initial values for the break-points must be specified. The model is estimated simultaneously yielding point estimates and relevant approximate standard errors of all the model parameters, including the break-points.

stepmented implements the algorithm described in Fasola et al. (2018) along with bootstrap restarting (Wood, 2001) to escape local optima. The procedure turns out to be particularly appealing and efficient when there are two or more covariates exhibiting different change points to be estimated.

Value

The returned object is of class "stepmented" which inherits from the class "lm" or "glm" depending on the class of obj. When only.mean=FALSE, it is a list having two 'stepmented' fits (for the mean and for the dispersion submodels).

An object of class "stepmented" is a list containing the components of the original object obj with additionally the followings:

`psi`	estimated break-points and relevant (approximate) standard errors (on the continuum)
`psi.rounded`	the rounded estimated break-points (see Note, below)
`it`	number of iterations employed
`epsilon`	difference in the objective function when the algorithm stops
`model`	the model frame
`psi.history`	a list or a vector including the breakpoint estimates at each step
`seed`	the integer vector containing the seed just before the bootstrap resampling. Returned only if bootstrap restart is employed
`..`	Other components are not of direct interest of the user

Note

Assuming a single changepoint \psi for the covariate x, the underlying fitted stepmented relationship is \beta_0+\beta_1 \ I(x>\psi), namely the fitted value (on the linear predictor scale) is \beta_0 if x\le \psi, and \beta_0+\beta_1 when x > \psi. While the point estimate \hat\psi returned (in the psi component of the fit object) is a unique real number, actually there exist infinite solutions in the range [a, b) where the extremes are the two closest observed covariate values such hat a \le \hat\psi<b. The component psi.rounded of the fit object includes the rounded changepoint values which can be taken as the final estimates. More specifically, each column of psi.rounded represents a changepoint and the corresponding rows are the extremes of the ‘optimal’ interval [a, b). The first row, i.e. the lower bound of the interval (a in the above example), is taken as point estimate. print.stepmented, print.summary.stepmented, and confint.stepmented return the rounded (lower) value of the interval.

Also:

The algorithm will start if the it.max argument returned by seg.control is greater than zero. If it.max=0 stepmented will estimate a new linear model with break-point(s) fixed at the starting values reported in psi. Alternatively, it is also possible to set h=0 in seg.control(). In this case, bootstrap restarting is unncessary, then to have changepoints at mypsi type

stepmented(.., psi=mypsi, control=seg.control(h=0, n.boot=0, it.max=1))
In the returned fit object, ‘U.’ is put before the name of the stepmented variable to indicate the difference in the mean levels. slope can be used to compute the actual mean levels corresponding to the different intervals.
Currently methods specific to the class "stepmented" are
- print.stepmented
- summary.stepmented
- print.summary.stepmented
- plot.stepmented
- confint.stepmented
- vcov.stepmented
- lines.stepmented
Others are inherited from the class "lm" or "glm" depending on the class of obj.

Author(s)

Vito M. R. Muggeo, vito.muggeo@unipa.it (based on original code by Salvatore Fasola)

References

Fasola S, Muggeo VMR, Kuchenhoff H (2018) A heuristic, iterative algorithm for change-point detection in abrupt change models, Computational Statistics 33, 997–1015

Examples


n=20
x<-1:n/n
mu<- 2+ 1*(x>.6)
y<- mu + rnorm(n)*.8

#fitting via regression model
os <-stepmented(lm(y~1),~x)

y<-ts(y)
os1<- stepmented(y)  #the 'ts' method
os2<- stepmented(y, npsi=2)
#plot(y)
#plot(os1, add=TRUE)
#plot(os2, add=TRUE, col=3:5)


### Example with (poisson) GLM
y<- rpois(n,exp(mu))
o<-stepmented(glm(y~1,family=poisson))
plot(o, res=TRUE)

## Not run: 

## Example using the (well-known) Nile dataset
data(Nile)
plot(Nile)
os<- stepmented(Nile)
plot(os, add=TRUE)


### Example with (binary) GLM (example from the package stepR)
set.seed(1234)
y <- rbinom(200, 1, rep(c(0.1, 0.7, 0.3, 0.9), each=50))
o<-stepmented(glm(y~1,family=binomial), npsi=3)
plot(o, res=TRUE)

### Two stepmented covariates (with 1 and 2 breakpoints); z has also an additional linear effect
n=100
x<-1:n/n
z<-runif(n,2,5)
mu<- 2+ 1*(x>.6)-2*(z>3)+3*(z>4)+z
y<- mu + rnorm(n)*.8

os <-stepmented(lm(y~z),~x+z, npsi=c(x=1,z=2))
os
summary(os)

## see ?plot.stepmented

## End(Not run)

segmented documentation built on April 12, 2025, 1:59 a.m.