stepmented: stepmented relationships in regression models

stepmentedR Documentation

stepmented relationships in regression models

Description

Fits regression models with stepmented (i.e. piecewise-constant) relationships between the response and one or more explanatory variables. Break-point estimates are provided.

Usage

stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(), 
   keep.class=FALSE, var.psi=FALSE, ...)

## S3 method for class 'lm'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
   keep.class=FALSE, var.psi=FALSE, ...)

## S3 method for class 'glm'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
    keep.class=FALSE, var.psi=FALSE, ...)

## S3 method for class 'numeric'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
    keep.class=FALSE, var.psi=FALSE, ..., 
    pertV=0, centerX=FALSE, adjX=NULL, weights=NULL)

## S3 method for class 'ts'
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
    keep.class=FALSE, var.psi=FALSE, ..., 
    pertV=0, centerX=FALSE, adjX=NULL)




Arguments

obj

A standard ‘linear’ regression model of class "lm" or "glm". Alternatively, a simple "ts" object or a simple data vector may be supplied.

seg.Z

the stepmented variables(s), i.e. the numeric covariate(s) understood to have a piecewise-constant relationship with response. It is a formula with no response variable, such as seg.Z=~x or seg.Z=~x1+x2. Currently, formulas involving functions, such as seg.Z=~log(x1), or selection operators, such as seg.Z=~d[,"x1"] or seg.Z=~d$x1, are not allowed. Also, variable names formed by ⁠U⁠ or ⁠V⁠ only (with or without numbers ) are not permitted. If missing, the index variable id=1,2,..,n is used. For stepmented.ts, seg.Z is usually unspecified as the (time) covariate is obtained by the ts object itself.

psi

starting values for the breakpoints to be estimated. If there is a single stepmented variable specified in seg.Z, psi can be a numeric vector, and it can be missing when 1 breakpoint has to be estimated (and the median of the stepmented variable is used as a starting value). If seg.Z includes several covariates, psi has to be specified as a named list of vectors whose names have to match the variables in the seg.Z argument. Each vector of such list includes starting values for the break-point(s) for the corresponding variable in seg.Z. A NA value means that 'K' quantiles (or equally spaced values) are used as starting values; K is fixed via the seg.control auxiliary function.

npsi

A named vector or list meaning the number (and not locations) of breakpoints to be estimated. The starting values will be internally computed via the quantiles or equally spaced values, as specified in argument quant in seg.control. npsi can be missing and npsi=1 is assumed for all variables specified in seg.Z. If psi is provided, npsi is ignored.

fixed.psi

An optional named list including the breakpoint values to be kept fixed during the estimation procedure. The names should be a subset of (or even the same) variables specified in seg.Z. If there is a single variable in seg.Z, a simple numeric vector can be specified. Note that, in addition to the values specified here, stepmented will estimate additional breakpoints. To keep fixed all breakpoints (to be specified in psi) use it.max=0 in seg.control

control

a list of parameters for controlling the fitting process. See the documentation for seg.control for details.

keep.class

logical value indicating if the final fit returned by stepmented.default should keep the class 'stepmented' (along with the class of the original fit obj). Ignored by the stepmented methods.

...

optional arguments (to be ignored safely). Notice specific arguments relevant to the original call (via lm or glm for instance), such as weights or offet, have to be included in the starting model obj.

pertV

Only for stepmented.ts and stepmented.numeric.

centerX

Only for stepmented.ts and stepmented.numeric. If TRUE, the covariate is centered before fitting.

adjX

Only for stepmented.ts and stepmented.numeric. If the response vector leads to covariate with large values (such as years for ts objects), adjX=TRUE will shift the covariate to have a zero origin. Default is NULL which means TRUE if the minimum of covariate is 1000 or larger.

var.psi

logical. If TRUE, the estimate covariance matrix is also computed via vcov.stepmented, thus the breakpoint standard errors are also included in the psi component of the returned object. Default is FALSE, as computing the estimate covariance matrix is somewhat time-consuming when the sample size is large.

weights

possible weights to include in the estimation process (only for stepmented.numeric).

Details

Given a linear regression model (usually of class "lm" or "glm"), stepmented tries to estimate a new regression model having piecewise-constant (i.e. step-function like) relationships with the variables specified in seg.Z. A stepmented relationship is defined by the mean level parameters and the break-points where the mean level changes. The number of breakpoints of each stepmented relationship depends on the psi argument, where initial values for the break-points must be specified. The model is estimated simultaneously yielding point estimates and relevant approximate standard errors of all the model parameters, including the break-points.

stepmented implements the algorithm described in Fasola et al. (2018) along with bootstrap restarting (Wood, 2001) to escape local optima. The procedure turns out to be particularly appealing and probably efficient when there are two or more covariates exhibiting different change points to be estimated.

Value

The returned object is of class "stepmented" which inherits from the class "lm" or "glm" depending on the class of obj. When only.mean=FALSE, it is a list having two 'stepmented' fits (for the mean and for the dispersion submodels).

An object of class "stepmented" is a list containing the components of the original object obj with additionally the followings:

psi

estimated break-points and relevant (approximate) standard errors (on the continuum)

psi.rounded

the rounded estimated break-points (see Note, below)

it

number of iterations employed

epsilon

difference in the objective function when the algorithm stops

model

the model frame

psi.history

a list or a vector including the breakpoint estimates at each step

seed

the integer vector containing the seed just before the bootstrap resampling. Returned only if bootstrap restart is employed

..

Other components are not of direct interest of the user

Note

The component psi.rounded of the fit object includes the rounded changepoint values which are usually taken as the final estimates. More specifically, each column of psi.rounded represents a changepoint and the corresponding rows are the range of the ‘optimal’ interval. The first row, i.e. the lower bound of the interval, is taken as point estimate. print.stepmented, print.summary.stepmented, and confint.stepmented return the rounded (lower) value of the interval.

Also:

  1. The algorithm will start if the it.max argument returned by seg.control is greater than zero. If it.max=0 stepmented will estimate a new linear model with break-point(s) fixed at the starting values reported in psi. Alternatively, it is also possible to set h=0 in seg.control(). In this case, bootstrap restarting is unncessary, then to have changepoints at mypsi type

    stepmented(.., psi=mypsi, control=seg.control(h=0, n.boot=0, it.max=1))

  2. In the returned fit object, ‘U.’ is put before the name of the stepmented variable to indicate the difference in the mean levels. slope can be used to compute the actual mean levels corresponding to the different intervals.

  3. Currently methods specific to the class "stepmented" are

    • print.stepmented

    • summary.stepmented

    • print.summary.stepmented

    • plot.stepmented

    • confint.stepmented

    • vcov.stepmented

    • lines.stepmented

    Others are inherited from the class "lm" or "glm" depending on the class of obj.

Author(s)

Vito M. R. Muggeo, vito.muggeo@unipa.it (based on original code by Salvatore Fasola)

References

Fasola S, Muggeo VMR, Kuchenhoff H (2018) A heuristic, iterative algorithm for change-point detection in abrupt change models, Computational Statistics 33, 997–1015

See Also

segmented for segmented regression, lm, glm

Examples


n=20
x<-1:n/n
mu<- 2+ 1*(x>.6)
y<- mu + rnorm(n)*.8

#fitting via regression model
os <-stepmented(lm(y~1),~x)

y<-ts(y)
os1<- stepmented(y)  #the 'ts' method
os2<- stepmented(y, npsi=2)
#plot(y)
#plot(os1, add=TRUE)
#plot(os2, add=TRUE, col=3:5)


### Example with (poisson) GLM
y<- rpois(n,exp(mu))
o<-stepmented(glm(y~1,family=poisson))
plot(o, res=TRUE)

## Not run: 

## Example using the (well-known) Nile dataset
data(Nile)
plot(Nile)
os<- stepmented(Nile)
plot(os, add=TRUE)


### Example with (binary) GLM (example from the package stepR)
set.seed(1234)
y <- rbinom(200, 1, rep(c(0.1, 0.7, 0.3, 0.9), each=50))
o<-stepmented(glm(y~1,family=binomial), npsi=3)
plot(o, res=TRUE)

### Two stepmented covariates (with 1 and 2 breakpoints); z has also an additional linear effect
n=100
x<-1:n/n
z<-runif(n,2,5)
mu<- 2+ 1*(x>.6)-2*(z>3)+3*(z>4)+z
y<- mu + rnorm(n)*.8

os <-stepmented(lm(y~z),~x+z, npsi=c(x=1,z=2))
os
summary(os)

## see ?plot.stepmented

## End(Not run)

segmented documentation built on Oct. 25, 2024, 5:07 p.m.