
View source: R/gfpop.R

gfpop

R Documentation

Graph-Constrained Functional Pruning Optimal Partitioning (gfpop)

Description

Functional pruning optimal partitioning with a graph structure that encodes constraints on consecutive segment parameters. The user must specify the graph to use (see the graph function) and the type of cost function. This is the main function of the gfpop package. Its result can be plotted with the package's S3 plot function, gfpop::plot().

Usage

gfpop(data, mygraph, type = "mean", weights = NULL, testMode = FALSE)

Arguments

`data`	vector of data to segment. For simulation studies, data can be generated with the gfpop package function `gfpop::dataGenerator()`
`mygraph`	dataframe of class "graph" to constrain the changepoint inference, see `gfpop::graph()`
`type`	a string defining the cost model to use: `"mean"`, `"variance"`, `"poisson"`, `"exp"`, `"negbin"`
`weights`	vector of weights (positive numbers), same size as data
`testMode`	boolean. `FALSE` by default. Used to debug the code
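
The `weights` argument is not exercised in the Examples section, so here is a minimal sketch of a weighted call. The data are simulated with base R, the weighting scheme (trusting the second half of the signal half as much) is a purely illustrative assumption, and the package call is guarded so the snippet still runs when gfpop is not installed. Accessing the result with `$changepoints` assumes the returned object behaves like a list with the fields named in the Value section.

```r
set.seed(1)

# Two segments of 100 points each: mean 0, then mean 2 (base R simulation).
myData <- c(rnorm(100, mean = 0), rnorm(100, mean = 2))

# Hypothetical weights: positive numbers, one per data point.
w <- rep(c(1, 0.5), each = 100)

# Only call the package if it is available.
if (requireNamespace("gfpop", quietly = TRUE)) {
  myGraph <- gfpop::graph(penalty = 2 * log(length(myData)), type = "std")
  res <- gfpop::gfpop(data = myData, mygraph = myGraph, type = "mean", weights = w)
  print(res$changepoints)  # assumes list-like access to the documented field
}
```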

Details

The constrained optimization problem for n data points takes the following general form:

Q_n = min (with constraints) \sum_{t=1}^n (\gamma(e[t])(y[t], \mu[t]) + \beta(e[t]))

with data points y[t], inferred parameters \mu[t], edges e[t], edge-dependent penalties \beta(e[t]) and cost functions \gamma. The cost function can take three different forms for parameter x and constants (A, B, C):

quadratic, with representation Ax^2 + Bx + C with x in R
log-linear, with representation Ax - B log(x) + C with x \ge 0
log-log, with representation - A log(x) - B log(1-x) + C with 0 \le x \le 1
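
As an illustration of the three representations above, the following base R sketch writes each form as a function and gives its closed-form minimizer, obtained by setting the derivative to zero. The function names and the constants A = 3, B = 2, C = 1 are arbitrary choices for this sketch, not part of the package, and the formulas assume A > 0 and B > 0 so that a unique interior minimum exists.

```r
# The three cost representations as plain R functions.
quadratic  <- function(x, A, B, C) A * x^2 + B * x + C
log_linear <- function(x, A, B, C) A * x - B * log(x) + C
log_log    <- function(x, A, B, C) -A * log(x) - B * log(1 - x) + C

# Closed-form minimizers (derivative set to zero).
argmin_quadratic  <- function(A, B) -B / (2 * A)  # 2Ax + B = 0
argmin_log_linear <- function(A, B) B / A         # A - B/x = 0
argmin_log_log    <- function(A, B) A / (A + B)   # -A/x + B/(1 - x) = 0

# Numerical cross-check with optimize() on illustrative constants.
A <- 3; B <- 2; C <- 1
num_q  <- optimize(quadratic,  interval = c(-10, 10),
                   A = A, B = B, C = C)$minimum
num_ll <- optimize(log_linear, interval = c(1e-6, 10),
                   A = A, B = B, C = C)$minimum
num_lg <- optimize(log_log,    interval = c(1e-6, 1 - 1e-6),
                   A = A, B = B, C = C)$minimum

print(c(closed = argmin_quadratic(A, B),  numeric = num_q))
print(c(closed = argmin_log_linear(A, B), numeric = num_ll))
print(c(closed = argmin_log_log(A, B),    numeric = num_lg))
```

The constant C shifts the cost but never moves the minimizer, which is why it does not appear in the closed forms.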

For each optimization problem, we consider a unique cost representation. However, the user can define robustness values (K and a) specific to each edge, making the cost function edge-dependent. We give the atomic form of each of the five available types (for one data point of value y with weight w):

"mean" : A = w, B = -2wy, C = wy^2
"variance" : A = wy^2, B = w, C = 0
"poisson" : A = w, B = wy, C = 0
"exp" : A = wy, B = w, C = 0
"negbin" : A = w, B = wy, C = 0
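
To make the objective Q_n concrete, here is a minimal base R sketch for the simplest setting only: the "mean" cost, a constant penalty charged per change, and no constraint linking consecutive segment parameters. It sums the atomic "mean" coefficients (A = w, B = -2wy, C = wy^2) over a segment and minimizes the resulting quadratic in closed form, then solves the partitioning with a naive O(n^2) dynamic program. This is not the functional pruning algorithm used by the package, and `naive_op_mean` is a hypothetical name introduced for this illustration.

```r
# Naive optimal partitioning for the "mean" cost with a constant penalty.
# Illustrative only: no graph constraints, no functional pruning.
naive_op_mean <- function(y, penalty, w = rep(1, length(y))) {
  n <- length(y)
  # Cumulative sums of the atomic coefficients A = w, B = -2wy, C = wy^2.
  cA <- c(0, cumsum(w))
  cB <- c(0, cumsum(-2 * w * y))
  cC <- c(0, cumsum(w * y^2))
  # Minimum over x of A x^2 + B x + C on points s..t is C - B^2 / (4A).
  seg_cost <- function(s, t) {
    A <- cA[t + 1] - cA[s]
    B <- cB[t + 1] - cB[s]
    C <- cC[t + 1] - cC[s]
    C - B^2 / (4 * A)
  }
  # Q[k + 1] stores Q_k; Q_0 = -penalty so that only changes are penalized.
  Q <- c(-penalty, rep(Inf, n))
  last <- integer(n)  # start index of the last segment ending at t
  for (t in 1:n) {
    cand <- vapply(1:t, function(s) Q[s] + seg_cost(s, t) + penalty, numeric(1))
    last[t] <- which.min(cand)
    Q[t + 1] <- cand[last[t]]
  }
  # Backtrack to recover the last index of each segment.
  cp <- integer(0)
  t <- n
  while (t > 0) {
    cp <- c(t, cp)
    t <- last[t] - 1
  }
  list(changepoints = cp, cost = Q[n + 1])
}

# Usage on a noiseless two-level signal.
res_naive <- naive_op_mean(c(rep(0, 5), rep(10, 5)), penalty = 1)
print(res_naive)
```

Note that the `cost` returned by this sketch includes the penalties, so it is not directly comparable to a penalty-free total loss.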

Value

a gfpop object = (changepoints, states, forced, parameters, globalCost)

changepoints: the vector of changepoints (we give the last element of each segment)
states: the vector giving the state of each segment
forced: the vector specifying whether the constraints of the graph are active (= TRUE) or not (= FALSE)
parameters: the vector of successive parameters of each segment
globalCost: a number equal to the total loss, i.e. the minimal cost of the optimization problem with all penalty values excluded
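
Because changepoints holds the last index of each segment, segment lengths and a per-point fitted signal can be recovered with a few base R calls. The sketch below uses a mock list with made-up values in place of a real result, and it assumes only the two documented fields `changepoints` and `parameters`.

```r
# Mock result with made-up values; a real gfpop result is assumed to expose
# the same two fields.
res <- list(
  changepoints = c(3, 7, 10),   # last index of each segment
  parameters   = c(1.0, 2.5, 0.8)
)

# Segment lengths from consecutive segment ends.
seg_lengths <- diff(c(0, res$changepoints))

# Piecewise-constant signal: one parameter value per data point.
fitted <- rep(res$parameters, times = seg_lengths)

# One row per segment with its first and last index.
segments <- data.frame(
  start     = c(1, head(res$changepoints, -1) + 1),
  end       = res$changepoints,
  parameter = res$parameters
)
print(segments)
```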

Examples

library(gfpop) # provides gfpop(), graph(), dataGenerator(), sdDiff(), paperGraph()
n <- 1000 # data length
### EXAMPLE 1 ### updown graph + poisson loss
 myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 2, 1, 3, 1), type = "poisson")
 myGraph <- graph(penalty = 2 * sdDiff(myData)^2 * log(n), type = "updown")
 gfpop(data = myData, mygraph = myGraph, type = "poisson")

### EXAMPLE 2 ### relevant graph with min gap = 2 + poisson loss
 myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 2, 3, 5, 3), type = "poisson")
 myGraph <- graph(type = "relevant", penalty = 2 * log(n), gap = 2)
 gfpop(data = myData, mygraph = myGraph, type = "poisson")

### EXAMPLE 3 ### std graph with robust loss + variance loss
 myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 5, 1, 5, 1), type = "variance")
 outliers <- 5 * rbinom(n, 1, 0.05) - 5 * rbinom(n, 1, 0.05)
### with robust parameter K
 myGraph <- graph(type = "std", penalty = 2 * log(n), K = 10)
 gfpop(data = myData + outliers, mygraph = myGraph, type = "variance")
### no K
 myGraph <- graph(type = "std", penalty = 2 * log(n))
 gfpop(data = myData, mygraph = myGraph, type = "variance")

### EXAMPLE 4 ###  3-segment graph with mean (Gaussian) loss
 myData <- dataGenerator(n, c(0.12, 0.31, 0.53, 0.88, 1), c(1, 2, 0, 1, 2), type = "mean")
 outliers <- 5 * rbinom(n, 1, 0.05) - 5 * rbinom(n, 1, 0.05)
 gfpop(data = myData + outliers, mygraph = paperGraph(8, penalty = 2 * log(n)), type = "mean")

gfpop documentation built on April 1, 2023, 12:22 a.m.