# gfpop: Graph-Constrained Functional Pruning Optimal Partitioning

 gfpop R Documentation

## Graph-Constrained Functional Pruning Optimal Partitioning (gfpop)

### Description

Functional pruning optimal partitioning with a graph structure that encodes constraints on consecutive segment parameters. The user must specify the graph to use (see the graph function) and a type of cost function. This is the main function of the gfpop package. Its result can be plotted using the S3 plot method for gfpop objects, gfpop::plot().

### Usage

```r
gfpop(data, mygraph, type = "mean", weights = NULL, testMode = FALSE)
```


### Arguments

data

vector of data to segment. For simulation studies, data can be generated using the gfpop package function gfpop::dataGenerator()

mygraph

data frame of class "graph" to constrain the changepoint inference, see gfpop::graph()

type

a string defining the cost model to use: "mean", "variance", "poisson", "exp", "negbin"

weights

vector of weights (positive numbers), same size as data

testMode

boolean. FALSE by default. Used to debug the code

### Details

The constrained optimization problem for n data points takes the following general form:

Q_n = \min_{constraints} \sum_{t=1}^n ( \gamma(e[t])(y[t], \mu[t]) + \beta(e[t]) )

with data points y[t], edges e[t], edge-dependent penalties \beta(e[t]) and cost functions \gamma. The cost function can take three different forms for parameter x and constants (A, B, C):

• quadratic, with representation Ax^2 + Bx +C with x in R

• log-linear, with representation Ax - B log(x) +C with x \ge 0

• log-log, with representation - A log(x) - B log(1-x) +C with 0 \le x \le 1
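The three representations above can be written down directly in base R. The following is a minimal sketch, not code from the gfpop package: the function names `eval_cost` and `argmin_cost` and the labels "quadratic", "loglinear" and "loglog" are chosen here for illustration. The closed-form minimizers follow from setting the derivative of each representation to zero, assuming A > 0 (and B > 0 for the two logarithmic forms).

```r
# Evaluate one of the three cost representations at x.
# The form labels are illustrative, not identifiers from the gfpop package.
eval_cost <- function(form, A, B, C, x) {
  switch(form,
    quadratic = A * x^2 + B * x + C,
    loglinear = A * x - B * log(x) + C,
    loglog    = -A * log(x) - B * log(1 - x) + C,
    stop("unknown form: ", form)
  )
}

# Closed-form minimizer of each representation
# (assumes A > 0, and B > 0 for the two logarithmic forms).
argmin_cost <- function(form, A, B) {
  switch(form,
    quadratic = -B / (2 * A),   # 2Ax + B = 0
    loglinear = B / A,          # A - B/x = 0
    loglog    = A / (A + B),    # -A/x + B/(1 - x) = 0
    stop("unknown form: ", form)
  )
}

argmin_cost("quadratic", A = 1, B = -4)   # 2
argmin_cost("loglinear", A = 2, B = 6)    # 3
argmin_cost("loglog",    A = 1, B = 3)    # 0.25
```

The constant C shifts the cost but never moves the minimizer, which is why `argmin_cost` does not take it as an argument.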

For each optimization problem, we consider a unique cost representation. However, the user can define robustness values (K and a) specific to each edge, making the cost function edge-dependent. We give the atomic form of each of the five available types (for one data point of value y with weight w):

• "mean" : A = w, B = -2wy, C = wy^2

• "variance" : A = wy^2, B = w, C = 0

• "poisson" : A = w, B = wy, C = 0

• "exp" : A = wy, B = w, C = 0

• "negbin" : A = w, B = wy, C = 0
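As an illustration of how these atomic forms combine, the sketch below transcribes the coefficients listed above into base R and sums them over the points of one segment. It is not the package's internal code, and `atomic_coef` and `segment_coef` are names introduced here. For the "mean" type the summed quadratic is minimized at the weighted sample mean, and for the "poisson" type the log-linear minimizer B / A gives the same value.

```r
# Atomic coefficients (A, B, C) for one data point y with weight w,
# transcribed from the list above. Function names are illustrative only.
atomic_coef <- function(type, y, w = 1) {
  switch(type,
    mean     = c(A = w,       B = -2 * w * y, C = w * y^2),
    variance = c(A = w * y^2, B = w,          C = 0),
    poisson  = c(A = w,       B = w * y,      C = 0),
    exp      = c(A = w * y,   B = w,          C = 0),
    negbin   = c(A = w,       B = w * y,      C = 0),
    stop("unknown type: ", type)
  )
}

# Coefficients add up over the data points of a segment.
segment_coef <- function(type, y, w = rep(1, length(y))) {
  rowSums(mapply(function(yi, wi) atomic_coef(type, yi, wi), y, w))
}

y <- c(1, 2, 6)

cf <- segment_coef("mean", y)
-cf[["B"]] / (2 * cf[["A"]])   # 3, the sample mean of y

cf <- segment_coef("poisson", y)
cf[["B"]] / cf[["A"]]          # 3, the sample mean of y
```

Because the coefficients are additive, the cost of a whole segment stays in the same three-parameter family as the cost of a single point.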

### Value

a gfpop object = (changepoints, states, forced, parameters, globalCost)

changepoints

is the vector of changepoints (we give the last element of each segment)

states

is the vector giving the state of each segment

forced

is the vector specifying whether the constraints of the graph are active (= TRUE) or not (= FALSE)

parameters

is the vector of successive parameters of each segment

globalCost

is a number equal to the total loss: the minimal cost for the optimization problem with all penalty values excluded
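To make the convention for changepoints concrete (each entry is the last index of its segment), the following base R sketch expands a result into a piecewise-constant fit. The list `res` is a hand-made stand-in with only two of the fields described above, not real output from gfpop(), and the data vector `y` is invented for the illustration. The loss shown assumes the "mean" cost with unit weights.

```r
# Hand-made stand-in for a gfpop result (NOT real package output):
# three segments ending at positions 3, 5 and 8.
res <- list(changepoints = c(3, 5, 8), parameters = c(1, 4, 2))
y   <- c(1.2, 0.8, 1.0, 4.1, 3.9, 2.0, 2.2, 1.8)

# Recover segment boundaries from the last index of each segment.
seg_end    <- res$changepoints
seg_start  <- c(1, head(seg_end, -1) + 1)
seg_length <- seg_end - seg_start + 1

# Piecewise-constant fit: repeat each segment parameter over its segment.
fitted <- rep(res$parameters, times = seg_length)

# Unpenalized "mean" loss of this fit: sum of (y - mu)^2 with unit weights.
loss <- sum((y - fitted)^2)
```

The same `seg_start` and `seg_end` vectors can be used to subset the data segment by segment, for example with `y[seg_start[2]:seg_end[2]]`.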

### See Also

• gfpop::dataGenerator() to generate data for multiple change-point simulations

• gfpop::graph() to create graphs complying with the gfpop function

• gfpop::plot() to plot the gfpop object and visualize inferred changepoints and parameters

### Examples

```r
library(gfpop)

n <- 1000  # data length

### EXAMPLE 1 ### updown graph + poisson loss
myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 2, 1, 3, 1), type = "poisson")
myGraph <- graph(penalty = 2 * sdDiff(myData)^2 * log(n), type = "updown")
gfpop(data = myData, mygraph = myGraph, type = "poisson")

### EXAMPLE 2 ### relevant graph with min gap = 2 + poisson loss
myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 2, 3, 5, 3), type = "poisson")
myGraph <- graph(type = "relevant", penalty = 2 * log(n), gap = 2)
gfpop(data = myData, mygraph = myGraph, type = "poisson")

### EXAMPLE 3 ### std graph with robust loss + variance loss
myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 5, 1, 5, 1), type = "variance")
# Additive outliers: +5 or -5, each with probability about 0.05
outliers <- 5 * rbinom(n, 1, 0.05) - 5 * rbinom(n, 1, 0.05)
### with robust parameter K
myGraph <- graph(type = "std", penalty = 2 * log(n), K = 10)
gfpop(data = myData + outliers, mygraph = myGraph, type = "variance")
### no K
myGraph <- graph(type = "std", penalty = 2 * log(n))
gfpop(data = myData, mygraph = myGraph, type = "variance")

### EXAMPLE 4 ### 3-segment graph with mean (Gaussian) loss
myData <- dataGenerator(n, c(0.12, 0.31, 0.53, 0.88, 1), c(1, 2, 0, 1, 2), type = "mean")
outliers <- 5 * rbinom(n, 1, 0.05) - 5 * rbinom(n, 1, 0.05)
gfpop(data = myData + outliers, mygraph = paperGraph(8, penalty = 2 * log(n)), type = "mean")
```


gfpop documentation built on April 1, 2023, 12:22 a.m.