gfpop | R Documentation |
Functional pruning optimal partitioning with a graph structure to take
into account constraints on consecutive segment parameters. The user has to specify
the graph to use (see the graph function) and a type of cost function.
This is the main function of the gfpop package. Its result can be plotted using
the S3 plot method for gfpop objects, gfpop::plot().
gfpop(data, mygraph, type = "mean", weights = NULL, testMode = FALSE)
data |
vector of data to segment. For simulation studies, data can be generated using gfpop package function |
mygraph |
dataframe of class "graph" to constrain the changepoint inference, see |
type |
a string defining the cost model to use: |
weights |
vector of weights (positive numbers), same size as data |
testMode |
boolean. |
The constrained optimization problem for n data points takes the following general form:
Q_n = min (with constraints) \sum_{t=1}^n [ \gamma(e[t])(y[t], \mu[t]) + \beta(e[t]) ]
with data points y[t], edges e[t], edge-dependent penalties \beta(e[t]) and cost functions \gamma.
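As a rough illustration of the objective above, the following base R sketch evaluates the penalized cost of one fixed segmentation under the quadratic cost. It assumes a single constant penalty beta per changepoint and no constraint between consecutive segment parameters. The helper name segmentation_cost and the toy data are hypothetical and not part of gfpop, which minimizes this kind of quantity over all segmentations allowed by the graph.

```r
# Hypothetical helper (not part of gfpop): penalized cost of a fixed segmentation
# under the quadratic cost, with one constant penalty beta per changepoint.
segmentation_cost <- function(y, changepoints, beta) {
  # changepoints holds the last index of each segment, so its last value is length(y)
  starts <- c(1, head(changepoints, -1) + 1)
  fit_cost <- 0
  for (k in seq_along(changepoints)) {
    seg <- y[starts[k]:changepoints[k]]
    # the best segment parameter for the quadratic cost is the segment mean
    fit_cost <- fit_cost + sum((seg - mean(seg))^2)
  }
  # each change between consecutive segments pays the penalty once
  fit_cost + beta * (length(changepoints) - 1)
}

y <- c(0, 0, 0, 5, 5, 5)
cost_one_segment <- segmentation_cost(y, changepoints = 6, beta = 2)
cost_two_segments <- segmentation_cost(y, changepoints = c(3, 6), beta = 2)
```

On this toy series the two-segment solution is far cheaper than the single segment, even after paying the penalty, which is the trade-off the optimization resolves.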
The cost function can take three different forms for parameter x and constants (A, B, C):
quadratic, with representation Ax^2 + Bx + C with x in R
log-linear, with representation Ax - B log(x) + C with x \ge 0
log-log, with representation -A log(x) - B log(1-x) + C with 0 \le x \le 1
For each optimization problem, we consider a unique cost representation. However, the user can define robustness values (K and a) specific to each edge, making the cost function edge-dependent. We give the atomic form of each of the five available types (for one data point of value y with weight w):
"mean": A = w, B = -2wy, C = wy^2
"variance": A = wy^2, B = w, C = 0
"poisson": A = w, B = wy, C = 0
"exp": A = wy, B = w, C = 0
"negbin": A = w, B = wy, C = 0
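The coefficients listed above can be checked with a small base R sketch. The helper atomic_coef is hypothetical (it is not a gfpop function) and ignores the robustness values K and a. It assumes that the cost of a segment is the sum of the atomic costs of its data points, so the segment coefficients are sums of the atomic ones.

```r
# Hypothetical helper (not part of gfpop): atomic (A, B, C) coefficients
# for one data point y with weight w, following the list above.
atomic_coef <- function(y, w = 1, type = "mean") {
  switch(type,
    mean     = c(A = w,       B = -2 * w * y, C = w * y^2),
    variance = c(A = w * y^2, B = w,          C = 0),
    poisson  = c(A = w,       B = w * y,      C = 0),
    exp      = c(A = w * y,   B = w,          C = 0),
    negbin   = c(A = w,       B = w * y,      C = 0),
    stop("unknown type: ", type)
  )
}

# Costs add up over a segment, so segment coefficients are sums of atomic ones.
y <- c(2, 4, 3, 7)
seg_mean <- Reduce(`+`, lapply(y, atomic_coef, type = "mean"))
seg_pois <- Reduce(`+`, lapply(y, atomic_coef, type = "poisson"))

# Quadratic Ax^2 + Bx + C is minimized at x = -B / (2A).
best_mean <- unname(-seg_mean["B"] / (2 * seg_mean["A"]))
# Log-linear Ax - B log(x) + C is minimized at x = B / A.
best_pois <- unname(seg_pois["B"] / seg_pois["A"])
```

With unit weights, both minimizers equal the sample mean of the segment, as expected for the Gaussian mean and Poisson rate models.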
a gfpop object = (changepoints, states, forced, parameters, globalCost)
changepoints is the vector of changepoints (we give the last element of each segment)
states is the vector giving the state of each segment
forced is the vector specifying whether the constraints of the graph are active (= TRUE) or not (= FALSE)
parameters is the vector of successive parameters of each segment
globalCost is a number equal to the total loss: the minimal cost for the optimization problem with all penalty values excluded
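Because changepoints stores the last index of each segment, a fitted piecewise-constant signal can be rebuilt from changepoints and parameters. The sketch below uses a plain list with made-up values shaped like the fields described above; it was not produced by running gfpop, and the names res, seg_lengths and fitted_signal are only illustrative.

```r
# Toy values shaped like the documented output fields; they are made up
# for illustration and were not produced by running gfpop.
res <- list(changepoints = c(3, 7, 10), parameters = c(1.0, 4.5, 2.0))

# changepoints are segment end indices, so segment lengths are their differences
seg_lengths <- diff(c(0, res$changepoints))
# repeat each segment parameter over its segment to get one fitted value per data point
fitted_signal <- rep(res$parameters, times = seg_lengths)
```

The resulting vector has the same length as the data and can be overlaid on it to inspect the segmentation.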
gfpop::dataGenerator() to generate data for multiple change-point simulations
gfpop::graph() to create graphs complying with the gfpop function
gfpop::plot() to plot the gfpop object and visualize inferred changepoints and parameters
library(gfpop)
n <- 1000 # data length
### EXAMPLE 1 ### updown graph + poisson loss
myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 2, 1, 3, 1), type = "poisson")
myGraph <- graph(penalty = 2 * sdDiff(myData)^2 * log(n), type = "updown")
gfpop(data = myData, mygraph = myGraph, type = "poisson")
### EXAMPLE 2 ### relevant graph with min gap = 2 + poisson loss
myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 2, 3, 5, 3), type = "poisson")
myGraph <- graph(type = "relevant", penalty = 2 * log(n), gap = 2)
gfpop(data = myData, mygraph = myGraph, type = "poisson")
### EXAMPLE 3 ### std graph with robust loss + variance loss
myData <- dataGenerator(n, c(0.1, 0.3, 0.5, 0.8, 1), c(1, 5, 1, 5, 1), type = "variance")
outliers <- 5 * rbinom(n, 1, 0.05) - 5 * rbinom(n, 1, 0.05)
### with robust parameter K
myGraph <- graph(type = "std", penalty = 2 * log(n), K = 10)
gfpop(data = myData + outliers, mygraph = myGraph, type = "variance")
### no K
myGraph <- graph(type = "std", penalty = 2 * log(n))
gfpop(data = myData, mygraph = myGraph, type = "variance")
### EXAMPLE 4 ### 3-segment graph with mean (Gaussian) loss
myData <- dataGenerator(n, c(0.12, 0.31, 0.53, 0.88, 1), c(1, 2, 0, 1, 2), type = "mean")
outliers <- 5 * rbinom(n, 1, 0.05) - 5 * rbinom(n, 1, 0.05)
gfpop(data = myData + outliers, mygraph = paperGraph(8, penalty = 2 * log(n)), type = "mean")