cptga: Genetic algorithm
In changepointGA: Changepoint Detection via Modified Genetic Algorithm

cptga

R Documentation

Genetic algorithm

Description

Perform the genetic algorithm for changepoint detection. This involves the minimization of an objective function using a genetic algorithm (GA). The algorithm can be run sequentially or with explicit parallelization.

Usage

cptga(
  ObjFunc,
  N,
  prange = NULL,
  popSize = 200,
  pcrossover = 0.95,
  pmutation = 0.3,
  pchangepoint = 0.01,
  minDist = 1,
  mmax = NULL,
  lmax = NULL,
  maxgen = 50000,
  maxconv = 5000,
  option = "cp",
  monitoring = FALSE,
  parallel = FALSE,
  nCore = NULL,
  tol = 1e-05,
  seed = NULL,
  popInitialize = "random_population",
  suggestions = NULL,
  selection = "selection_linearrank",
  crossover = "uniformcrossover",
  mutation = "mutation",
  ...
)

Arguments

`ObjFunc`	The fitness function to be minimized. Users can specify any R or Rcpp function as the fitness function, setting the input as the potential solution to the optimization problem and returning a numerical value as the output/fitness. Depending on the user-specified chromosome representation, the optimization task can be changepoint detection only or changepoint detection plus model order selection, which can be specified via the `option` parameter. When `option="both"`, the list `prange` must be specified to give the range of model orders.
`N`	The sample size of the time series.
`prange`	The default value is `NULL` for changepoint detection only task. If both model order selection and changepoint detection are required, the `prange` argument should be provided as a list. Each element in this list must specify the range for a corresponding model order parameter, and the length of the list object of `prange` must match the number of order parameters to be estimated.
`popSize`	An integer represents the number of individuals/chromosomes in each population.
`pcrossover`	The probability that the crossover operator will apply to the two selected parents' chromosomes to produce the offspring. The typical value is close to 1, with the default setting in this package being 0.95.
`pmutation`	The probability that the mutation operator applies on one individual chromosome. Similar to the natural mutation process, new genetic information is introduced to the offspring chromosome with a relatively small probability (close to 0), with a default value of 0.3.
`pchangepoint`	The probability that a changepoint has occurred with a default value of 0.01. User could change this probability based on domain knowledge and the time series length. This probability is used during population initialization and in the creation of new chromosomes by the mutation operator. By default, the mutation operator function generates a new individual as the mutated offspring.
`minDist`	The minimum length between two adjacent changepoints. Default value equals to one.
`mmax`	The maximum number of changepoints allowed in the time series data corresponds to the maximum possible length of `\boldsymbol{\tau}`. For a time series of length 1000 and we only want to detect the changepoint (`option="cp"`), the default value is 499. The suggested value should be based on the length of the time series. For instance, if a time series has length `N`, the recommended `mmax` should be `N/2-1`. It is suggested to add the number of model hyperparameters if both changepoint detection and model order selection tasks are of-interested simultaneously (`option="both"`).
`lmax`	The maximum possible length of the chromosome representation. For a time series of length 1000 and we only want to detect the changepoint (`option="cp"`), the default value is 501. The suggested value should be based on the length of the time series. For instance, if a time series has length `N`, the recommended `lmax` should be `2+N/2-1`. It is suggested to add the number of model hyperparameters if both changepoint detection and model order selection tasks are of-interested simultaneously (`option="both"`).
`maxgen`	The maximum number of generations the GA can run before the search is forcibly terminated.
`maxconv`	If the overall best fitted value doesn't change after `maxconv` consecutive migrations, the GA algorithm stops.
`option`	A character string controls the optimization task. `"cp"` indicates the task is changepoint detection only. `"both"` indicates the task will include both changepoint detection and model order selection.
`monitoring`	A logical value with either `TRUE` or `FALSE`, indicating whether to print out summarized results (current best fitness function value and its corresponding $C$) for each generation from the GA.
`parallel`	A logical value with either `TRUE` or `FALSE`, indicating whether to use multiple cores for parallel computation of the fitness function values for individuals after population initialization.
`nCore`	An integer with the default value of `NULL`. It represents the number of cores used in parallel computing. The value of `nCore` must be less than the number of physical cores available.
`tol`	An numerical value with the default value of `1e-05`. The tolerance level that helps evaluate whether the two iterations have the same fitness value, which aids in determining GA termination.
`seed`	An integer with the default value of `NULL`. An single integer allows function produce reproducible results.
`popInitialize`	A function or sourced function name character string. It should be designed for initializing a population. The default population initialization is random initialization with some imposed constraints. See `random_population` for example. The function returned object is a matrix, `pop`. The users can specified their own `population` function. It could also be a matrix object, which contain the user specified chromosome. By default, each column represents one individual chromosome. See `random_population` for details.
`suggestions`	A list object. Default value is `NULL`. Each element includes better or more reasonable guessed changepoints locations, which will resulting in one chromosome. If the number of provided suggestions equals to `popSize`, all the suggestions will be used as population. If the number of provided suggestions is less than `popSize`, the function from `popInitialize` will generate the remaining individual chromosomes. The number of provided suggestions cannot be greater than `popSize`. Having better `suggestions` can help GA converges faster.
`selection`	A function or sourced function name character string. This GA operator can help select `mom` and `dad` from current generation population, where `dad` is set to have better fit (smaller fitness function values). The default for selection uses the linear rank selection method. See `selection_linearrank` for example. The function returned object is a list contain the chromosomes for `mom` and `dad`.
`crossover`	A function or sourced function name character string. This GA operator can apply crossover to the chosen parents to produce child for next generation with specified probability. The default for crossover uses the uniform crossover method. See `uniformcrossover` for details in the default crossover operator. The function returned object is a vector contain the chromosomes for `child`.
`mutation`	A function or sourced function name character string. This GA operator can apply mutation to the produced `child` with the specified probability `pmutation`. See `mutation` for details in the default mutation operator. The function returned object is a vector contain `child` chromosome representation.
`...`	additional arguments that will be passed to the fitness function.

Details

For any pre-specified time series model with a specified set of changepoint locations, model fit is evaluated using a fitness function Q(\boldsymbol{\theta}), where \boldsymbol{\theta}=(\boldsymbol{s},\boldsymbol{\tau},\boldsymbol{\beta})' denotes the full parameter vector. Here, \boldsymbol{s} denotes the set of model hyperparameters, potentially including the AR or MA orders, the degree of ARIMA differencing, or the periodic AR order for PAR models. The vector \boldsymbol{\tau}=\{\tau_{1}, \ldots, \tau_{m}\} specifies the changepoint locations, with the number of changepoints m inferred as part of the estimation. Each individual chromosome representation is specified as a vector,

C = (m, \boldsymbol{s}, \boldsymbol{\tau}, N+1)',

where m represents the number of changepoints and is also equivalent to the length of vector \boldsymbol{\tau} containing all the changepoint locations. \boldsymbol{s} contains the integer-valued parameter for time series model structure specification, such as AR, MA, or PAR orders. If no model order selection is desired, then \boldsymbol{s} is omitted and the GA detects changepoints only. The changepoint locations in \boldsymbol{\tau} are encoded as interger values between 2 and N, allowing the length of \boldsymbol{\tau} to vary dynamically with m. The value N+1 is appended at the end of C to serve as a delimiter marking the boundary of the chromosome.

Value

Return an object class cptga-class. See cptga-class for a more detailed description.

Examples



N <- 1000
XMatT <- matrix(1, nrow = N, ncol = 1)
Xt <- ts.sim(
  beta = 0.5, XMat = XMatT, sigma = 1, phi = 0.5, theta = NULL,
  Delta = c(2, -2), CpLoc = c(250, 750), seed = 1234
)

## Multiple changepoint detection without model order selection

# without suggestions
GA.res <- cptga(ObjFunc = ARIMA.BIC, N = N, XMat = XMatT, Xt = Xt)
summary(GA.res)
plot(GA.res, data = Xt)

# with suggestions
suggestions <- list(NULL, 250, c(250, 500), c(250, 625), c(250, 500, 750))
GA.res <- cptga(ObjFunc = ARIMA.BIC, N = N, suggestions = suggestions, XMat = XMatT, Xt = Xt)
summary(GA.res)
plot(GA.res, data = Xt)


## Multiple changepoint detection with model order selection

prange <- list(ar = c(0, 3), ma = c(0, 3))

# without suggestions
GA.res <- cptga(
  ObjFunc = ARIMA.BIC.Order, N = N, prange = prange,
  option = "both", XMat = XMatT, Xt = Xt
)
summary(GA.res)
plot(GA.res, data = Xt)

# with suggestions
suggestions <- list(NULL, 250, c(250, 500), c(250, 625), c(250, 500, 750))
GA.res <- cptga(
  ObjFunc = ARIMA.BIC.Order, N = N, prange = prange,
  suggestions = suggestions, option = "both", XMat = XMatT, Xt = Xt
)
summary(GA.res)
plot(GA.res, data = Xt)

changepointGA documentation built on Nov. 5, 2025, 6:54 p.m.