hdcd: High Dimensional Changepoint Detection


View source: R/hdcd.R

Description

High Dimensional Changepoint Detection

Usage

hdcd(x, delta = 0.1, lambda = NULL, lambda_min_ratio = 0.01,
  lambda_grid_size = 10, gamma = NULL, method = c("nodewise_regression",
  "summed_regression", "ratio_regression"), penalize_diagonal = F,
  optimizer = c("line_search", "section_search"), control = NULL,
  standardize = T, threshold = 1e-07, n_folds = 10, verbose = T,
  parallel = T, FUN = NULL, ...)

Arguments

x

An n times p matrix or data frame.

delta

Numeric value between 0 and 0.5. This tuning parameter determines the minimal segment size as a proportion of the size of the dataset and hence an upper bound for the number of changepoints (roughly 1/δ).
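
For intuition, a quick sketch with hypothetical values of n and delta (chosen here purely for illustration):

n <- 1000; delta <- 0.1
delta * n    # minimal segment size: 100 observations
1 / delta    # rough upper bound on the number of changepoints: 10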

lambda

Positive numeric value. This is the regularization parameter in the single Lasso fits. This value is ignored if FUN is not NULL.

lambda_min_ratio

Numeric value between 0 and 1. If λ_max is determined internally, λ_min will be set to lambda_min_ratio * λ_max.

lambda_grid_size

Integer value determining the number of values between λ_min and λ_max that will be equally spaced on a logarithmic scale.
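
A minimal sketch of how such a logarithmically spaced grid can be constructed; lambda_max is a placeholder here, since its internal computation is not documented on this page, and lambda_min_ratio = 0.01 and lambda_grid_size = 10 are the defaults above:

lambda_max <- 1    # placeholder; hdcd determines lambda_max internally
lambda_grid <- exp(seq(log(lambda_max), log(0.01 * lambda_max), length.out = 10))
lambda_grid        # 10 values, equally spaced on the log scale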

gamma

Numeric value or vector. If NULL, the full solution path for gamma will be calculated for every combination of λ and δ.

method

Which estimator should be used? Possible choices are

  • nodewise_regression: Nodewise regression is based on a single node that needs to be specified with the additional parameter node, pointing to the column index of the node of interest (a call sketch follows this list). Uses glmnet internally. See Kovács (2016) for details.

  • summed_regression: Summed nodewise regression sums up the residual variances of nodewise regression over all nodes. Uses glasso internally. See Kovács (2016) for details.

  • ratio_regression: Likelihood ratio based regression sums the pseudo-profile-likelihood over all nodes. Uses glasso internally. See Kovács (2016) for details.

  • glasso: The graphical Lasso uses the approach of Friedman et al. (2007). In contrast to the other approaches, the exact likelihood of the whole graphical model is computed and used as loss.

This value is ignored if FUN is not NULL.
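
As an illustration of the extra node argument for nodewise_regression, a hedged call sketch (dat as created in the Examples section below; node = 1 is an arbitrary choice):

## Not run: 
hdcd(dat, delta = 0.1, method = "nodewise_regression", node = 1, verbose = TRUE)

## End(Not run)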

penalize_diagonal

Boolean, should the diagonal elements of the precision matrix be penalized by λ? This value is ignored if FUN is not NULL.

optimizer

Which search technique should be used for performing individual splits in the binary segmentation algorithm? Possible choices are

  • line_search: Exhaustive linear search. All possible split candidates are evaluated and the index with maximal loss reduction is returned.

  • section_search: Iteratively cuts the search space by a flexible ratio determined by the parameter stepsize in the control parameter list and approximately finds an index at a local maximum. See Haubner (2018) for details.

control

A list of parameters that is accessed by the selected optimizer (a call sketch follows this list):

  • stepsize: Numeric value between 0 and 0.5. Used by section search.

  • min_points: Integer value larger than 3. Used by section search.
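
A sketch of passing these control parameters when section_search is selected; the specific values are illustrative, not recommendations:

ctrl <- list(stepsize = 0.4, min_points = 10)
## Not run: 
hdcd(dat, method = "summed_regression", optimizer = "section_search", control = ctrl)

## End(Not run)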

standardize

Boolean. If TRUE the penalty parameter λ will be adjusted for every dimension in the single Lasso fits according to the standard deviation in the data.

threshold

The threshold for halting the iteration in glasso or glmnet. In the former it controls the absolute change of single parameters; in the latter it controls the total objective value. This value is ignored if FUN is not NULL.

n_folds

Number of folds. Test data will be selected equispaced, i.e. every n_fold-th observation.
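
A sketch of how this equispaced selection can be read, with hypothetical n; the exact internal fold construction is an assumption here:

n <- 100; n_folds <- 10
fold_id <- rep_len(seq_len(n_folds), n)
which(fold_id == 1)   # test indices of the first fold: 1, 11, 21, ...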

verbose

Boolean. If TRUE additional information will be printed.

parallel

If TRUE and a parallel backend is registered, the cross-validation will be performed in parallel.
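
One common way to register a parallel backend in R is via doParallel; whether hdcd picks up this particular backend mechanism is an assumption based on the wording above:

## Not run: 
library(doParallel)
registerDoParallel(cores = 2)   # register a backend with 2 workers
hdcd(dat, method = "summed_regression", parallel = TRUE)

## End(Not run)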

FUN

A loss function with formal arguments x, n_obs and standardize which returns a scalar representing the loss for the segment the function is applied to.
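
A minimal sketch of a custom loss with the stated formal arguments; the particular loss used here (sum of squared deviations from the segment column means) is purely illustrative:

my_loss <- function(x, n_obs, standardize) {
  if (standardize) x <- scale(x)     # optionally standardize the columns
  # scalar loss for this segment; n_obs is available if the loss needs it
  sum(sweep(x, 2, colMeans(x))^2)
}
## Not run: 
hdcd(dat, FUN = my_loss)

## End(Not run)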

...

Supply additional arguments for a specific method (e.g. node for nodewise_regression) or for your own loss function FUN.

Value

For a single fit a list with elements

changepoints

A numeric list with the indices of the changepoints

tree

The fully grown binary tree

For cross-validation a list with elements


changepoints

A numeric list with the indices of the changepoints

cv_results

A multi-dimensional array with the cross-validation results

cv_gamma

Best gamma value

cv_lambda

Best lambda value

cv_delta

Best delta value

If only a single fit was performed, a list with the found changepoints as well as the fully grown binary tree is returned. For cross-validation, a list with the found changepoints, the optimal parameter values and the full results is returned.

Examples

dat <- SimulateFromModel(CreateModel(n_segments = 2,n = 100,p = 30, ChainNetwork))
## Not run: 
hdcd(dat, 0.1, 0.1, 0.05, method = "summed_regression", verbose = T)

## End(Not run)
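
Continuing the example above, the returned elements documented under Value could be inspected as follows; a cross-validated fit would additionally carry cv_results, cv_gamma, cv_lambda and cv_delta:

## Not run: 
fit <- hdcd(dat, 0.1, 0.1, 0.05, method = "summed_regression", verbose = T)
fit$changepoints   # estimated changepoint indices
fit$tree           # fully grown binary tree

## End(Not run)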
