R/RcppExports.R

Defines functions focus_offline generate_projection_indexes detector_candidates detector_info_sn detector_info_n detector_pieces_len get_statistics detector_update detector_create

Documented in detector_candidates detector_create detector_info_n detector_info_sn detector_pieces_len detector_update focus_offline generate_projection_indexes get_statistics

# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

#' Create a FOCuS changepoint detector
#'
#' Creates an online (sequential) changepoint detector object that provides
#' a step-by-step interface to the FOCuS algorithm. Each call to
#' \code{\link{detector_update}()} adds new data, and \code{\link{get_statistics}()} computes
#' the current test statistic and detection result.
#'
#' @param type Character string specifying detector type. One of:
#'   \itemize{
#'     \item \code{"univariate"}: Two-sided univariate detection
#'     \item \code{"univariate_one_sided"}: One-sided univariate detection
#'     \item \code{"multivariate"}: Multivariate detection with projections
#'     \item \code{"npfocus"}: Nonparametric detection (NP-FOCuS). See details.
#'     \item \code{"arp"}: AutoRegressive Process detection. Requires \code{rho} parameter.
#'   }
#' @param dim_indexes List of integer vectors specifying projection index sets
#'   for high-dimensional multivariate detectors. Not required for sequences of dimentions less than 5.
#'    Each element is a vector of 0-based column indices. Default is \code{NULL}.
#' @param quantiles Numeric vector of quantiles for nonparametric
#'   (\code{"npfocus"}) detectors. Required when \code{type = "npfocus"}.
#'   Default is \code{NULL}.
#' @param pruning_mult Integer. Candidate pruning multiplier parameter.
#'   Default is 2.
#' @param pruning_offset Integer. Candidate pruning offset parameter.
#'   Default is 1.
#' @param side Character string. For one-sided detectors, either \code{"right"}
#'   (detects increases) or \code{"left"} (detects decreases). Default is
#'   \code{"right"}.
#' @param anomaly_intensity Numeric scalar. Anomaly intensity threshold for
#'   pruning candidates. Only candidates with sufficient signal magnitude are
#'   retained. Default is \code{NULL} (disabled).
#' @param rho Numeric vector. AR coefficients for AutoRegressive Process (ARP)
#'   detectors. Required when \code{type = "arp"}. Default is \code{NULL}.
#' @param mu0_arp Numeric scalar. Pre-change mean for ARP detectors (optional).
#'   When provided, enables more efficient pruning by filtering candidates based on
#'   the known pre-change parameter. Only used when \code{type = "arp"}.
#'   Default is \code{NULL}.
#'
#' @return An external pointer (SEXP) to the detector object. This should be
#'   passed to other detector functions like \code{\link{detector_update}()} and
#'   \code{\link{get_statistics}()}.
#'
#' @details
#' The detector maintains sufficient statistics internally and uses pruning
#' to efficiently track candidate changepoints. The \code{pruning_mult} and
#' \code{pruning_offset} parameters control the pruning strategy.
#'
#' AutoRegressive Process (ARP):
#' When \code{type = "arp"}, the \code{rho} parameter must be provided as a numeric
#' vector of AR coefficients (lag-1, lag-2, ..., lag-p). The detector then computes
#' statistics optimal for detecting changepoints in AR(p) processes. Use
#' \code{get_statistics(family = "arp")} to retrieve the test statistics.
#' The optional \code{theta0} parameter specifies the pre-change mean and is tied
#' to the pruning logic: if provided, it enables more efficient pruning by allowing
#' the algorithm to filter candidates based on the known pre-change parameter.
#'
#' High-dimentional multivariate detectors:
#' For high-dimensional multivariate detection, computing the full hull would be too prohibitive.
#' Additionally, the complexity is expected to be log(n)^p, where n is the number of iterations and p is the
#' dimensions. So for short, high-dimentional sequences, it is possible to reconstruct the set of changepoint
#' locations by approximating the hull on projections in smaller dimentions.  \code{dim_indexes} specifies which dimensions
#' to use for each projection of the convex hull for pruning. Use \code{generate_projection_indexes()} to
#' generate systematic projection sets.
#'
#' NPFOCuS:
#'   For non-parametric detection one needs to set \code{detector(type = "npfocus")} and the cost can be computed as \code{get_statistics(family = "npfocus")}.
#'   Any other cost will not work with this detector type. The \code{quantiles} vector argument is required, see \code{\link{get_statistics}()} for details.
#'
#' @examples
#' # Univariate detector
#' det <- detector_create(type = "univariate")
#' detector_update(det, 0.5)
#' detector_update(det, 1.2)
#' r <- get_statistics(det, family = "gaussian")
#' print(r)
#'
#' ## Online (sequential) example
#' # Generate data with a changepoint
#' set.seed(123)
#' Y <- c(rnorm(500, mean = 0), rnorm(500, mean = 1))
#' det <- detector_create(type = "univariate")
#' stat_trace <- numeric(length(Y))
#' threshold <- 20
#' for (i in seq_along(Y)) {
#'   detector_update(det, Y[i])
#'   r <- get_statistics(det, family = "gaussian")
#'   stat_trace[i] <- r$stat
#'   if (!is.null(r$stat) && r$stat > threshold) {
#'     cat("Online detection at", i, "estimate tau =", r$changepoint, "\n")
#'     plot(stat_trace[1:i], type = "l", ylab = "Test Statistic", xlab = "Time")
#'     break
#'   }
#' }
#'
#' # Multivariate detector with projections
#' dim_indexes <- list(c(0,1), c(1,2), c(0,2))  # 0-based indices
#' det_mv <- detector_create(type = "multivariate", dim_indexes = dim_indexes)
#' detector_update(det_mv, c(0.5, 1.2, -0.3))
#'
#' # Nonparametric detector
#' quants <- qnorm(c(0.25, 0.5, 0.75))
#' det_np <- detector_create(type = "npfocus", quantiles = quants)
#'
#' # One-sided univariate detector
#' det_one_sided <- detector_create(type = "univariate_one_sided", side = "left")
#'
#' @export
detector_create <- function(type, dim_indexes = NULL, quantiles = NULL, pruning_mult = 2L, pruning_offset = 1L, side = "right", anomaly_intensity = NULL, rho = NULL, mu0_arp = NULL) {
    .Call(`_focus_detector_create`, type, dim_indexes, quantiles, pruning_mult, pruning_offset, side, anomaly_intensity, rho, mu0_arp)
}

#' Update detector with new observation(s)
#'
#' Adds new observation(s) to the detector's internal state and updates
#' sufficient statistics.
#'
#' @param info_ptr External pointer to detector created by
#'   \code{\link{detector_create}()}.
#' @param y Numeric vector of new observation(s). For univariate detectors,
#'   this should be a scalar (length-1 vector). For multivariate detectors,
#'   this should be a vector matching the number of dimensions.
#'
#' @return
#' An external pointer to the detector (the same object that was passed in).
#' The detector is updated *in place* — no copy is made — so this return value
#' is provided only for convenience, for example when using the native R pipe
#' operator (`|>`) e.g.,
#' \code{det |> detector_update(y) |> get_statistics()}.
#'
#' @examples
#' # Univariate example
#' det <- detector_create(type = "univariate")
#' detector_update(det, 0.5)
#' detector_update(det, 1.2)
#'
#' # Multivariate example
#' det_mv <- detector_create(type = "multivariate")
#' detector_update(det_mv, c(0.5, 1.2, -0.3))
#'
#' ## Online (sequential) example
#' # Generate data with a changepoint
#' set.seed(123)
#' Y <- c(rnorm(500, mean = 0), rnorm(500, mean = 1))
#' det <- detector_create(type = "univariate")
#' stat_trace <- numeric(length(Y))
#' threshold <- 20
#' for (i in seq_along(Y)) {
#'   detector_update(det, Y[i])
#'   r <- get_statistics(det, family = "gaussian")
#'   stat_trace[i] <- r$stat
#'   if (!is.null(r$stat) && r$stat > threshold) {
#'     cat("Online detection at", i, "estimate tau =", r$changepoint, "\n")
#'     plot(stat_trace[1:i], type = "l", ylab = "Test Statistic", xlab = "Time")
#'     break
#'   }
#' }
#'
#'
#' @export
detector_update <- function(info_ptr, y) {
    .Call(`_focus_detector_update`, info_ptr, y)
}

#' Compute current changepoint statistics
#'
#' Computes the current changepoint test statistic and detection result based
#' on all observations processed so far.
#'
#' @param info_ptr External pointer to detector created by
#'   \code{\link{detector_create}()}.
#' @param family Character string specifying the distribution family:
#'   \itemize{
#'     \item \code{"gaussian"}: Gaussian (normal) distribution
#'     \item \code{"poisson"}: Poisson distribution
#'     \item \code{"bernoulli"}: Bernoulli (binary) distribution
#'     \item \code{"gamma"}: Gamma distribution (requires \code{shape} parameter)
#'     \item \code{"npfocus"}: Nonparametric detection (see details)
#'     \item \code{"arp"}: AutoRegressive Process detection (requires detector created with \code{type = "arp"})
#'   }
#' @param theta0 Numeric vector specifying the null hypothesis parameter.
#'   For univariate detectors: scalar (length-1 vector).
#'   For multivariate detectors: vector matching the number of dimensions.
#'   Default is \code{NULL}.
#' @param shape Numeric scalar. Shape parameter for gamma distribution.
#'   Required and must be positive when \code{family = "gamma"}.
#'   Default is \code{NULL}.
#'
#' @return A list with components:
#'   \item{stopping_time}{Integer. Current time index (number of observations
#'     processed).}
#'   \item{changepoint}{Integer or NULL. Detected changepoint location
#'     (1-based index), or NULL if no changepoint detected.}
#'   \item{stat}{Numeric scalar or vector. Test statistic(s). For univariate
#'     detectors, a scalar. For multivariate detectors, a vector of statistics
#'     for each projection.}
#'
#' @details
#' The function computes a log-likelihood ratio test statistic comparing the
#' null hypothesis (no change) against the alternative (a change at the optimal
#' location). The statistic is typically compared against a threshold to
#' determine if a changepoint should be declared.
#'
#' Gamma family:
#' When \code{family = "gamma"} a positive \code{shape} parameter must
#' be provided; otherwise an error is raised. Passing \code{shape} for a
#' non-gamma family will be ignored (and in some interfaces will trigger
#'                                     a warning).
#'
#' NPFOCuS:
#'   For non-parametric detection one needs to set \code{detector(type = "npfocus")} and the cost can be computed as \code{get_statistics(family = "npfocus")}.
#'   The \code{quantiles} vector argument is required. NPFOCuS returns two
#' statistics (sum and max over quantiles) as a vector; in the offline interface
#' \code{stat} will be a matrix with two columns.
#'
#' AutoRegressive Process (ARP):
#' For ARP detection, use \code{family = "arp"} with a detector created via
#' \code{detector_create(type = "arp", rho = ...)}. The AR coefficients (rho)
#' are already built into the detector at creation time. The optional \code{theta0}
#' parameter specifies the pre-change mean (if known) and is used for pruning logic;
#' if not provided (\code{NULL}), pruning operates without this information.
#' Returns a scalar test statistic optimized for detecting changepoints in AR processes.
#'
#' @examples
#'
#' ## Online (sequential) example
#' # Generate data with a changepoint
#' set.seed(123)
#' Y <- c(rnorm(500, mean = 0), rnorm(500, mean = 1))
#' det <- detector_create(type = "univariate")
#' stat_trace <- numeric(length(Y))
#' threshold <- 20
#' for (i in seq_along(Y)) {
#'   detector_update(det, Y[i])
#'   r <- get_statistics(det, family = "gaussian")
#'   stat_trace[i] <- r$stat
#'   if (!is.null(r$stat) && r$stat > threshold) {
#'     cat("Online detection at", i, "estimate tau =", r$changepoint, "\n")
#'     plot(stat_trace[1:i], type = "l", ylab = "Test Statistic", xlab = "Time")
#'     break
#'   }
#' }
#'
#' ## Note that multiple models can be tested simultaneously on the same detector
#' # as the statistics is independent of the detector state.
#' # For example, testing both Gaussian and Poisson costs
#' set.seed(2024)
#' # Generate Poisson count data with a rate change
#' Y_counts <- c(rpois(500, lambda = 10), rpois(500, lambda = 15))
#' # Compute full trajectories for comparison
#' det2 <- detector_create(type = "univariate")
#' stat_gaussian <- numeric(length(Y_counts))
#' stat_poisson <- numeric(length(Y_counts))
#'
#' for (i in seq_along(Y_counts)) {
#'   detector_update(det2, Y_counts[i])
#'   stat_gaussian[i] <- get_statistics(det2, family = "gaussian")$stat
#'   stat_poisson[i] <- get_statistics(det2, family = "poisson", theta0 = 10)$stat
#' }
#'
#' # Plot comparison
#' oldpar <- par(mfrow = c(1, 2))
#' plot(stat_gaussian, type = "l", main = "Gaussian Statistic on Poisson Data",
#'      xlab = "Time", ylab = "Statistic", lwd = 2, col = "blue")
#' abline(v = 500, col = "green", lty = 3, lwd = 2)
#'
#' plot(stat_poisson, type = "l", main = "Poisson Statistic on Poisson Data",
#'      xlab = "Time", ylab = "Statistic", lwd = 2, col = "red")
#' abline(v = 500, col = "green", lty = 3, lwd = 2)
#' par(oldpar)
#'
#'
#' @export
get_statistics <- function(info_ptr, family, theta0 = NULL, shape = NULL) {
    .Call(`_focus_get_statistics`, info_ptr, family, theta0, shape)
}

#' Get number of candidate segments
#'
#' Returns the number of candidate changepoint segments currently tracked
#' by the detector.
#'
#' @param info_ptr External pointer to detector created by
#'   \code{\link{detector_create}()}.
#'
#' @return Integer. Number of candidate segments.
#'
#' @details
#' The FOCuS algorithm maintains a set of candidate segments that could
#' potentially contain changepoints. This number grows with time but is
#' controlled by the pruning parameters.
#'
#' @export
detector_pieces_len <- function(info_ptr) {
    .Call(`_focus_detector_pieces_len`, info_ptr)
}

#' Get number of observations processed
#'
#' Returns the total number of observations processed by the detector.
#'
#' @param info_ptr External pointer to detector created by
#'   \code{\link{detector_create}()}.
#'
#' @return Integer. Number of observations processed (current time index).
#'
#' @export
detector_info_n <- function(info_ptr) {
    .Call(`_focus_detector_info_n`, info_ptr)
}

#' Get cumulative sum statistic
#'
#' Returns the current cumulative sum statistic maintained by the detector.
#'
#' @param info_ptr External pointer to detector created by
#'   \code{\link{detector_create}()}.
#'
#' @return Numeric vector. Cumulative sum statistic. For univariate detectors,
#'   a scalar (length-1 vector). For multivariate detectors, a vector of
#'   length equal to the number of dimensions.
#'
#' @export
detector_info_sn <- function(info_ptr) {
    .Call(`_focus_detector_info_sn`, info_ptr)
}

#' Get candidate segments
#'
#' Returns detailed information about all candidate changepoint segments
#' currently tracked by the detector.
#'
#' @param info_ptr External pointer to detector created by
#'   \code{\link{detector_create}()}.
#'
#' @return A data frame (tibble) with columns:
#'   \item{tau}{Integer vector. Candidate changepoint locations (0-based indices).}
#'   \item{st}{List of numeric vectors. Sufficient statistics for each
#'     candidate segment (e.g., cumulative sums of the data).}
#'   \item{side}{Character vector. Side indicator for each candidate
#'     (relevant for one-sided detectors).}
#'
#' @details
#' Each row represents a candidate segment from time \code{tau} to the current
#' time. The sufficient statistics in \code{st} are used to efficiently compute
#' test statistics without reprocessing the data.
#'
#' @export
detector_candidates <- function(info_ptr) {
    .Call(`_focus_detector_candidates`, info_ptr)
}

#' Generate projection index sets
#'
#' Generates projection index sets for high-dimensional multivariate detectors
#' using circular combinations.
#'
#' @param D Integer. Total number of dimensions.
#' @param p Integer. Projection subset size (number of dimensions per projection).
#'
#' @return A list of integer vectors. Each element is a vector of 0-based
#'   column indices representing one projection.
#'
#' @details
#' This function generates systematic projection sets for use with multivariate
#' detectors. The circular combination approach ensures good coverage of the
#' dimensional space while keeping the number of projections manageable.
#'
#' @examples
#' \donttest{
#' # Generate 2-dimensional projections from 5 dimensions
#' proj <- generate_projection_indexes(D = 5, p = 2)
#' print(proj)
#'
#' # Use with multivariate detector
#' det <- detector_create(type = "multivariate", dim_indexes = proj)
#' set.seed(42)
#' p <- 5
#'
#' # Create data: changepoint at t=1000
#' Y_multi <- rbind(
#'     matrix(rnorm(1000 * p, mean = -1, 1), ncol = p),
#'     matrix(rnorm(500 * p, mean = 1.2), ncol = p)
#' )
#'
#' # Full multivariate detection
#' system.time(
#' res_multi <- focus_offline(Y_multi, threshold = Inf,
#'                            type = "multivariate", family = "gaussian")
#' )
#'
#' # Low-dimensional projection approximation
#' dim_indexes <- generate_projection_indexes(5, 2)
#' system.time(
#' res_multi_approx <- focus_offline(Y_multi, threshold = Inf,
#'                                   type = "multivariate", family = "gaussian",
#'                                   dim_indexes = dim_indexes)
#' )
#'
#' # Verify similarity
#' all.equal(res_multi$stat, res_multi_approx$stat)
#' }
#'
#' @export
generate_projection_indexes <- function(D, p) {
    .Call(`_focus_generate_projection_indexes`, D, p)
}

#' @name focus_offline
#' @title Run FOCuS detector in offline batch mode
#'
#' @description
#' Processes all data at once and returns detection results and trajectories.
#' This is the most efficient way to run changepoint detection when all data
#' is available upfront.
#'
#' @param Y Numeric vector or matrix. Data array. For univariate detection,
#'   a numeric vector. For multivariate detection, a matrix with observations
#'   in rows and dimensions in columns.
#' @param threshold Numeric scalar or vector. Detection threshold(s). Can be:
#'   \itemize{
#'     \item A scalar: applied to all test statistics (default behavior for most cost functions)
#'     \item A vector: length must match the number of test statistics (for example, np-focus
#'       returns both the sum and max statistics, so threshold should be length 2).
#'     \item \code{Inf}: no thresholding (returns all statistics)
#'   }
#' @param type Character string specifying detector type. See
#'   \code{\link{detector_create}()} for options. Defaults to \code{"univariate"}.
#' @param family Character string specifying distribution family. See
#'   \code{\link{get_statistics}()} for options. Defaults to \code{"gaussian"}.
#' @param theta0 Numeric vector. Null hypothesis parameter. Default is \code{NULL}.
#' @param dim_indexes List of integer vectors. Projection index sets for
#'   multivariate detectors. Default is \code{NULL}.
#' @param quantiles Numeric vector. Quantiles for nonparametric detectors.
#'   Default is \code{NULL}.
#' @param pruning_mult Integer. Pruning multiplier parameter. Default is 2.
#' @param pruning_offset Integer. Pruning offset parameter. Default is 1.
#' @param side Character string. For one-sided detectors: \code{"right"} or
#'   \code{"left"}. Default is \code{"right"}.
#' @param shape Numeric scalar. Shape parameter for gamma distribution.
#'   Default is \code{NULL}.
#' @param anomaly_intensity Numeric scalar. Anomaly intensity threshold for
#'   pruning candidates. Only candidates with sufficient signal magnitude are
#'   retained. Default is \code{NULL} (disabled).
#' @param rho Numeric vector. AR coefficients for AutoRegressive Process (ARP)
#'   detectors. Required when \code{type = "arp"}. Default is \code{NULL}.
#' @param mu0_arp Numeric scalar. Pre-change mean for ARP detectors (optional). 
#'   Only used when \code{type = "arp"}.
#'   Default is \code{NULL}.
#'
#' @return A list with components:
#'   \item{stat}{Numeric matrix. Test statistics over time (n_obs × n_stats).
#'     Each row corresponds to one time point, each column to one statistic.}
#'   \item{changepoint}{Integer vector. Detected changepoints at each time
#'     point (1-based indices), or NA if no changepoint detected at that time.}
#'   \item{detection_time}{Integer or NULL. Time of first detection (1-based),
#'     or NULL if no detection occurred.}
#'   \item{detected_changepoint}{Integer or NULL. Changepoint location at
#'     detection time (1-based), or NULL if no detection occurred.}
#'   \item{candidates}{Data frame. Final candidate segments (see
#'     \code{detector_candidates()}).}
#'   \item{threshold}{Numeric vector. Threshold(s) used for detection.}
#'   \item{n}{Integer. Number of observations processed.}
#'   \item{type}{Character. Detector type used.}
#'   \item{family}{Character. Distribution family used.}
#'   \item{shape}{Numeric or NULL. Shape parameter (for gamma family).}
#'
#' @details
#' This function runs the complete detection algorithm in C++ for maximum
#' efficiency. It processes observations sequentially and stops at the first
#' detection (when any statistic exceeds its threshold).
#'
#' For multivariate data, the algorithm computes multiple statistics (one per
#' projection). Detection occurs when ANY statistic exceeds its threshold.
#'
#' @examples
#' # Univariate Gaussian detection
#' set.seed(123)
#' Y <- c(rnorm(100, mean = 0), rnorm(100, mean = 2))
#' result <- focus_offline(Y, threshold = 10, type = "univariate",
#'                        family = "gaussian")
#' cat("Detection at time:", result$detection_time, "\n")
#' cat("Changepoint at:", result$detected_changepoint, "\n")
#'
#' # Plot statistics
#' plot(result$stat, type = "l", ylab = "Test Statistic", xlab = "Time")
#' abline(h = result$threshold, col = "red", lty = 2)
#' if (!is.null(result$detection_time)) {
#'   abline(v = result$detection_time, col = "blue", lty = 2)
#' }
#'
#' # Multivariate detection
#' Y_multi <- matrix(rnorm(200 * 3), ncol = 3)
#' Y_multi[101:200, ] <- Y_multi[101:200, ] + 1.5  # Add mean shift
#' result_multi <- focus_offline(Y_multi, threshold = 15,
#'                               type = "multivariate",
#'                               family = "gaussian")
#'
#' # Poisson detection
#' Y_poisson <- c(rpois(100, lambda = 2), rpois(100, lambda = 5))
#' result_poisson <- focus_offline(Y_poisson, threshold = 10,
#'                                 type = "univariate",
#'                                 family = "poisson")
#'
#' @export
focus_offline <- function(Y, threshold, type = "univariate", family = "gaussian", theta0 = NULL, dim_indexes = NULL, quantiles = NULL, pruning_mult = 2L, pruning_offset = 1L, side = "right", shape = NULL, anomaly_intensity = NULL, rho = NULL, mu0_arp = NULL) {
    .Call(`_focus_focus_offline`, Y, threshold, type, family, theta0, dim_indexes, quantiles, pruning_mult, pruning_offset, side, shape, anomaly_intensity, rho, mu0_arp)
}

Try the focus package in your browser

Any scripts or data that you put into this service are public.

focus documentation built on March 30, 2026, 5:08 p.m.