R/telescoped.data.R

Defines functions telescoped.data

# Author: Xuye Luo
# Date: December 11, 2025

#' @title Discretization and Telescoping of Two Continuous or Count Data
#'
#' @description Performs multi-scale discretization (telescoping) on two continuous or count data vectors.
#' This function transforms raw data into ranked data and then iteratively discretizes the ranks 
#' at exponentially increasing scales (powers of the base). This allows for analyzing associations 
#' at different resolution levels.
#'
#' @param x A numeric vector without NA values, must be the same length as y.
#' @param y A numeric vector without NA values, must be the same length as x.
#' @param focus An integer specifying a specific resolution level to return. 
#'              If \code{NULL} (default), returns a list containing data at all resolution levels.
#'              Level -1: Raw data.
#'              Level 0: Ranked data (base resolution).
#'              Level k > 0: Discretized data at scale base^k.
#' @param base Integer. The base for exponential scaling. Defaults to 2.
#'
#' @return If \code{focus} is NULL, returns a list where each element is a matrix of the data at a different resolution.
#'         If \code{focus} is an integer, returns a single matrix for that specific level.
#'
#' @examples
#' library("Upsilon")
#' set.seed(123)
#' n <- 100
#' x <- rnorm(n)
#' y <- rnorm(n)
#' 
#' # Get all telescoped levels
#' all_levels <- telescoped.data(x, y)
#' 
#' # Get specific level (e.g., focus = 2)
#' level_2 <- telescoped.data(x, y, focus = 2)
#' print(level_2)
#' @keywords internal
#' export
#' @noRd
telescoped.data <- function(x, y, focus = NULL, base = 2) {
  
  # Input validation
  if (length(x) != length(y)) {
    stop("Vectors 'x' and 'y' must have the same length.")
  }
  
  n <- length(x)
  

  rx <- rank(x, ties.method = "average") - 1
  ry <- rank(y, ties.method = "average") - 1
  
  if (!is.null(focus)) {
    if (focus == -1) {
      # Level -1: Raw Data
      return(cbind(x, y))
    } else if (focus == 0) {
      # Level 0: Pure Ranks
      return(cbind(rx, ry))
    } else {
      # Level > 0: Discretized Ranks
      # divisor gets smaller as focus increases, creating coarser bins? 
      # Wait, original logic: n / base^focus. 
      # If focus=1 (base=2), divisor = n/2. Ranks 0..n-1 divided by n/2 -> roughly 0..1 (2 bins).
      # If focus=large, divisor is small -> many bins.
      # Let's stick to original logic:
      divisor <- n / (base^focus)
      
      # Use integer division %/%
      rx2 <- rx %/% divisor
      ry2 <- ry %/% divisor
      return(cbind(rx2, ry2))
    }
  }
  
  # Return All Levels (Telescoping)
  # List structure:
  # [[1]]: Raw Data
  # [[2]]: Ranked Data
  # [[3..]]: Discretized Levels
  telescope_list <- vector("list", length = 0)
  
  # 1. Raw Data
  telescope_list[[1]] <- cbind(x, y)
  
  # 2. Ranked Data
  telescope_list[[2]] <- cbind(rx, ry)
  
  # Discretized Levels
  # Limit loop to avoid divisor becoming < 1
  # max_level is roughly log_base(n)
  if (n >= 2) {
    max_level <- floor(log(n - 1, base = base))
    
    for (i in seq_len(max_level)) {
      divisor <- n / (base^i)
      
      rx2 <- rx %/% divisor
      ry2 <- ry %/% divisor
    
      telescope_list[[i + 2]] <- cbind(rx2, ry2)
    }
  }
  
  return(telescope_list)
}

Try the Upsilon package in your browser

Any scripts or data that you put into this service are public.

Upsilon documentation built on March 7, 2026, 5:07 p.m.