R/aberration.R

#' Detect Copy Number Aberrations (Gains and Losses)
#'
#' Reads a CNV (Copy Number Variation) data file and identifies genomic
#' segments showing significant aberrations (gains or losses) based on a
#' user-defined effect size threshold. Results are split by chromosome and
#' returned as a named list.
#'
#' @param cnv_data_file Character. Path to the CNV data file
#'   (whitespace-delimited, with a header). Must contain columns:
#'   \code{Chromosome}, \code{Start}, \code{End}, \code{Num_Probes},
#'   \code{Segment_Mean}, and \code{Sample}.
#' @param effect_size Numeric. Threshold for calling aberrations. Segments
#'   with \code{Segment_Mean > effect_size} are called Gains; segments with
#'   \code{Segment_Mean < -effect_size} are called Losses. Default is
#'   \code{0.3}.
#'
#' @return A named list where each element corresponds to a chromosome
#'   (e.g., \code{"1"}, \code{"2"}, ...) and contains a data frame of
#'   aberrant segments for that chromosome. Each data frame includes the
#'   columns: \code{Chromosome}, \code{Start}, \code{End},
#'   \code{Num_Probes}, \code{Segment_Mean}, \code{Sample},
#'   \code{Aberration} (Gain or Loss), and \code{Aberration_Code}
#'   (1 = Gain, 0 = Loss).
#'
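#' @details
#' Segments with \code{Segment_Mean} between \code{-effect_size} and
#' \code{effect_size} (inclusive) are considered neutral and excluded from
#' the output. Rows with a missing value in any column are also dropped.
#' \code{Chromosome} is coerced to integer, so segments with non-numeric
#' labels (e.g., \code{"X"}, \code{"Y"}, or \code{"chr1"}) are removed.
#' The default threshold of 0.3 is widely used in TCGA-based
#' CNV analyses. This function is cancer-type agnostic and can be applied
#' to CNV data from any solid tumour.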
#'
#' @references
#' Mermel CH, et al. (2011). GISTIC2.0 facilitates sensitive and confident
#' localization of the targets of focal somatic copy-number alteration in
#' human cancers. \emph{Genome Biol}, 12(4):R41.
#'
#' @examples
#' cnv_file <- system.file("extdata", "cnv_data.txt", package = "RiskyCNV")
#' aberrations <- aberration(
#'   cnv_data_file = cnv_file,
#'   effect_size   = 0.3
#' )
#' print(aberrations[["2"]])
#'
#' @export
aberration <- function(cnv_data_file, effect_size = 0.3) {

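  # Read the whitespace-delimited segment table; fill = TRUE tolerates rows
  # with missing trailing fields.
  cnvMatrix <- utils::read.table(cnv_data_file, sep = "", header = TRUE,
                                  fill = TRUE)

  # Label each segment; values within [-effect_size, effect_size] stay NA.
  cnvMatrix$Label <- ifelse(cnvMatrix$Segment_Mean < -effect_size, "Loss",
                     ifelse(cnvMatrix$Segment_Mean > effect_size, "Gain", NA))

  # Drop neutral segments and any row with a missing value.
  cnvMatrix <- cnvMatrix[stats::complete.cases(cnvMatrix), ]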

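  subMatrix <- cnvMatrix[, c("Chromosome", "Start", "End",
                              "Num_Probes", "Segment_Mean", "Sample")]
  # Only "Gain" and "Loss" labels remain after the filter above.
  subMatrix$Aberration      <- cnvMatrix$Label
  subMatrix$Aberration_Code <- ifelse(subMatrix$Aberration == "Gain", 1, 0)
  # Convert via character so factor columns yield chromosome numbers, not
  # level codes; non-numeric labels (e.g. "X") become NA and are dropped.
  subMatrix$Chromosome      <- as.integer(as.character(subMatrix$Chromosome))
  subMatrix                 <- subMatrix[stats::complete.cases(subMatrix), ]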

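  # Build one data frame per chromosome, keyed by the chromosome number as a
  # string, in order of first appearance.
  chromosome_lists <- list()
  for (chromosome in unique(subMatrix$Chromosome)) {
    chromosome_data <- subMatrix[subMatrix$Chromosome == chromosome, ]
    chromosome_lists[[as.character(chromosome)]] <- chromosome_data
  }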

  return(chromosome_lists)
}
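
The calling rule can be tried without the package or an input file. The sketch below is illustrative only: the toy values, and the names `segments`, `called` and `by_chromosome`, are made up for this example and are not part of RiskyCNV. It also uses base R's `split()` in place of the explicit loop, which returns the chromosomes in sorted order instead of order of first appearance.

```r
# Toy segment table with the columns the function expects (values are made up).
segments <- data.frame(
  Chromosome   = c("1", "1", "2", "X"),
  Start        = c(100L, 5000L, 200L, 300L),
  End          = c(4000L, 9000L, 8000L, 7000L),
  Num_Probes   = c(12L, 30L, 25L, 18L),
  Segment_Mean = c(0.45, 0.30, -0.62, 0.80),
  Sample       = c("S1", "S1", "S2", "S2"),
  stringsAsFactors = FALSE
)
effect_size <- 0.3

# Strict inequalities: a segment sitting exactly on the threshold is neutral.
label <- ifelse(segments$Segment_Mean < -effect_size, "Loss",
         ifelse(segments$Segment_Mean > effect_size, "Gain", NA))

# Keep only called segments and attach the label and its numeric code.
called <- segments[!is.na(label), ]
called$Aberration      <- label[!is.na(label)]
called$Aberration_Code <- ifelse(called$Aberration == "Gain", 1, 0)

# Integer coercion turns non-numeric labels such as "X" into NA, so those
# rows are dropped, mirroring the behaviour of the function above.
called$Chromosome <- suppressWarnings(as.integer(called$Chromosome))
called <- called[!is.na(called$Chromosome), ]

# One data frame per chromosome, named "1", "2", ...
by_chromosome <- split(called, called$Chromosome)
print(by_chromosome)
```

In this toy input the second chromosome 1 segment (mean exactly 0.30) is treated as neutral, and the chromosome X gain is lost at the integer coercion step, so recoding sex chromosomes to numbers beforehand would be needed to retain them.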

RiskyCNV documentation built on June 5, 2026, 5:07 p.m.