R/create_CNVMatrix.R

Defines functions create_CNVMatrix

Documented in create_CNVMatrix

#' Create a CNV Expression Matrix
#'
#' Takes an annotated CNV CSV file (output of \code{\link{annotate}}) and
#' reshapes it into a wide-format table with one row per sample and one
#' column per gene symbol. Each cell holds the mean \code{Segment_Mean}
#' of all segments annotated to that sample-gene pair.
#'
#' @param input_file Character. Path to the input CSV file containing
#'   columns \code{Sample}, \code{GeneSymbol}, and \code{Segment_Mean}.
#'
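#' @return A data frame (tibble) in wide format with a \code{Sample}
#'   column followed by one column per gene symbol. Sample-gene pairs
#'   absent from the input are \code{NA}. The table is also written to a
#'   timestamped CSV file in the session's temporary directory
#'   (\code{tempdir()}), and the path is reported via \code{message()}.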
#'
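#' @details
#' Duplicate \code{Sample}-\code{GeneSymbol} combinations are summarised
#' by taking their mean \code{Segment_Mean} before pivoting, which avoids
#' list-columns in the output. \code{mean()} is called without
#' \code{na.rm = TRUE}, so a missing \code{Segment_Mean} within a pair
#' makes that cell \code{NA}. This function is cancer-type agnostic.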
#'
#' @examples
#' annot_file <- system.file("extdata", "annotated_cnv.csv",
#'                            package = "RiskyCNV")
#' cnv_mat <- create_CNVMatrix(annot_file)
#' dim(cnv_mat)
#' head(cnv_mat)
#'
#' @importFrom dplyr group_by summarise ungroup
#' @importFrom tidyr pivot_wider
#' @importFrom rlang .data
#' @export
create_CNVMatrix <- function(input_file) {

  data <- utils::read.csv(input_file, header = TRUE)

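  if (nrow(data) == 0 || ncol(data) == 0) {
    stop("The input file is empty or not correctly formatted.")
  }

  # Fail early with a clear message if a required column is absent.
  required_cols <- c("Sample", "GeneSymbol", "Segment_Mean")
  missing_cols  <- setdiff(required_cols, names(data))
  if (length(missing_cols) > 0) {
    stop("The input file is missing required column(s): ",
         paste(missing_cols, collapse = ", "))
  }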

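  # Collapse duplicate Sample-GeneSymbol pairs to their mean so that
  # pivot_wider() yields one numeric value per cell, not a list-column.
  # .groups = "drop" already returns an ungrouped result.
  data_summarized <- data |>
    dplyr::group_by(.data$Sample, .data$GeneSymbol) |>
    dplyr::summarise(Segment_Mean = mean(.data$Segment_Mean),
                     .groups = "drop")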

  matrix_data <- tidyr::pivot_wider(
    data_summarized,
    names_from  = "GeneSymbol",
    values_from = "Segment_Mean",
    id_cols     = "Sample"
  )

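  # Write a copy to the session temp directory. The timestamp has
  # one-second resolution, so calls within the same second share a name.
  timestamp   <- format(Sys.time(), "%Y%m%d_%H%M%S")
  output_file <- file.path(tempdir(),
                            paste0("cnv_output_matrix_", timestamp, ".csv"))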
  utils::write.csv(matrix_data, file = output_file, row.names = FALSE)
  message("CNV matrix saved to: ", output_file)

  return(matrix_data)
}
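
The reshape performed by the function can be tried on a small toy table. The following is a minimal sketch, assuming dplyr and tidyr are installed and R is version 4.1 or later (for the native pipe). The sample names, gene symbols, and values are invented for illustration, and the pipeline is re-created inline rather than calling create_CNVMatrix() itself.

```r
library(dplyr)
library(tidyr)

# Hypothetical annotated CNV rows with the three columns the function reads.
# S1/TP53 appears twice to show how duplicates are handled.
annotated <- data.frame(
  Sample       = c("S1", "S1", "S1", "S2"),
  GeneSymbol   = c("TP53", "TP53", "MYC", "TP53"),
  Segment_Mean = c(0.2, 0.4, -0.5, 1.0)
)

# Average duplicate Sample-GeneSymbol pairs, then spread genes into columns.
cnv_wide <- annotated |>
  group_by(Sample, GeneSymbol) |>
  summarise(Segment_Mean = mean(Segment_Mean), .groups = "drop") |>
  pivot_wider(id_cols     = "Sample",
              names_from  = "GeneSymbol",
              values_from = "Segment_Mean")

# S1/TP53 is the average of 0.2 and 0.4.
# S2 has no MYC segment, so that cell is NA.
print(cnv_wide)
```

Without the summarising step, the duplicated S1/TP53 pair would make pivot_wider() warn and return list-columns, which is the situation the function's details section describes avoiding.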


RiskyCNV documentation built on June 5, 2026, 5:07 p.m.