Combat_Normal: Process and Correct Batch Effects in TCGA's normal tissue and...

View source: R/CombatNormal.R

Combat_Normal    R Documentation

Process and Correct Batch Effects in TCGA's normal tissue and GTEX Data

Description

The function first extracts the histological types present in the supplied TCGA normal tissue dataset and displays them. The user is then prompted to enter the types to retain, and the data are filtered accordingly. Finally, the GTEX and TCGA normal tissue datasets are combined and batch corrected.

Note: This function assumes that TCGA's normal samples and GTEX samples represent different batches.
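The extract-and-filter step can be illustrated with a minimal sketch. It assumes that samples are columns named with TCGA-style barcodes and that the type code sits at characters 14 and 15 of the barcode; the matrix, its values, and the variable names are hypothetical and are not taken from the package source.

```r
# Toy count matrix: genes in rows, TCGA-style barcodes as column names (made-up values).
counts <- matrix(
  c(10, 0, 5, 3,
     7, 2, 8, 1,
     4, 9, 6, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("GENE_A", "GENE_B", "GENE_C"),
    c("TCGA-AA-0001-11A", "TCGA-AA-0002-11A",
      "TCGA-AA-0003-12A", "TCGA-AA-0004-01A")
  )
)

# Assumption: the type code occupies characters 14-15 of each barcode.
sample_types <- substring(colnames(counts), 14, 15)

# Show which types are present, as the function does before prompting.
print(table(sample_types))

# Keep only the columns whose type code was selected.
keep <- c("11")
filtered <- counts[, sample_types %in% keep, drop = FALSE]
```

Using `drop = FALSE` keeps the result a matrix even when a single sample survives the filter.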

Usage

Combat_Normal(
  TCGA_normal_data_path,
  gtex_data_path,
  CombatNormal_output_path,
  auto_mode = FALSE,
  default_input = "11,12"
)

Arguments

TCGA_normal_data_path

The path to the TCGA normal tissue data stored in an RDS file.

gtex_data_path

The path to the GTEX data stored in an RDS file.

CombatNormal_output_path

A character string specifying the path where the output RDS file will be saved.

auto_mode

Logical. If set to TRUE, the function will not prompt the user for input and will instead use the values provided in default_input. Default is FALSE.

default_input

Character string. When auto_mode is TRUE, this parameter specifies the default TCGA normal tissue types to be retained. It should be provided as a comma-separated string (e.g., "11,12").
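How auto_mode and default_input might interact can be sketched as follows. The helper name choose_types and the prompt text are hypothetical, and the splitting logic is an assumption about how a comma-separated string such as "11,12" would be turned into individual type codes.

```r
# Hypothetical helper: returns the type codes to retain.
choose_types <- function(auto_mode = FALSE, default_input = "11,12") {
  raw <- if (auto_mode) {
    default_input                      # no prompt in automatic mode
  } else {
    readline("Types to retain (comma-separated): ")
  }
  # Split on commas and strip surrounding whitespace.
  trimws(strsplit(raw, ",", fixed = TRUE)[[1]])
}

selected_types <- choose_types(auto_mode = TRUE, default_input = "11, 12")
```

Trimming whitespace makes "11,12" and "11, 12" equivalent.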

Details

This function takes a TCGA normal tissue dataset and a pre-saved GTEX dataset, asks the user which TCGA normal tissue types to retain (or reads them from default_input when auto_mode is TRUE), then merges the two datasets. The merged dataset is then corrected for batch effects using the ComBat_seq function from the 'sva' package.

ComBat_seq requires the 'sva' package, which must be installed and loaded externally before this function is called.
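The merge and correction steps can be sketched on simulated counts. This is an illustration under stated assumptions, not the package's implementation: the matrices are random, the datasets are joined on their shared row names, and each source is treated as one batch. The call to sva::ComBat_seq runs only if 'sva' is installed.

```r
set.seed(1)
genes <- paste0("gene", 1:50)

# Simulated count matrices (genes x samples); values are made up.
tcga_normal <- matrix(rnbinom(200, size = 10, prob = 0.1), nrow = 50,
                      dimnames = list(genes, paste0("tcga_", 1:4)))
gtex <- matrix(rnbinom(200, size = 10, prob = 0.05), nrow = 50,
               dimnames = list(genes, paste0("gtex_", 1:4)))

# Assumption: the two datasets are combined on the genes they share.
common_genes <- intersect(rownames(tcga_normal), rownames(gtex))
combined <- cbind(tcga_normal[common_genes, ], gtex[common_genes, ])

# One batch label per sample: TCGA normal versus GTEX.
batch <- c(rep("TCGA", ncol(tcga_normal)), rep("GTEX", ncol(gtex)))

# Batch-correct the raw counts when 'sva' is available.
if (requireNamespace("sva", quietly = TRUE)) {
  corrected <- sva::ComBat_seq(combined, batch = batch, group = NULL)
  print(dim(corrected))
}
```

ComBat_seq works on untransformed integer counts, which is why the sketch passes the count matrix directly.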

The example code uses 'tempfile()' to generate temporary paths dynamically during execution. These paths are valid during the 'R CMD check' process, even if no actual files exist, because 'tempfile()' generates a unique file path that does not depend on the user's file system. Using 'tempfile()' ensures that the example code does not rely on specific external files and avoids errors during 'R CMD check'. CRAN review checks for documentation correctness and syntax parsing, not the existence of actual files, as long as the example code is syntactically valid.

Value

A data.frame with corrected values after the ComBat_seq adjustment. Note that this function also saves the combat_count_df data as an RDS file at the specified output path.
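Because the result is both returned and written to disk, a later session can reload it with readRDS. A minimal sketch of that round trip, using a made-up data.frame and a hypothetical file name in the session's temporary directory:

```r
# Stand-in for the corrected counts (made-up values).
corrected_df <- data.frame(
  sample_1 = c(5, 0),
  sample_2 = c(3, 8),
  row.names = c("GENE_A", "GENE_B")
)

# Write to a temporary location, then read the file back.
out_path <- file.path(tempdir(), "combat_normal_sketch.rds")
saveRDS(corrected_df, out_path)
reloaded <- readRDS(out_path)

# Check that the reloaded object matches what was saved.
same <- isTRUE(all.equal(reloaded, corrected_df))
```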

See Also

ComBat_seq

Examples

TCGA_normal_file <- system.file("extdata",
                                "SKCM_Skin_TCGA_exp_normal_test.rds",
                                package = "TransProR")
gtex_file <- system.file("extdata", "Skin_SKCM_Gtex_test.rds", package = "TransProR")
output_file <- file.path(tempdir(), "SKCM_Skin_Combat_Normal_TCGA_GTEX_count.rds")

SKCM_Skin_Combat_Normal_TCGA_GTEX_count <- Combat_Normal(
  TCGA_normal_data_path = TCGA_normal_file,
  gtex_data_path = gtex_file,
  CombatNormal_output_path = output_file,
  auto_mode = TRUE,
  default_input = "skip"
)
head(SKCM_Skin_Combat_Normal_TCGA_GTEX_count)[1:5, 1:5]


TransProR documentation built on April 4, 2025, 3:16 a.m.