get_tcga_exp: TCGA Expression Data Processing


get_tcga_exp    R Documentation

TCGA Expression Data Processing

Description

This function processes expression data and phenotype information, separates tumor and normal samples, and saves the results into different files. It's specifically designed for data obtained from TCGA.

Usage

get_tcga_exp(
  counts_file_path,
  gene_probes_file_path,
  phenotype_file_path,
  output_file_path
)

Arguments

counts_file_path

File path to the counts data (usually in the form of a large matrix with gene expression data).

gene_probes_file_path

File path to the gene probes data.

phenotype_file_path

File path to the phenotype data, which includes various sample attributes.

output_file_path

Path where the output files, distinguished between tumor and normal, will be saved.

Value

A list containing matrices for tumor and normal expression data.

Note

IMPORTANT: This function assumes that the input files follow a specific format and structure, typically found in TCGA data releases. Users should verify their data's compatibility. Additionally, the function does not perform error checking on the data's content, which users should handle through proper preprocessing.
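
Because the function does not check its inputs, a small pre-flight check before calling it can catch the most basic problems. The following is a minimal sketch in base R, assuming that confirming each file exists and is non-empty is a useful first step. The helper name check_input_files is hypothetical and not part of TransProR, and the demo file is a placeholder rather than real TCGA data.

```r
# Hypothetical helper, not part of TransProR: stop early if any input file
# is missing or empty. It does not validate the TCGA-specific layout.
check_input_files <- function(paths) {
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Missing input file(s): ", paste(missing, collapse = ", "))
  }
  empty <- paths[file.size(paths) == 0]
  if (length(empty) > 0) {
    stop("Empty input file(s): ", paste(empty, collapse = ", "))
  }
  invisible(TRUE)
}

# Demonstration with a temporary placeholder file standing in for a real input.
demo_file <- tempfile(fileext = ".tsv")
writeLines("id\tvalue", demo_file)
check_input_files(demo_file)
```

In practice the same call could be given the three input paths as one character vector before they are passed to get_tcga_exp.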

CRITICAL: The 'output_file_path' argument must end with '.rds' for the function to recognize it. The path should also include identifiers for the target samples, because the function further subdivides the specified path into tumor and normal outputs. Structure 'output_file_path' following this pattern: './your_directory/your_sample_type.exp.rds'.
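
One way to satisfy the path requirement is to build the path programmatically and confirm the extension before calling the function. This is a sketch under stated assumptions: the helper name make_output_path is hypothetical and not part of TransProR, and the sample identifier is only an illustration of the recommended pattern.

```r
# Hypothetical helper, not part of TransProR: build an output path that
# carries a sample identifier and ends in ".exp.rds".
make_output_path <- function(dir, sample_id) {
  path <- file.path(dir, paste0(sample_id, ".exp.rds"))
  # Guard against accidental changes to the suffix above.
  stopifnot(grepl("\\.rds$", path))
  path
}

out_path <- make_output_path(tempdir(), "SKCM_Skin_TCGA")
basename(out_path)
```

The resulting value can then be supplied as 'output_file_path'.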

Author(s)

Dongyue Yu

Examples

counts_file <- system.file("extdata", "TCGA-SKCM.htseq_counts_test.tsv", package = "TransProR")
gene_probes_file <- system.file("extdata",
                                "TCGA_gencode.v22.annotation.gene.probeMap_test",
                                package = "TransProR")
phenotype_file <- system.file("extdata", "TCGA-SKCM.GDC_phenotype_test.tsv", package = "TransProR")
output_file <- file.path(tempdir(), "SKCM_Skin_TCGA_exp_test.rds")

SKCM_exp <- get_tcga_exp(
  counts_file_path = counts_file,
  gene_probes_file_path = gene_probes_file,
  phenotype_file_path = phenotype_file,
  output_file_path = output_file
)
head(SKCM_exp[["tumor_tcga_data"]])[1:5, 1:5]
head(SKCM_exp[["normal_tcga_data"]], n = 10) # Because there is only one column.

TransProR documentation built on April 4, 2025, 3:16 a.m.