proc_colon: Preprocess Colon Gene Expression Data

proc_colon    R Documentation

Description

The proc_colon function preprocesses colon gene expression data by:

  1. Log-transforming the raw expression values.

  2. Performing a two-sample t-test for each gene, comparing normal and tumor samples.

  3. Selecting the 50 genes with the largest absolute t-statistics.

  4. Returning the filtered expression matrix together with the normal and tumor sample indices and the group labels.
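
The four steps above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package source: proc_colon_sketch is a hypothetical name, the natural logarithm and Welch's t-test (the default of t.test) are assumed because the page does not specify either, and the input is simulated rather than the real colon data.

```r
# Hypothetical re-implementation of the documented steps; NOT the bspcov source.
proc_colon_sketch <- function(colon, tissues, n_top = 50) {
  stopifnot(is.matrix(colon), length(tissues) == ncol(colon))

  # Step 1: log-transform (natural log assumed; the base is not documented).
  log_expr <- log(colon)

  # Decode the documented sign convention: positive = normal, negative = tumor.
  normal_idx <- which(tissues > 0)
  tumor_idx  <- which(tissues < 0)

  # Step 2: one two-sample t-test per gene (row). Welch's test is assumed.
  t_stat <- apply(log_expr, 1, function(g) {
    unname(t.test(g[normal_idx], g[tumor_idx])$statistic)
  })

  # Step 3: keep the n_top genes with the largest absolute t-statistic.
  top <- order(abs(t_stat), decreasing = TRUE)[seq_len(n_top)]

  # Step 4: return samples x genes, plus sample indices and group labels.
  list(
    X = t(log_expr[top, , drop = FALSE]),
    normal_idx = normal_idx,
    tumor_idx = tumor_idx,
    group = ifelse(tissues > 0, 1L, 2L)
  )
}

# Simulated stand-in for the colon data: 200 genes, 20 samples.
set.seed(1)
sim_colon <- matrix(rexp(200 * 20, rate = 0.01) + 1, nrow = 200)
sim_tissues <- c(1:8, -(1:12))  # 8 normal samples, then 12 tumor samples
out <- proc_colon_sketch(sim_colon, sim_tissues)
dim(out$X)  # 20 samples by 50 selected genes
```

Transposing in the last step is what turns the genes × samples input into the samples × genes layout described under Value.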

Usage

proc_colon(colon, tissues)

Arguments

colon

A numeric matrix of raw colon gene expression values (genes × samples). Rows are genes; columns are samples.

tissues

A numeric vector with one entry per sample (column of colon) indicating tissue type: positive for normal, negative for tumor.
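
As a small illustration of the sign convention, the snippet below decodes a made-up label vector into index sets and 1/2 group codes. The toy_* names and values are invented for this example and are not part of the package.

```r
# Toy signed labels (made up): positive = normal, negative = tumor.
toy_tissues <- c(-1, 1, -2, 2, -3)

# Column positions of each tissue type.
toy_normal_idx <- which(toy_tissues > 0)
toy_tumor_idx  <- which(toy_tissues < 0)

# Group codes matching the documented return value: 1 = normal, 2 = tumor.
toy_group <- ifelse(toy_tissues > 0, 1L, 2L)
```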

Value

A list with components:

X

A numeric matrix (samples × 50 genes) of selected, log‐transformed expression values.

normal_idx

Integer indices of normal‐tissue columns in the original data.

tumor_idx

Integer indices of tumor‐tissue columns in the original data.

group

Integer vector of length ncol(colon), with 1 = normal, 2 = tumor.

Examples

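library(bspcov)

# Load the colon expression matrix and the signed tissue labels.
data("colon")
data("tissues")
set.seed(1234)

# Preprocess: log-transform and keep the 50 selected genes.
colon_data <- proc_colon(colon, tissues)
X <- colon_data$X

# Fit bmspcov on the selected genes, passing the sample covariance as Sigma,
# then extract the covariance estimate from the fitted object.
foo <- bmspcov(X, Sigma = cov(X))
sigmah <- estimate(foo)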


bspcov documentation built on July 3, 2025, 1:10 a.m.