alex.prep.lsn.expr: Prepare Lesion and Expression Data for Kruskal Wallis Test
In GRIN2: Genomic Random Interval (GRIN)

alex.prep.lsn.expr

R Documentation

Prepare Lesion and Expression Data for Kruskal Wallis Test

Description

Prepares matched lesion and expression data matrices for use with the `KW.hit.express` function, which performs Kruskal-Wallis tests to assess associations between lesion groups and gene expression levels.

Usage

alex.prep.lsn.expr(
  expr.mtx,
  lsn.data,
  gene.annotation,
  min.expr = NULL,
  min.pts.lsn = NULL
)

Arguments

`expr.mtx`	A data frame or matrix of normalized, log2-transformed gene expression values with genes in rows and subjects in columns. The first column must be named `"ensembl.ID"` and contain Ensembl gene IDs.
`lsn.data`	A data frame of lesion data in GRIN-compatible format. Must contain five columns: `"ID"` (patient ID), `"chrom"` (chromosome), `"loc.start"` (lesion start position), `"loc.end"` (lesion end position), and `"lsn.type"` (lesion type, e.g., gain, loss, mutation, or fusion).
`gene.annotation`	A gene annotation data frame, either user-provided or retrieved via `get.ensembl.annotation()` from the GRIN2.0 package. Must contain four columns: `"gene"` (Ensembl gene ID), `"chrom"` (chromosome), `"loc.start"` (gene start), and `"loc.end"` (gene end).
`min.expr`	Minimum total expression (the sum of a gene's expression values across all subjects) required for the gene to be retained. Useful for filtering out genes with very low expression.
`min.pts.lsn`	Minimum number of subjects required to have a lesion in a gene for that gene to be retained for the KW test.
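
As an illustration of the input layouts described above, the following sketch builds toy versions of the three data arguments in base R. All gene IDs, patient IDs, coordinates, and expression values are made up for illustration, and the final check is a hypothetical sanity test, not something the package requires you to run.

```r
# Toy inputs with the documented column layouts (all values are made up).
expr.mtx <- data.frame(
  ensembl.ID = c("ENSG00000000001", "ENSG00000000002"),  # placeholder IDs
  patient1 = c(5.2, 0.1),
  patient2 = c(4.8, 0.0),
  patient3 = c(6.1, 0.3)
)

lsn.data <- data.frame(
  ID = c("patient1", "patient2", "patient3"),
  chrom = c("1", "1", "2"),
  loc.start = c(1000, 1500, 200),
  loc.end = c(2000, 1500, 900),
  lsn.type = c("gain", "mutation", "loss")
)

gene.annotation <- data.frame(
  gene = c("ENSG00000000001", "ENSG00000000002"),
  chrom = c("1", "2"),
  loc.start = c(900, 100),
  loc.end = c(2100, 1000)
)

# Check that each input carries the columns the argument descriptions require.
stopifnot(
  names(expr.mtx)[1] == "ensembl.ID",
  all(c("ID", "chrom", "loc.start", "loc.end", "lsn.type") %in% names(lsn.data)),
  all(c("gene", "chrom", "loc.start", "loc.end") %in% names(gene.annotation))
)
```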

Details

The function uses `prep.lsn.type.matrix()` internally to create a lesion matrix in which each gene is represented by one row and all lesion types are included. It then retains only genes that meet both the `min.expr` expression threshold and the `min.pts.lsn` lesion threshold. The final expression and lesion matrices are matched by gene and patient IDs, with rows ordered by Ensembl gene ID and columns ordered by patient ID.
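
The filtering and matching steps can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the toy matrices, the gene and patient names, and the use of `"none"` to mark a patient without a lesion are all assumptions made for the example.

```r
# Hypothetical toy matrices: genes in rows, patients in columns.
expr <- matrix(
  c(5, 6, 7,   8,
    0, 0, 0.1, 0,
    3, 4, 2,   5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneB", "geneC", "geneA"), c("p2", "p1", "p3", "p4"))
)
# Assumption: "none" marks a patient with no lesion in that gene.
lsn <- matrix(
  c("none", "gain", "gain", "none",
    "loss", "none", "loss", "none",
    "gain", "none", "none", "none"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("p1", "p2", "p3", "p5"))
)

min.expr <- 1
min.pts.lsn <- 2

# Restrict both matrices to the patients they share, sorted by patient ID.
patients <- sort(intersect(colnames(expr), colnames(lsn)))
expr <- expr[, patients, drop = FALSE]
lsn <- lsn[, patients, drop = FALSE]

# Keep genes whose summed expression reaches min.expr.
keep.expr <- rownames(expr)[rowSums(expr) >= min.expr]
# Keep genes with a lesion in at least min.pts.lsn patients.
keep.lsn <- rownames(lsn)[rowSums(lsn != "none") >= min.pts.lsn]

# Match on the genes that pass both filters, sorted by gene ID.
genes <- sort(intersect(keep.expr, keep.lsn))
expr.matched <- expr[genes, , drop = FALSE]
lsn.matched <- lsn[genes, , drop = FALSE]
```

After these steps, `expr.matched` and `lsn.matched` have identical row and column names, which is the property the Kruskal-Wallis step relies on.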

Value

A list with the following components:

`alex.expr`	A matrix of gene expression data with Ensembl gene IDs as row names and patient IDs as column names.
`alex.lsn`	A matrix of lesion data for the same genes and patients as in `alex.expr`, similarly ordered.
`alex.row.mtch`	A data frame with two columns showing the matched Ensembl gene IDs from the expression and lesion matrices.
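
To show why the two matrices are returned in matched order, the sketch below runs a Kruskal-Wallis test for a single gene with `kruskal.test()` from the base `stats` package. The list here is a hand-built stand-in with the same component names as the return value; its values, the gene and patient names, and the `"none"`/`"gain"` labels are assumptions for illustration, and this is not how `KW.hit.express` is necessarily implemented.

```r
# Hand-built stand-in for the returned list (values are made up).
patients <- paste0("p", 1:6)
alex.data <- list(
  alex.expr = matrix(c(5.1, 4.9, 7.2, 7.5, 5.0, 7.1), nrow = 1,
                     dimnames = list("geneA", patients)),
  alex.lsn = matrix(c("none", "none", "gain", "gain", "none", "gain"), nrow = 1,
                    dimnames = list("geneA", patients))
)

# Because rows and columns are matched, the same index addresses both matrices.
stopifnot(identical(dimnames(alex.data$alex.expr), dimnames(alex.data$alex.lsn)))

expr.values <- alex.data$alex.expr["geneA", ]
lsn.groups <- factor(alex.data$alex.lsn["geneA", ])

# Kruskal-Wallis test of expression across lesion groups for this gene.
kw <- kruskal.test(expr.values, lsn.groups)
kw$p.value
```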

Author(s)

Abdelrahman Elsayed <abdelrahman.elsayed@stjude.org>, Stanley Pounds <stanley.pounds@stjude.org>

References

Cao, X., Elsayed, A. H., & Pounds, S. B. (2023). Statistical Methods Inspired by Challenges in Pediatric Cancer Multi-omics.

Examples

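library(GRIN2)

# Load the example expression, lesion, and gene annotation data
data(expr_data)
data(lesion_data)
data(hg38_gene_annotation)

# Prepare matched lesion and expression data
alex.data <- alex.prep.lsn.expr(expr_data, lesion_data,
                                hg38_gene_annotation, min.expr = 1,
                                min.pts.lsn = 5)

# List the returned components
names(alex.data)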

GRIN2 documentation built on June 17, 2025, 9:11 a.m.