grin.stats: Execute GRIN Statistical Framework
grin.stats

R Documentation

Description

Executes the Genomic Random Interval (GRIN) statistical framework to determine whether a specific genomic locus (gene or regulatory region) is significantly affected by either an individual lesion type or a constellation of multiple lesion types.

Usage

grin.stats(lsn.data, gene.data = NULL, chr.size = NULL, genome.version = NULL)

Arguments

`lsn.data`	A `data.frame` containing lesion data formatted for GRIN analysis. It must include the following five columns:
	`ID`: Sample or patient identifier.
	`chrom`: Chromosome on which the lesion is located.
	`loc.start`: Genomic start coordinate of the lesion.
	`loc.end`: Genomic end coordinate of the lesion.
	`lsn.type`: Lesion type (e.g., gain, loss, mutation, fusion).
	For single nucleotide variants (SNVs), `loc.start` and `loc.end` should be the same. For copy number alterations (CNAs) such as gains and deletions, these fields give the lesion boundaries (start and end positions). Structural rearrangements (e.g., translocations, inversions) should be represented by two separate rows, one for each breakpoint. An example dataset is available in the GRIN2 package (`lesion_data.rda`).
`gene.data`	A `data.frame` containing gene annotation data. It must include the following columns:
	`gene`: Ensembl gene ID.
	`chrom`: Chromosome on which the gene is located.
	`loc.start`: Gene start position.
	`loc.end`: Gene end position.
	This data can be user-provided or retrieved automatically via `get.ensembl.annotation()` if `genome.version` is specified.
`chr.size`	A `data.frame` specifying chromosome sizes. It must include the following columns:
	`chrom`: Chromosome number.
	`size`: Chromosome length in base pairs.
	This data can be user-provided or retrieved directly using `get.chrom.length()` if `genome.version` is specified.
`genome.version`	Optional. If gene annotation and chromosome size data are not provided, specify a supported genome assembly to retrieve them automatically. Currently, the package supports only the "Human_GRCh38" genome assembly.
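
As a rough illustration of the input layout described above, the following sketch builds toy versions of the three tables using only base R. All sample IDs, gene IDs, coordinates, and chromosome sizes are invented for illustration, and storing `chrom` as a character string is an assumption; the packaged example data may use a different type. The helper `has_columns()` is hypothetical and not part of GRIN2.

```r
# Toy lesion table: every value here is made up for illustration.
lsn.data <- data.frame(
  ID        = c("S1", "S1", "S2", "S3", "S3"),
  chrom     = c("1", "9", "9", "9", "22"),
  loc.start = c(1500, 2000, 1000, 5000, 7000),
  loc.end   = c(1500, 8000, 9000, 5000, 7000),
  lsn.type  = c("mutation", "loss", "gain", "fusion", "fusion"),
  stringsAsFactors = FALSE
)
# Row 1 is an SNV, so loc.start equals loc.end.
# Rows 4 and 5 describe one rearrangement in sample S3: one row per breakpoint.

# Toy gene annotation with placeholder (not real) gene identifiers.
gene.data <- data.frame(
  gene      = c("GENE_PLACEHOLDER_A", "GENE_PLACEHOLDER_B"),
  chrom     = c("1", "9"),
  loc.start = c(1000, 4000),
  loc.end   = c(3000, 6000),
  stringsAsFactors = FALSE
)

# Toy chromosome sizes (not real chromosome lengths).
chr.size <- data.frame(
  chrom = c("1", "9", "22"),
  size  = c(20000, 15000, 10000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when all required columns are present.
has_columns <- function(df, cols) all(cols %in% names(df))

stopifnot(
  has_columns(lsn.data, c("ID", "chrom", "loc.start", "loc.end", "lsn.type")),
  has_columns(gene.data, c("gene", "chrom", "loc.start", "loc.end")),
  has_columns(chr.size, c("chrom", "size")),
  all(lsn.data$loc.start <= lsn.data$loc.end)
)
```

A check like this before calling `grin.stats()` can catch misnamed columns early; it does not replace any validation the package itself may perform.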

Details

The GRIN algorithm evaluates each locus to determine whether the observed frequency and distribution of lesions is greater than expected by chance. This is modeled using a convolution of independent, non-identical Bernoulli distributions, accounting for lesion type, locus size, and chromosome context.
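
To make the phrase "convolution of independent, non-identical Bernoulli distributions" concrete, here is a minimal sketch in base R. It is not the package's internal code: the function `bernoulli_conv()` and the per-subject hit probabilities in `p_hit` are hypothetical, and the sketch ignores how GRIN derives those probabilities from lesion type, locus size, and chromosome context.

```r
# Probability mass function of the number of successes among independent
# Bernoulli trials whose success probabilities may all differ.
bernoulli_conv <- function(p) {
  pmf <- 1  # before any trial, P(0 successes) = 1
  for (p_i in p) {
    # Convolve the running PMF with the two-point distribution (1 - p_i, p_i):
    # either the new trial fails (count unchanged) or succeeds (count + 1).
    pmf <- c(pmf * (1 - p_i), 0) + c(0, pmf * p_i)
  }
  pmf  # pmf[k + 1] is P(exactly k successes)
}

# Hypothetical chance that each of four subjects has a lesion hitting a locus.
p_hit <- c(0.01, 0.02, 0.005, 0.03)
pmf <- bernoulli_conv(p_hit)

# Upper-tail probability of seeing at least `observed` affected subjects.
observed <- 2
p_value <- sum(pmf[(observed + 1):length(pmf)])
```

When all probabilities are equal, the result reduces to the ordinary binomial distribution, which gives a simple sanity check against `dbinom()`.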

For each gene, the function calculates:

A p-value for the enrichment of lesion events
An FDR-adjusted q-value using the Pounds & Cheng (2006) method
Significance of multi-lesion constellation patterns (e.g., the p-value for a locus being affected by one, two, or more lesion types)
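
The step from per-gene p-values to adjusted values can be sketched as follows. Note that this example deliberately swaps in the Benjamini-Hochberg adjustment from base R's `p.adjust()`; it is not the Pounds & Cheng (2006) method that `grin.stats` uses, so the numbers it produces will generally differ from the package's q-values. The p-values are invented for illustration.

```r
# Hypothetical per-gene p-values (made up for illustration).
p <- c(geneA = 0.001, geneB = 0.01, geneC = 0.02, geneD = 0.5)

# Benjamini-Hochberg adjustment, used here only as a stand-in for the
# Pounds & Cheng (2006) method named in the documentation.
q_bh <- p.adjust(p, method = "BH")

# Genes whose adjusted value falls below a chosen threshold.
significant <- names(q_bh)[q_bh < 0.05]
```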

Value

A list containing:

`gene.hits`	A `data.frame` of GRIN results for each gene, including annotation, subject/hit counts by lesion type, and p/q-values for individual and multi-lesion constellation significance.
`lsn.data`	The original lesion input data.
`gene.data`	The original gene annotation input data.
`gene.lsn.data`	A `data.frame` where each row represents a gene-lesion overlap. Includes columns `"gene"` (Ensembl ID) and `"ID"` (sample ID).
`chr.size`	The chromosome size reference table used in computations.
`gene.index`	Indexes linking genes to rows in `gene.lsn.data` by chromosome.
`lsn.index`	Indexes linking lesions to rows in `gene.lsn.data`.
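
The returned list can be explored along these lines. This is a sketch that assumes the GRIN2 package and its bundled example datasets are installed; the analysis is skipped otherwise. Only element and column names stated above are used, since the full column layout of `gene.hits` is not listed here. The variable `grin_available` is introduced for illustration.

```r
# Run only when GRIN2 is installed, so the snippet is safe to source anywhere.
grin_available <- requireNamespace("GRIN2", quietly = TRUE)

if (grin_available) {
  # Load the example datasets shipped with the package.
  data(lesion_data, package = "GRIN2")
  data(hg38_gene_annotation, package = "GRIN2")
  data(hg38_chrom_size, package = "GRIN2")

  grin.results <- GRIN2::grin.stats(lesion_data,
                                    hg38_gene_annotation,
                                    hg38_chrom_size)

  # Elements of the returned list, as described in the Value section.
  print(names(grin.results))

  # Per-gene results table: one row per gene.
  print(dim(grin.results$gene.hits))

  # Gene-lesion overlaps: Ensembl gene ID paired with sample ID.
  print(head(grin.results$gene.lsn.data[, c("gene", "ID")]))
}
```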

Author(s)

Abdelrahman Elsayed <abdelrahman.elsayed@stjude.org>, Stanley Pounds <stanley.pounds@stjude.org>

References

Pounds, S. et al. (2013). A genomic random interval model for statistical analysis of genomic lesion data.

Cao, X., Elsayed, A. H., & Pounds, S. B. (2023). Statistical Methods Inspired by Challenges in Pediatric Cancer Multi-omics.

Examples

data(lesion_data)
data(hg38_gene_annotation)
data(hg38_chrom_size)

# Example 1: Run GRIN with user-supplied gene annotation and chromosome size data:
grin.results <- grin.stats(lesion_data,
                           hg38_gene_annotation,
                           hg38_chrom_size)

# Example 2: Specify the genome version to retrieve the gene annotation and
# chromosome size data automatically:
# grin.results <- grin.stats(lesion_data,
#                            genome.version = "Human_GRCh38")

GRIN2 documentation built on June 17, 2025, 9:11 a.m.