prob.hits: Find Probability of Locus Hit


prob.hits    R Documentation

Find Probability of Locus Hit

Description

The function evaluates the probability of a locus being affected by one type of lesion or by a constellation of multiple types of lesions.

Usage

prob.hits(hit.cnt, chr.size = NULL)

Arguments

hit.cnt

output of the count.hits function, containing the number of subjects and the number of hits affecting each locus.

chr.size

data.frame with the size, in base pairs, of the 22 autosomes and of the X and Y chromosomes. The data.frame should have two columns: "chrom" with the chromosome number and "size" with the size of the chromosome in base pairs.
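
A chr.size table in this layout can also be built by hand, for example when the bundled hg19.chrom.size data set does not match the genome build in use. The sketch below is illustrative only: the object name my.chr.size is hypothetical, only four chromosomes are shown, the sizes are the hg19 lengths as commonly reported and should be verified against the build in use, and the chromosome labels ("1", "X") are an assumption that should match the labels used in the lesion and annotation data.

```r
# Hypothetical chr.size table with the two required columns.
# Only four chromosomes are shown; a real table needs all 24 rows.
my.chr.size <- data.frame(
  chrom = c("1", "2", "X", "Y"),
  size  = c(249250621, 243199373, 155270560, 59373566),
  stringsAsFactors = FALSE
)

# Basic sanity checks before passing the table to prob.hits()
stopifnot(all(c("chrom", "size") %in% names(my.chr.size)))
stopifnot(is.numeric(my.chr.size$size), all(my.chr.size$size > 0))
```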

Details

The function computes a p-value for each locus (gene or regulatory feature) being affected by each type of lesion. The p-value is based on a convolution of independent but non-identical Bernoulli distributions and indicates whether the locus carries a statistically significant abundance of lesions. In addition, an FDR-adjusted q-value is computed for each locus using the Pounds & Cheng (2006) estimator of the proportion of tests with a true null (pi.hat). The function also evaluates whether a locus is affected by a constellation of multiple types of lesions, computing p-values and adjusted q-values for the locus being affected by one type of lesion (p1), two types of lesions (p2), and so on.
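
To make the convolution of independent but non-identical Bernoulli distributions concrete, here is a minimal base R sketch. It is not the package's internal code: the function names bern.conv and hit.pvalue and the per-lesion hit probabilities in p.hit are hypothetical, chosen only to illustrate how an upper-tail p-value for the number of hits at one locus can be obtained.

```r
# Distribution of the number of hits when lesion i hits the locus
# independently with probability p[i] (sum of non-identical Bernoullis).
bern.conv <- function(p) {
  pr <- 1                      # P(0 hits) before any lesion is considered
  for (p.i in p) {
    # The new lesion either misses (count unchanged) or hits (count shifts by 1)
    pr <- c(pr * (1 - p.i), 0) + c(0, pr * p.i)
  }
  pr                           # pr[k + 1] = P(exactly k hits)
}

# Upper-tail p-value: probability of observing at least n.obs hits
hit.pvalue <- function(p, n.obs) {
  pr <- bern.conv(p)
  sum(pr[seq_along(pr) > n.obs])
}

# Hypothetical hit probabilities for three lesions at one locus
p.hit <- c(0.1, 0.2, 0.3)
bern.conv(p.hit)      # P(0), P(1), P(2), P(3) hits
hit.pvalue(p.hit, 2)  # P(at least 2 hits)
```

For these three probabilities the distribution of the hit count is 0.504, 0.398, 0.092 and 0.006, so the probability of at least two hits is 0.098. Updating the distribution one lesion at a time keeps the cost quadratic in the number of lesions instead of enumerating every hit/miss combination.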

Value

A list with the following components:

gene.hits

data table of GRIN results that includes gene annotation, the number of subjects affected by each lesion type (for example gain, loss, mutation), and the number of hits affecting each locus. The table also includes p-values and FDR-adjusted q-values for each locus being affected by one or a constellation of multiple types of lesions.

lsn.data

input lesion data

gene.data

input gene annotation data

gene.lsn.data

each row represents a gene overlapped by a certain lesion. Column "gene" shows the Ensembl ID of the overlapped gene and column "ID" has the patient ID.

chr.size

data table showing the size, in base pairs, of the 22 autosomes and of the X and Y chromosomes.

gene.index

data.frame with overlapped gene-lesion data rows that belong to each chromosome in the gene.lsn.data table.

lsn.index

data.frame that shows the overlapped gene-lesion data rows that belong to each lesion in the gene.lsn.data table.
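
The q-values reported in gene.hits rely on an estimate of the proportion of tests with a true null (pi.hat). As a rough sketch, assuming the simplest form of the Pounds & Cheng (2006) estimator, min(1, 2 * mean(p)), and a Benjamini-Hochberg adjustment scaled by pi.hat, the computation could look like the following. The function name robust.q and the example p-values are hypothetical, and the package's internal implementation may differ.

```r
# Assumed form: pi.hat = min(1, 2 * mean(p)); q = pi.hat * BH-adjusted p
robust.q <- function(p) {
  pi.hat <- min(1, 2 * mean(p, na.rm = TRUE))  # estimated proportion of true nulls
  q <- pi.hat * p.adjust(p, method = "BH")     # scale the BH-adjusted p-values
  list(q = q, pi.hat = pi.hat)
}

# Hypothetical p-values for five loci
p.example <- c(0.001, 0.01, 0.02, 0.04, 0.3)
res <- robust.q(p.example)
res$pi.hat  # estimated proportion of true nulls
res$q       # q-value for each locus
```

Because pi.hat is at most 1, the resulting q-values are never larger than the plain Benjamini-Hochberg adjusted p-values.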

Author(s)

Stanley Pounds stanley.pounds@stjude.org

References

Pounds, Stan, et al. (2013) A genomic random interval model for statistical analysis of genomic lesion data.

Cao, X., Elsayed, A. H., & Pounds, S. B. (2023). Statistical Methods Inspired by Challenges in Pediatric Cancer Multi-omics.

See Also

prep.gene.lsn.data(), find.gene.lsn.overlaps(), count.hits()

Examples

library(GRIN2)

data(lesion.data)
data(hg19.gene.annotation)
data(hg19.chrom.size)

# prepare gene and lesion data for later computations:
prep.gene.lsn <- prep.gene.lsn.data(lesion.data,
                                    hg19.gene.annotation)

# determine lesions that overlap each gene (locus):
gene.lsn.overlap <- find.gene.lsn.overlaps(prep.gene.lsn)

# count the number of subjects affected by different types of lesions
# and the number of hits that affect each locus:
count.subj.hits <- count.hits(gene.lsn.overlap)

# compute the probability of each locus being affected by one or
# a constellation of multiple types of lesions:
hits.prob <- prob.hits(count.subj.hits, hg19.chrom.size)

GRIN2 documentation built on April 4, 2025, 1:41 a.m.