KW.hit.express: Associate Lesion with Expression Data Using Kruskal-Wallis...

View source: R/KW.hit.express.R

KW.hit.expressR Documentation

Associate Lesion with Expression Data Using Kruskal-Wallis Test

Description

Function uses Kruskal-Wallis test to evaluate the association between lesion groups and expression level of the same corresponding gene.

Usage

KW.hit.express(alex.data, gene.annotation, min.grp.size = NULL)

Arguments

alex.data

output of the alex.prep.lsn.expr function. It's a list of three data tables that include "row.mtch", "alex.expr" with expression data, "alex.lsn" with lesion data. Rows of alex.expr, and "alex.lsn" matrices are ordered by gene ensembl IDs and columns are ordered by patient ID.

gene.annotation

Gene annotation data either provided by the user or retrieved from ensembl BioMart database using get.ensembl.annotation function included in the GRIN2.0 library. Data.frame should has four columns: "gene" which is the ensembl ID of annotated genes, "chrom" which is the chromosome on which the gene is located, "loc.start" which is the gene start position, and "loc.end" the gene end position.

min.grp.size

Minimum number of subjects in a lesion group to be included in the KW test (there should be at least two groups with number of patients > min.grp.size) to run the KW test for a certain gene.

Details

The function uses the ensembl IDs in each row of the row.mtch file and run the Kruskal-Wallis test for association between lesion groups of the gene in the "hit.row" column with expression level of the gene in the "expr.row" column. IDs in the two columns should be the same if the KW test will be used to evaluate association between lesion groups and expression level of the same corresponding gene. If the same patient is affected with multiple types of lesions in the same gene for example gain AND mutations, the entry will be denoted as "multiple" and patients without any type of lesions will be coded as "none".

Value

A data table with multiple columns that include:

gene

ensembl ID of the gene of interest.

gene.name

Gene name of the gene of interest.

p.KW

Kruskal-Wallis test p-value.

q.KW

Kruskal-Wallis test FDR adjusted q-value.

_n.subjects

Multiple columns with number of subjects with each type of lesion affecting the gene, number of subjects without any lesion and number of subjects with multiple types of lesions.

_mean

Multiple columns with mean expression level of the gene in subjects with each type of lesion, mean expression in subjects without any lesion and mean expression in subjects with multiple types of lesions.

_median

Multiple columns with median expression of the gene in subjects with each type of lesion, median expression in subjects without any lesion and median expression in subjects with multiple types of lesions.

_sd

Multiple columns with standard deviation of the expression level of the gene in subjects with each type of lesion, standard deviation in subjects without any lesion and standard deviation in subjects with multiple types of lesions.

Author(s)

Abdelrahman Elsayed abdelrahman.elsayed@stjude.org and Stanley Pounds stanley.pounds@stjude.org

References

Myles Hollander and Douglas A. Wolfe (1973), Nonparametric Statistical Methods. New York: John Wiley & Sons. Pages 115–120.

Cao, X., Elsayed, A. H., & Pounds, S. B. (2023). Statistical Methods Inspired by Challenges in Pediatric Cancer Multi-omics.

See Also

alex.prep.lsn.expr()

Examples

data(expr.data)
data(lesion.data)
data(hg19.gene.annotation)

# prepare expression, lesion data and return the set of genes with both types of data available
# ordered by gene IDs in rows and patient IDs in columns:
alex.data=alex.prep.lsn.expr(expr.data, lesion.data,
                             hg19.gene.annotation, min.expr=1, min.pts.lsn=5)

# run Kruskal-Wallis test for association between lesion groups and expression level of the
# same corresponding gene:
alex.kw.results=KW.hit.express(alex.data, hg19.gene.annotation, min.grp.size=5)

GRIN2 documentation built on April 4, 2025, 1:41 a.m.