View source: R/alex.prep.lsn.expr.R
alex.prep.lsn.expr | R Documentation |
Prepares lesion and expression data matrices for the KW.hit.express function, which runs the Kruskal-Wallis test for association between lesion groups and the expression level of each gene that has both lesion and expression data available.
alex.prep.lsn.expr(
  expr.mtx,
  lsn.data,
  gene.annotation,
  min.expr = NULL,
  min.pts.lsn = NULL
)
expr.mtx |
Normalized, log2-transformed expression data provided by the user, with genes in rows and subjects in columns. The first column, "ensembl.ID", should contain the gene Ensembl IDs. |
lsn.data |
Lesion data in GRIN-compatible format. The data frame should have five columns: "ID" with the patient ID, "chrom" with the chromosome on which the lesion is located, "loc.start" with the lesion start position, "loc.end" with the lesion end position, and "lsn.type" with the lesion type (for example gain, loss, mutation, or fusion). |
gene.annotation |
Gene annotation data, either provided by the user or retrieved from the Ensembl BioMart database using the get.ensembl.annotation function included in the GRIN2.0 library. The data frame should have four columns: "gene" with the Ensembl ID of the annotated gene, "chrom" with the chromosome on which the gene is located, "loc.start" with the gene start position, and "loc.end" with the gene end position. |
min.expr |
Minimum allowed expression level of a gene, computed as the sum of its expression levels across all patients; useful for excluding genes with very low expression. |
min.pts.lsn |
Minimum number of patients with any type of lesion in a gene; genes affected in fewer patients are excluded from the lesion matrix. |
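The input layouts described above can be illustrated with toy data. This is only a sketch: every value and ID below is made up, and the missing.cols helper is hypothetical (it is not part of GRIN2); it simply reports which required columns a data frame lacks.

```r
# Toy inputs in the documented layouts (all IDs and values are invented).

# Expression data: first column "ensembl.ID", then one column per subject.
expr.mtx <- data.frame(
  ensembl.ID = c("ENSG_TOY_A", "ENSG_TOY_B", "ENSG_TOY_C"),
  PT1 = c(5.2, 0.1, 7.8),
  PT2 = c(4.9, 0.0, 8.1),
  PT3 = c(5.5, 0.2, 7.5),
  stringsAsFactors = FALSE
)

# Lesion data: one row per lesion, with the five required columns.
lsn.data <- data.frame(
  ID        = c("PT1", "PT2", "PT3"),
  chrom     = c("1", "1", "2"),
  loc.start = c(1000, 1500, 3000),
  loc.end   = c(2000, 1501, 9000),
  lsn.type  = c("loss", "mutation", "gain"),
  stringsAsFactors = FALSE
)

# Gene annotation: one row per gene, with the four required columns.
gene.annotation <- data.frame(
  gene      = c("ENSG_TOY_A", "ENSG_TOY_B", "ENSG_TOY_C"),
  chrom     = c("1", "1", "2"),
  loc.start = c(900, 5000, 2500),
  loc.end   = c(2100, 6000, 9500),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of GRIN2): list required columns that are absent.
missing.cols <- function(df, required) setdiff(required, colnames(df))

lsn.missing  <- missing.cols(lsn.data,
                             c("ID", "chrom", "loc.start", "loc.end", "lsn.type"))
annot.missing <- missing.cols(gene.annotation,
                              c("gene", "chrom", "loc.start", "loc.end"))
```

Both lsn.missing and annot.missing are empty character vectors when all required columns are present, so a quick length check before calling alex.prep.lsn.expr may catch misnamed columns early.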
The function uses the prep.lsn.type.matrix function to prepare a lesion matrix in which each gene is represented by one row that includes all lesion types. It then prepares the lesion and expression data matrices for the KW.hit.express function, which runs the Kruskal-Wallis test. Only genes with both lesion and expression data are kept, with rows ordered by Ensembl ID and columns ordered by patient ID.
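The filtering and alignment described here can be sketched in base R on toy matrices. This is not the package's implementation: the data are invented, the "none" label for unaffected patients, the use of >= for both thresholds, and the restriction to patients present in both matrices are all assumptions made for illustration.

```r
# Toy expression matrix: genes in rows, patients in columns (made-up values).
expr <- matrix(
  c(5.2, 4.9, 5.5,
    0.1, 0.0, 0.2,
    7.8, 8.1, 7.5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ENSG_TOY_C", "ENSG_TOY_A", "ENSG_TOY_B"),
                  c("PT3", "PT1", "PT2"))
)

# Toy lesion matrix: one row per gene; "none" (assumed label) marks no lesion.
lsn <- matrix(
  c("none",     "loss", "gain", "none",
    "mutation", "none", "none", "loss"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("ENSG_TOY_B", "ENSG_TOY_C"),
                  c("PT2", "PT4", "PT1", "PT3"))
)

# Illustrative min.expr filter: keep genes whose summed expression reaches the cutoff.
min.expr <- 1
expr <- expr[rowSums(expr) >= min.expr, , drop = FALSE]

# Illustrative min.pts.lsn filter: keep genes with enough lesion-affected patients.
min.pts.lsn <- 2
lsn <- lsn[rowSums(lsn != "none") >= min.pts.lsn, , drop = FALSE]

# Keep genes (and, by assumption, patients) found in both matrices, sorted by ID.
shared.genes <- sort(intersect(rownames(expr), rownames(lsn)))
shared.pts   <- sort(intersect(colnames(expr), colnames(lsn)))

expr.aligned <- expr[shared.genes, shared.pts, drop = FALSE]
lsn.aligned  <- lsn[shared.genes, shared.pts, drop = FALSE]
```

After these steps the two matrices share identical row and column names, which is the property a row-by-row test across lesion groups relies on.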
A list with the following components:
alex.expr |
Expression data with gene ensembl IDs as row names and patient IDs as column names. Rows are ordered by ensembl ID and columns ordered by patient IDs. |
alex.lsn |
Lesion data for genes in the expression data matrix with gene ensembl IDs as row names and patient IDs as column names. Rows are ordered by ensembl ID and columns ordered by patient IDs. |
alex.row.mtch |
Data frame with two columns holding the Ensembl IDs of genes in the expression and lesion data matrices; the IDs in the two columns should be identical in each row. |
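A returned list can be checked against the properties listed above. The sketch below defines a hypothetical helper, check.alex.data (not part of GRIN2), and runs it on a toy list that mimics the documented component names; the column names expr.row and lsn.row and all values are invented for illustration.

```r
# Hypothetical helper (not part of GRIN2): verify that the expression and lesion
# matrices share row and column names and that the row-match table agrees.
check.alex.data <- function(alex.data) {
  expr <- alex.data$alex.expr
  lsn  <- alex.data$alex.lsn
  mtch <- alex.data$alex.row.mtch
  stopifnot(
    identical(rownames(expr), rownames(lsn)),
    identical(colnames(expr), colnames(lsn)),
    all(as.character(mtch[[1]]) == as.character(mtch[[2]]))
  )
  invisible(TRUE)
}

# Toy list with the documented component names (values are made up).
genes <- c("ENSG_TOY_A", "ENSG_TOY_B")
pts   <- c("PT1", "PT2")
toy <- list(
  alex.expr = matrix(c(5.2, 7.8, 4.9, 8.1), nrow = 2,
                     dimnames = list(genes, pts)),
  alex.lsn = matrix(c("none", "loss", "gain", "none"), nrow = 2,
                    dimnames = list(genes, pts)),
  alex.row.mtch = data.frame(expr.row = genes, lsn.row = genes,
                             stringsAsFactors = FALSE)
)

ok <- check.alex.data(toy)
```

The helper stops with an error at the first violated condition, so it can be dropped in front of a downstream call as a cheap sanity check.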
Abdelrahman Elsayed abdelrahman.elsayed@stjude.org and Stanley Pounds stanley.pounds@stjude.org
Cao, X., Elsayed, A. H., & Pounds, S. B. (2023). Statistical Methods Inspired by Challenges in Pediatric Cancer Multi-omics.
KW.hit.express()
data(expr.data)
data(lesion.data)
data(hg19.gene.annotation)
# Prepare expression and lesion data, returning the set of genes with both types
# of data available, ordered by gene ID in rows and patient ID in columns:
alex.data <- alex.prep.lsn.expr(expr.data, lesion.data,
                                hg19.gene.annotation, min.expr = 1,
                                min.pts.lsn = 5)