getMappedEntrezIDs: Get mapped Entrez Gene IDs from CpG probe names


getMappedEntrezIDs    R Documentation

Get mapped Entrez Gene IDs from CpG probe names

Description

Given a set of CpG probe names and, optionally, all the CpG sites tested, this function returns a list containing the mapped Entrez Gene IDs, the number of probes per gene, and a vector indicating which genes are significant.

Usage

getMappedEntrezIDs(
  sig.cpg,
  all.cpg = NULL,
  array.type = c("450K", "EPIC"),
  anno = NULL,
  genomic.features = c("ALL", "TSS200", "TSS1500", "Body", "1stExon", "3'UTR", "5'UTR",
    "ExonBnd")
)

Arguments

sig.cpg

Character vector of significant CpG sites used for testing gene set enrichment.

all.cpg

Character vector of all CpG sites tested. Defaults to all CpG sites on the array.

array.type

The Illumina methylation array used. Options are "450K" or "EPIC".

anno

Optional. A DataFrame object containing the complete array annotation as generated by the minfi getAnnotation function. Supplying it speeds up execution.

genomic.features

Character vector or scalar indicating whether the gene set enrichment analysis should be restricted to CpGs from specific genomic locations. Options are "ALL", "TSS200", "TSS1500", "Body", "1stExon", "3'UTR", "5'UTR", "ExonBnd"; the user can select any combination. Defaults to "ALL".
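As a rough illustration of what restricting by genomic feature means, the base R sketch below filters a toy annotation table. The column names UCSC_RefGene_Name and UCSC_RefGene_Group are assumed to mirror the Illumina annotation layout (semicolon-separated entries), the probe and gene names are hypothetical, and this is not the package's own code.

```r
# Toy annotation: each probe lists genes and feature groups,
# semicolon-separated and position-matched (assumed layout).
ann <- data.frame(
  Name               = c("cg01", "cg02", "cg03"),
  UCSC_RefGene_Name  = c("GENEA;GENEA", "GENEB", "GENEC;GENED"),
  UCSC_RefGene_Group = c("TSS200;Body", "Body", "3'UTR;TSS1500"),
  stringsAsFactors   = FALSE
)

# Features the analysis should be restricted to
genomic.features <- c("TSS200", "TSS1500")

# Flatten to one row per probe/gene/feature combination
genes  <- strsplit(ann$UCSC_RefGene_Name, ";")
groups <- strsplit(ann$UCSC_RefGene_Group, ";")
flat <- data.frame(
  cpg    = rep(ann$Name, lengths(genes)),
  symbol = unlist(genes),
  group  = unlist(groups),
  stringsAsFactors = FALSE
)

# Keep only the probe/gene pairs annotated to a selected feature
kept <- flat[flat$group %in% genomic.features, ]
print(kept)
```

In this toy table, cg02 is dropped entirely because its only annotation is "Body".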

Details

This function is used by the gene set testing functions gometh and gsameth. It maps the significant CpG probe names, as well as all the CpG sites tested, to Entrez Gene IDs. It also calculates the number of probes per gene. Input CpGs can be restricted by genomic feature using the genomic.features argument.
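The mapping and counting steps described above can be sketched on toy data in base R. The probe names, Entrez IDs, and the map table are hypothetical, and the sketch only illustrates the idea; it is not the package's implementation.

```r
# Toy probe-to-gene map (hypothetical IDs); a probe may map to several genes
map <- data.frame(
  cpg    = c("cg01", "cg01", "cg02", "cg03", "cg04", "cg04"),
  entrez = c("10",   "20",   "10",   "30",   "20",   "30"),
  stringsAsFactors = FALSE
)

all.cpg <- c("cg01", "cg02", "cg03", "cg04")  # all CpGs tested
sig.cpg <- c("cg02", "cg03")                  # significant CpGs

# Restrict the map to tested probes and drop duplicate probe/gene pairs
tested <- unique(map[map$cpg %in% all.cpg, ])

universe <- unique(tested$entrez)                           # genes covered by tested probes
sig.eg   <- unique(tested$entrez[tested$cpg %in% sig.cpg])  # genes hit by significant probes
freq     <- table(tested$entrez)                            # number of probes per gene
de       <- as.integer(universe %in% sig.eg)                # 1 if the gene is significant

print(freq)
print(data.frame(universe, de))
```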

Genes associated with each CpG site are obtained from the annotation package IlluminaHumanMethylation450kanno.ilmn12.hg19 if the array type is "450K". For the EPIC array, the annotation package IlluminaHumanMethylationEPICanno.ilm10b4.hg19 is used. To use a different annotation package, please supply it using the anno argument.

Value

A list with the following elements:

sig.eg

mapped Entrez Gene IDs for the significant probes

universe

mapped Entrez Gene IDs for all probes on the array, or for all the CpG probes tested.

freq

table output with numbers of probes associated with each gene

equiv

table output with equivalent numbers of probes associated with each gene taking into account multi-gene bias

de

a vector of ones and zeroes, of the same length as universe, indicating which genes in the universe are significantly differentially methylated.

fract.counts

a data frame with 2 columns corresponding to the Entrez Gene IDs for the significant probes and the associated weight to account for multi-gene probes.
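One plausible way to read the equiv and fract.counts elements is that a probe annotated to k genes contributes 1/k to each of them. The base R sketch below works through that weighting on toy data. The weighting rule is an assumption made for illustration, and the probe names and Entrez IDs are hypothetical.

```r
# Toy probe-to-gene map (hypothetical IDs)
map <- data.frame(
  cpg    = c("cg01", "cg01", "cg02", "cg03", "cg04", "cg04"),
  entrez = c("10",   "20",   "10",   "30",   "20",   "30"),
  stringsAsFactors = FALSE
)

# Number of genes each probe maps to
n.genes <- table(map$cpg)

# Assumed weighting: a probe shared by k genes counts 1/k towards each gene
map$weight <- 1 / as.vector(n.genes[map$cpg])

# Equivalent number of probes per gene: sum of the fractional contributions
equiv <- tapply(map$weight, map$entrez, sum)
print(equiv)
```

Here gene "10" has two probes, but one of them is shared with gene "20", so its equivalent probe count is 1.5 rather than 2.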

Author(s)

Belinda Phipson

See Also

gometh, gsameth

Examples


## Not run:  # to avoid timeout on Bioconductor build
library(missMethyl)  # provides getMappedEntrezIDs
library(IlluminaHumanMethylation450kanno.ilmn12.hg19)
library(org.Hs.eg.db)
library(limma)
ann <- getAnnotation(IlluminaHumanMethylation450kanno.ilmn12.hg19)

# Randomly select 1000 CpGs to be significantly differentially methylated
sigcpgs <- sample(rownames(ann),1000,replace=FALSE)

# All CpG sites tested
allcpgs <- rownames(ann)

mappedEz <- getMappedEntrezIDs(sigcpgs,allcpgs,array.type="450K")
names(mappedEz)
# Entrez IDs of the significant genes
mappedEz$sig.eg[1:10]
# Entrez IDs for the universe
mappedEz$universe[1:10]
# Number of CpGs per gene
mappedEz$freq[1:10]
# Equivalent numbers of CpGs measured per gene
mappedEz$equiv[1:10]
# A vector of 0s and 1s indicating which genes in the universe are significant
mappedEz$de[1:10]

## End(Not run)


Oshlack/missMethyl documentation built on March 26, 2023, 1:50 p.m.