xSNP2cGenes: Function to define HiC genes given a list of SNPs

View source: R/xSNP2cGenes.r

xSNP2cGenes    R Documentation

Function to define HiC genes given a list of SNPs

Description

xSNP2cGenes defines HiC genes given a list of SNPs. The HiC weight is calculated from the Cumulative Distribution Function (CDF) of HiC interaction scores.

Usage

xSNP2cGenes(
  data,
  entity = c("SNP", "chr:start-end", "data.frame", "bed", "GRanges"),
  include.HiC = NA,
  GR.SNP = c("dbSNP_GWAS", "dbSNP_Common"),
  cdf.function = c("empirical", "exponential"),
  plot = FALSE,
  verbose = TRUE,
  RData.location = "http://galahad.well.ox.ac.uk/bigdata",
  guid = NULL
)

Arguments

data

an input vector containing SNPs. SNPs should be provided as dbSNP IDs (i.e. starting with rs) or in the format 'chrN:xxx', where N is either 1-22 or X and xxx is a number; for example, 'chr16:28525386'. Alternatively, the input can be in other formats/entities (see the next parameter 'entity')
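The two accepted SNP formats can be checked before calling the function. The following is a minimal sketch in base R; `classify_snp_id` is a hypothetical helper, not part of XGR, and the identifiers are illustrative only.

```r
# Hypothetical helper (not part of XGR): label each identifier by its format.
classify_snp_id <- function(ids) {
  # dbSNP ID: "rs" followed by digits
  is_rsid <- grepl("^rs[0-9]+$", ids)
  # Position: "chrN:xxx" with N in 1-22 or X, and xxx a number
  is_pos <- grepl("^chr([1-9]|1[0-9]|2[0-2]|X):[0-9]+$", ids)
  ifelse(is_rsid, "dbSNP", ifelse(is_pos, "chr:pos", "unrecognised"))
}

# Illustrative identifiers (assumed for this sketch)
ids <- c("rs12345", "chr16:28525386", "chrX:1500", "chr23:100")
classification <- classify_snp_id(ids)
print(data.frame(id = ids, format = classification))
```

Entries labelled "unrecognised" would need to be fixed or supplied through another 'entity' format.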

entity

the data entity. By default, it is "SNP". For general use, it can also be one of "chr:start-end", "data.frame", "bed" or "GRanges"

include.HiC

specifies whether genes linked to the input SNPs are also included. By default, it is 'NA', which disables this option. Otherwise, genes linked to the SNPs are included according to Promoter Capture HiC (PCHiC) datasets. Pre-built HiC datasets are detailed in xDefineHIC

GR.SNP

the genomic regions of SNPs. By default, it is 'dbSNP_GWAS', that is, SNPs from dbSNP (version 146) restricted to GWAS SNPs and their LD SNPs (hg19). It can be 'dbSNP_Common', that is, Common SNPs from dbSNP (version 146) plus GWAS SNPs and their LD SNPs (hg19). Alternatively, the user can supply a customised input. To do so, first save your RData file (containing a GR object) on your local computer, and make sure the names of the GR object refer to dbSNP IDs. Then, set "GR.SNP" to your RData file name (with or without extension), and specify the path to your RData file in "RData.location". Note: you can also load your customised GR object directly

cdf.function

a character specifying a Cumulative Distribution Function (CDF). It can be either 'empirical' for the empirical CDF or 'exponential' for the exponential CDF
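To illustrate how a CDF turns interaction scores into weights between 0 and 1, here is a minimal sketch in base R. The scores are made up, and estimating the exponential rate as 1/mean is an assumption for illustration; the source does not state how xSNP2cGenes fits the exponential distribution.

```r
# Illustrative interaction scores (made up, not from any PCHiC dataset)
scores <- c(5.2, 7.8, 6.1, 12.4, 5.0)

# Empirical CDF: fraction of scores less than or equal to each score
weight_empirical <- stats::ecdf(scores)(scores)

# Exponential CDF, with the rate estimated as 1/mean (an assumption)
rate <- 1 / mean(scores)
weight_exponential <- stats::pexp(scores, rate = rate)

print(data.frame(Sig = scores,
                 empirical = weight_empirical,
                 exponential = round(weight_exponential, 3)))
```

In both cases a higher score maps to a higher weight; the two options differ only in the shape of the mapping.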

plot

logical to indicate whether the histogram plot (plus density or CDF plot) should be drawn. By default, it is set to FALSE for no plotting

verbose

logical to indicate whether messages will be displayed on the screen. By default, it is set to TRUE for display

RData.location

a character string specifying the location of built-in RData files. See xRDataLoader for details

guid

a valid (5-character) Global Unique IDentifier for an OSF project. See xRDataLoader for details

Value

a data frame with the following columns:

  • Gene: SNP-interacting genes captured by HiC

  • SNP: SNPs

  • Sig: the interaction score (the higher, the stronger)

  • Weight: the HiC weight
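A data frame with these columns can be summarised per gene using base R. The sketch below uses a mock data frame with made-up values (not real output of xSNP2cGenes) and keeps the highest weight observed for each gene; taking the maximum is one possible choice, not a method prescribed by the package.

```r
# Mock result with the documented columns (values are made up)
df_mock <- data.frame(
  Gene = c("GENE_A", "GENE_A", "GENE_B"),
  SNP = c("rs1", "rs2", "rs2"),
  Sig = c(6.1, 12.4, 5.0),
  Weight = c(0.6, 1.0, 0.2),
  stringsAsFactors = FALSE
)

# Keep the maximum HiC weight per gene, then sort genes by that weight
best <- aggregate(Weight ~ Gene, data = df_mock, FUN = max)
best <- best[order(-best$Weight), ]
print(best)
```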

Note

none

See Also

xDefineHIC, xSymbol2GeneID

Examples

RData.location <- "http://galahad.well.ox.ac.uk/bigdata"
## Not run: 
# a) provide the SNPs with the significance info
data(ImmunoBase)
data <- names(ImmunoBase$AS$variants)

# b) define HiC genes
df_cGenes <- xSNP2cGenes(data, include.HiC="Monocytes",
                         RData.location=RData.location)

## End(Not run)

hfang-bristol/XGR documentation built on Feb. 4, 2023, 7:05 a.m.