getFSNPs: Functional Identification of SNPs with Phenotype by...


View source: R/funcisnp.R

Description

Given a set of known tag-SNPs associated with a particular phenotype (e.g. disease, trait) and a set of available biological features (e.g. peaks derived from ChIP-seq experiments relevant to the phenotype), returns correlated SNPs (from the 1000 genomes db) that are in linkage disequilibrium (LD) with a known disease-associated tag-SNP and that overlap chromatin biological features. These correlated SNPs are characterized as putative functional SNPs for the trait.

Usage

getFSNPs(snp.regions.file, bio.features.loc = NULL,
                     built.in.biofeatures = TRUE,
                     par.threads=detectCores()/2,
                     verbose = par.threads < 2, method.p = "BH",
                     search.window = 200000,
                     primary.server = "ebi")

Arguments

snp.regions.file

path: Location of the regions file. The regions file is tab-delimited and contains three elements per row. The first element defines the genomic location of the tagSNP, 'chr:position' (e.g. 5:5420030). The second element contains the tagSNP name, 'rsID' (e.g. rs6010620). The third element defines the 'POPULATION' in which the tagSNP was identified (ASN, EUR, AFR, or ALL).

The SNP regions file is imported and each row (one tagSNP) is parsed for the tagSNP name (rsXXXX), population (ASN, EUR, AFR, or ALL), and genomic location. The genomic location is used to position the search window (see the 'search.window' argument). See the example file here: file.path(system.file('extdata',package='FunciSNP'), dir(system.file('extdata',package='FunciSNP'), pattern='\\.snp$'))
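The three-column layout described above can be sketched in base R as follows. This is a minimal illustration, not the package's own parser: the column names, the absence of a header row, the `.snp` extension, and the second row's values are assumptions, and the first row reuses the example values from the text purely as placeholders (they are not claimed to describe the same real SNP).

```r
# Hypothetical regions table: location ('chr:position'), rsID, population.
regions <- data.frame(
  location   = c("5:5420030", "20:62309839"),
  rsid       = c("rs6010620", "rs0000001"),   # placeholder identifiers
  population = c("ASN", "EUR"),
  stringsAsFactors = FALSE
)

# Write a tab-delimited file with no header and no quoting (assumed layout).
regions.file <- tempfile(fileext = ".snp")
write.table(regions, regions.file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read it back and split 'chr:position' into its two parts.
parsed <- read.delim(regions.file, header = FALSE,
                     col.names = c("location", "rsid", "population"),
                     stringsAsFactors = FALSE)
loc <- strsplit(parsed$location, ":", fixed = TRUE)
parsed$chr      <- vapply(loc, `[`, character(1), 1)
parsed$position <- as.integer(vapply(loc, `[`, character(1), 2))

# Every population code should be one of the four documented values.
stopifnot(all(parsed$population %in% c("ASN", "EUR", "AFR", "ALL")))
parsed
```

The resulting `regions.file` path is the kind of value that would be passed as `snp.regions.file`.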

bio.features.loc

path: Location of the biological features folder. Each biological feature for a particular genomic phenotype should be stored as an individual BED file (tab-delimited file with chr, start and end). See UCSC for more information about BED formats: http://genome.ucsc.edu/FAQ/FAQformat.html#format1. See the example below. Default set to NULL.
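A folder of this shape could be prepared as in the sketch below. The folder name, file name, and coordinates are hypothetical, and the 'chr' prefix follows the UCSC BED convention; whether FunciSNP requires the prefix or a particular file extension is not stated here, so treat both as assumptions.

```r
# Hypothetical folder holding one BED file per biological feature.
bio.dir <- file.path(tempdir(), "biofeatures")
dir.create(bio.dir, showWarnings = FALSE)

# Three BED columns: chromosome, start, end (made-up peak coordinates).
peaks <- data.frame(
  chr   = c("chr5", "chr5"),
  start = c(5419000L, 5500100L),
  end   = c(5421000L, 5500500L)
)

# BED files are tab-delimited, with no header row and no quoting.
bed.file <- file.path(bio.dir, "ExampleFeature.bed")
write.table(peaks, bed.file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the file back to confirm the layout.
bed <- read.delim(bed.file, header = FALSE,
                  col.names = c("chr", "start", "end"))
bed
```

Here `bio.dir` is the kind of path that would be passed as `bio.features.loc`.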

built.in.biofeatures

logical: Whether to include promoter regions, ENCODE DNaseI sites and CTCF sites as additional biofeatures in the analysis. Promoters are defined as -1000 to +100 bp of a known TSS. Files were extracted on Feb. 9, 2012 from the UCSC genome table browser. Default set to TRUE.
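The promoter definition above (-1000 to +100 bp of a TSS) can be written out as a small helper. `promoter_window` is a hypothetical function, not part of FunciSNP, and the strand handling (upstream meaning lower coordinates on '+' and higher coordinates on '-') is an assumption about how such windows are usually oriented.

```r
# Hypothetical helper: promoter interval around a TSS, oriented by strand.
promoter_window <- function(tss, strand = "+",
                            upstream = 1000L, downstream = 100L) {
  plus  <- strand == "+"
  start <- ifelse(plus, tss - upstream,   tss - downstream)
  end   <- ifelse(plus, tss + downstream, tss + upstream)
  data.frame(tss = tss, strand = strand, start = start, end = end)
}

# Two made-up TSS positions, one per strand.
promoters <- promoter_window(c(10000L, 20000L), c("+", "-"))
promoters
```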

par.threads

an integer: Number of CPU cores to use for FunciSNP analysis. Default set at detectCores()/2. If par.threads is 2 or more, then by default "verbose" = FALSE.

verbose

logical: If set to TRUE, all verbose messages are printed to the terminal regardless of the par.threads value. If set to FALSE, no verbose messages are printed, except for warnings(). The default depends on the 'par.threads' value (verbose = par.threads < 2).
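How the two defaults interact can be checked directly in R. The sketch below only mirrors the default expressions shown in the Usage section; `default_verbose` is a hypothetical helper introduced for illustration.

```r
library(parallel)

# Same expressions as the defaults in the Usage section.
# Note: detectCores() can return NA on some platforms.
cores       <- detectCores()
par.threads <- cores / 2
verbose     <- par.threads < 2

# Hypothetical helper restating the rule for any thread count.
default_verbose <- function(par.threads) par.threads < 2

default_verbose(1)   # single thread: messages shown by default
default_verbose(4)   # several threads: messages suppressed by default
```

Passing `verbose = TRUE` explicitly overrides this default regardless of the thread count.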

method.p

method: p-value correction (or adjustment) method (see ?p.adjust). Default set at "BH" (Benjamini & Hochberg (1995)).
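As an illustration of what the "BH" adjustment does, the sketch below applies base R's `p.adjust` to five made-up p-values; the input values are hypothetical and unrelated to any FunciSNP result.

```r
# Made-up raw p-values, already sorted for readability.
p.raw <- c(0.001, 0.01, 0.03, 0.04, 0.5)

# Benjamini & Hochberg adjustment, the default for 'method.p'.
p.bh <- p.adjust(p.raw, method = "BH")
p.bh

# Other methods accepted by p.adjust are listed in p.adjust.methods.
p.adjust.methods
```

With these inputs the adjusted values are 0.005, 0.025, 0.05, 0.05 and 0.5: each raw p-value is scaled by n/rank and then made monotone from the largest value downward.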

search.window

an integer: Genomic window size used to extract all available correlated SNPs from the 1000 genomes db. The window is centered on the tagSNP position as defined in the regions file (see 'snp.regions.file'). Default set to 200000.
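A window centered on a tagSNP can be computed as in the sketch below. `tag_window` is a hypothetical helper; splitting the window evenly on both sides of the position and clamping the start at 1 are assumptions about what "centered" means here, not a description of the package internals.

```r
# Hypothetical helper: interval of width 'search.window' centered on a tagSNP.
tag_window <- function(location, search.window = 200000) {
  parts <- strsplit(location, ":", fixed = TRUE)[[1]]
  pos   <- as.numeric(parts[2])
  half  <- search.window / 2
  list(chr   = parts[1],
       start = max(1, pos - half),   # never before the chromosome start
       end   = pos + half)
}

# Example location from the 'snp.regions.file' description.
w <- tag_window("5:5420030")
w
```

With the default of 200000, the window for position 5420030 runs from 5320030 to 5520030.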

primary.server

character: The 1000 genomes ftp server to query first (see Details for the available servers). Default value is "ebi".

Details

This is the main function of FunciSNP. It identifies correlated SNPs which are in linkage disequilibrium (LD) with a known disease-associated tagSNP. It also determines whether each correlated SNP in LD with the tagSNP overlaps a genomic biological feature. Correlated SNPs are imported directly from the current public release of the 1000 genomes database. Two ftp servers host the 1000 genomes public data:

1) National Center for Biotechnology Information (NCBI): ftp://ftp-trace.ncbi.nih.gov/1000genomes/
2) European Bioinformatics Institute (EBI): ftp://ftp.1000genomes.ebi.ac.uk/vol1/
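To make the LD measures concrete, the sketch below computes D, r-squared and D' for two biallelic sites from phased 0/1 haplotype vectors using the textbook definitions. `haplotype_ld` is a hypothetical helper and the haplotypes are made up; FunciSNP's own LD calculation is not shown in this page and may differ (for example, by working from unphased genotypes). Monomorphic sites are not handled.

```r
# Hypothetical helper: pairwise LD from phased haplotypes coded 0/1.
haplotype_ld <- function(a, b) {
  stopifnot(length(a) == length(b))
  pA  <- mean(a)                    # frequency of allele 1 at site A
  pB  <- mean(b)                    # frequency of allele 1 at site B
  pAB <- mean(a == 1 & b == 1)      # frequency of the 1-1 haplotype
  D   <- pAB - pA * pB
  # Largest |D| attainable given the allele frequencies.
  Dmax <- if (D >= 0) {
    min(pA * (1 - pB), (1 - pA) * pB)
  } else {
    min(pA * pB, (1 - pA) * (1 - pB))
  }
  list(D      = D,
       r2     = D^2 / (pA * (1 - pA) * pB * (1 - pB)),
       dprime = abs(D) / Dmax)
}

# Eight made-up haplotypes at a tagSNP (a) and a candidate SNP (b).
a  <- c(1, 1, 1, 0, 0, 0, 1, 0)
b  <- c(1, 1, 0, 0, 0, 0, 1, 1)
ld <- haplotype_ld(a, b)
ld
```

For these vectors D is 0.125, r-squared is 0.25 and D' is 0.5.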

Correlated SNPs that are in LD with a tagSNP and that overlap genomic biological features are known as putative functional SNPs (also referred to as 'YAFSNP' elsewhere in the package).
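The overlap step can be illustrated in base R with a plain interval test. All SNP names and coordinates below are made up, and the comparison assumes BED-style intervals (0-based start, exclusive end) against 1-based SNP positions; the package itself may use a different overlap routine.

```r
# Made-up correlated SNPs (1-based positions).
snps <- data.frame(
  rsid = c("rsA", "rsB", "rsC"),
  chr  = c("5", "5", "5"),
  pos  = c(5419500L, 5430000L, 5500200L),
  stringsAsFactors = FALSE
)

# Made-up biological features in BED-style coordinates.
features <- data.frame(
  chr   = c("5", "5"),
  start = c(5419000L, 5500100L),
  end   = c(5421000L, 5500500L),
  stringsAsFactors = FALSE
)

# A SNP overlaps a feature when it lies on the same chromosome and
# start < pos <= end (0-based, half-open interval vs. 1-based position).
overlaps_feature <- vapply(seq_len(nrow(snps)), function(i) {
  any(features$chr == snps$chr[i] &
      features$start < snps$pos[i] &
      snps$pos[i] <= features$end)
}, logical(1))

# SNPs that would be kept as candidates under this simplified test.
snps[overlaps_feature, ]
```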

Value

TSList

FunciSNP object.

Note

NA

Author(s)

Simon G. Coetzee (maintainer: scoetzee@gmail.com); Houtan Noushmehr, PhD (houtan@usp.br)

References

SG. Coetzee, SK. Rhie, BP. Berman, GA. Coetzee and H. Noushmehr, FunciSNP: An R/Bioconductor Tool Integrating Functional Non-coding Datasets with Genetic Association Studies to Identify Candidate Regulatory SNPs., Nucleic Acids Research, In press, 2012 (doi:10.1093/nar/gks542).

See Also

FunciSNPplot, FunciSNPAnnotateSummary, FunciSNPtable, FunciSNPbed

Examples

##
## Glioblastoma analysis using FunciSNP
##
## Full path to the example regions file for Glioblastoma 
#  (collected from SNPedia)
glioma.snp <- file.path(system.file('extdata',
  package='FunciSNP'),
  dir(system.file('extdata',package='FunciSNP'), 
  pattern='\\.snp$'));
 
## Full path to the example biological features BED files 
#  derived from the ENCODE project for Glioblastoma U-87 
#  cell lines.
glioma.bio <- system.file('extdata',package='FunciSNP');

## FunciSNP analysis, extracts correlated SNPs from the 
#  1000 genomes db ("ncbi") and finds overlaps between 
#  correlated SNP and biological features and then 
#  calculates LD (Rsquare, Dprime, distance, p-value).
# Do not run. Can take more than 5 min depending on internet connection and number of CPUs.
#glioma <- getFSNPs(snp.regions.file=glioma.snp, 
#  bio.features.loc = glioma.bio);

##
data(glioma);
class(glioma);
glioma;
summary(glioma);

shraddhapai/FunciSNP documentation built on May 29, 2019, 9:26 p.m.