anno.create: Function to create a mapping object


View source: R/anno.create.R

Description

This function creates an annotatedSNPset object by mapping the input SNPs either to the supplied pathways or to the supplied SNP annotation matrix.

Usage

anno.create(snp.ids, anno.mat=NULL, path.def=NULL, u.span=250,d.span=100)

Arguments

snp.ids

A character vector of SNP rsIDs (i.e. names of SNPs for which summary data e.g. odds ratios and p-values of association have been calculated).

anno.mat

A matrix with the SNPs listed in snp.ids as rows and annotation features (e.g. ENCODE functional categories) as columns. The entries in a column may be 1/0 (denoting whether a SNP has or lacks that feature, respectively), or -1 to denote a missing value. Note that columns containing other numeric values are currently removed. Only one of the two arguments, anno.mat or path.def, may be non-null. To use both, see anno.merge.

path.def

The pathway definitions. This can be either an R list of gene symbols (HGNC) or a gene-level annotation matrix with gene symbols (HGNC) as rownames and pathways or other gene-level characteristics as columns. The entries of the matrix must be 1/0 (denoting whether the gene belongs to that pathway or has that characteristic). Only one of the two arguments, anno.mat or path.def, may be non-null. To use both, see anno.merge.

u.span

Upstream span for each transcript from transcript start. Default value is 250 kb. This is required for mapping SNPs into genes and is used only if path.def is non-null.

d.span

Downstream span from transcript end. Default value is 100 kb. This is required for mapping SNPs into genes and is used only if path.def is non-null.
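The role of u.span and d.span can be illustrated with a minimal base-R sketch. It is not the package's internal code: the transcript table, gene names, coordinates and SNP positions below are hypothetical, and it assumes that "upstream" follows the transcript's orientation, so the two spans swap sides on the minus strand.

```r
# Hypothetical transcript table; coordinates are made up for illustration.
tx <- data.frame(
  gene   = c("GENE_A", "GENE_B"),
  start  = c(1000000, 5000000),
  end    = c(1050000, 5020000),
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

u.span <- 250  # upstream span in kb (documented default)
d.span <- 100  # downstream span in kb (documented default)

# Assumption: upstream/downstream follow transcript orientation, so the
# spans swap sides for transcripts on the minus strand.
plus <- tx$strand == "+"
tx$win.start <- ifelse(plus, tx$start - u.span * 1000, tx$start - d.span * 1000)
tx$win.end   <- ifelse(plus, tx$end + d.span * 1000, tx$end + u.span * 1000)

# Hypothetical SNP positions, all on the same chromosome as the transcripts.
snp.pos <- c(rs1 = 800000, rs2 = 1140000, rs3 = 5260000, rs4 = 3000000)

# in.window[i, j] is TRUE when SNP i lies inside the window of transcript j.
in.window <- outer(snp.pos, tx$win.start, ">=") &
  outer(snp.pos, tx$win.end, "<=")
colnames(in.window) <- tx$gene
in.window
```

A SNP that falls in no window (rs4 here) maps to no gene and therefore to no pathway.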

Details

An annotatedSNPset object consists of SNP-level equivalence classes. Creating the equivalence classes is a two-step process. The first step maps the SNPs to the annotation groups provided as input. The second step partitions the SNPs into groups that map to the same unique set of annotations, i.e. into equivalence classes. If anno.mat is provided, the first step is trivial. If path.def is provided, pathways (sets of genes) are first mapped to transcripts using the knownGene table of the UCSC hg19 transcript database. Positions of the input SNPs are obtained from dbSNP build 142, and finally the findOverlaps function of the GenomicRanges package is used to map SNPs back to transcripts, genes and hence pathways.
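The partitioning step can be sketched in base R as follows. This is an illustration of the idea only, not the package's implementation; the matrix anno and its SNP and pathway names are hypothetical, and the variables snp.eq and eq.mat merely mirror the names of the corresponding items in the returned object.

```r
# Hypothetical SNP-by-annotation matrix (1 = SNP has the annotation).
anno <- matrix(
  c(1, 0, 1,
    0, 1, 0,
    1, 0, 1,
    0, 0, 0,
    0, 1, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("rs", 1:5), c("pathA", "pathB", "pathC"))
)

# Collapse each row into a key; SNPs with identical keys share a class.
key <- apply(anno, 1, paste, collapse = "")

# Class index per SNP, numbered in order of first appearance.
snp.eq <- match(key, unique(key))
names(snp.eq) <- rownames(anno)

# One row per equivalence class: the annotation pattern shared by its SNPs.
eq.mat <- anno[!duplicated(key), , drop = FALSE]
rownames(eq.mat) <- NULL

snp.eq
eq.mat
```

Indexing eq.mat by snp.eq reproduces the original annotation rows, which is what makes the class representation lossless.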

The binary matrix eq.mat[[1]] holds the equivalence classes. The i-th row of this matrix indicates the annotations to which all SNPs of the i-th equivalence class belong. For a newly created object, eq.mat is a list of length one; for a merged object, it is a list whose length equals the number of merged objects. When several objects are merged into a single object, the equivalence classes of the input objects are combined to form new equivalence classes. The eq.map matrix stores this mapping information, i.e. which input equivalence classes form each new equivalence class. For a merged object, snp.eq is created from these new equivalence classes. The eq.mat lists are concatenated so that the result holds one binary matrix from each merged object.
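The relationship between the input classes, eq.map and the merged snp.eq can be sketched in base R. The actual merging is done by anno.merge; the code below only illustrates the bookkeeping described above, using hypothetical class assignments for five SNPs in two simple objects and assuming that a merged class corresponds to a distinct combination of input classes.

```r
# Hypothetical class assignments of five SNPs in two simple objects.
snp.eq1 <- c(rs1 = 1, rs2 = 2, rs3 = 1, rs4 = 2, rs5 = 1)
snp.eq2 <- c(rs1 = 1, rs2 = 1, rs3 = 2, rs4 = 1, rs5 = 2)

# Assumption: a merged class is a distinct combination of input classes.
pairs <- cbind(ob1 = snp.eq1, ob2 = snp.eq2)
key <- paste(pairs[, "ob1"], pairs[, "ob2"], sep = "_")

# Map matrix: one row per merged class, one column per input object.
eq.map <- pairs[!duplicated(key), , drop = FALSE]
rownames(eq.map) <- NULL

# Merged class index for every SNP, in order of first appearance.
snp.eq <- match(key, unique(key))
names(snp.eq) <- names(snp.eq1)

eq.map
snp.eq
```

Looking up a SNP's merged class in eq.map recovers its class in each input object, so the per-object matrices in eq.mat can still be consulted after merging.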

Value

It returns an object of annotatedSNPset-class. This object has the following items.

snp.df

A data.frame with input rsIDs as rownames and both chromosome number and base pair position as columns.

snp.eq

A numeric vector indicating the equivalence class to which each SNP maps.

eq.mat

A list of one or more binary matrices. The dimension of the i-th matrix is the number of equivalence classes by the number of annotations of the i-th object.

eq.map

Either NULL or an integer matrix. It is NULL for a simple annotatedSNPset object. If more than one annotatedSNPset object is merged into a composite object, this field holds a map matrix. The number of columns of the map matrix is the number of merged objects, and the number of rows is the total number of new equivalence classes after merging.

dim

An R list of length 3. The first element is a numeric vector of dimensions: the number of SNPs (no_snps), the total number of equivalence classes (no_eq_class), the total number of annotations (no_anno) and the total number of merged objects (used_ob). The other two elements are either NULL (for a simple object) or numeric vectors of length equal to the number of objects used for merging, i.e. used_ob. These two vectors hold the number of rows and the number of columns, respectively, of the binary matrices in the list eq.mat.

See Also

annotatedSNPset-class, anno.merge, create.anno.mat

Examples

library(GKnowMTest)

snpfile  <- system.file("sampleData", "snpData.rda", package="GKnowMTest")
pathfile <- system.file("sampleData", "pathData.rda", package="GKnowMTest")
anmfile  <- system.file("sampleData", "anmData.rda", package="GKnowMTest")

load(snpfile)  ## loads the SNP data (used below as snpdf)
load(pathfile) ## loads an R list of gene symbols (used below as pathlist)
load(anmfile)  ## loads the annotation matrix

## pathway-based object: SNP rsIDs are the rownames of snpdf
snp <- rownames(snpdf)
ob1 <- anno.create(snp, path.def=pathlist)
ob1

## Not run: 
## creation of annotation matrix
res <- create.anno.mat(snp, base.path="/home/Datasets/Encode/",
                       fl.suffix=".annot.wdist.wcoding")
anm <- res[[1]]

## annotation matrix as input instead of the pathway list
ob2 <- anno.create(snp, anno.mat=anm)
ob2

## End(Not run)

sbstatgen/GKnowMTest documentation built on May 27, 2019, 7:40 a.m.