SNPlocs-class: SNPlocs objects

Description Usage Arguments Value Author(s) See Also Examples

Description

The SNPlocs class is a container for storing known SNP locations (of class snp) for a given organism.

SNPlocs objects are usually made in advance by a volunteer and made available to the Bioconductor community as SNPlocs data packages. See ?available.SNPs for how to get the list of SNPlocs and XtraSNPlocs data packages curently available.

The main focus of this man page is on how to extract SNPs from an SNPlocs object.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
snpcount(x)

snpsBySeqname(x, seqnames, ...)
## S4 method for signature 'SNPlocs'
snpsBySeqname(x, seqnames, drop.rs.prefix=FALSE)

snpsByOverlaps(x, ranges, ...)
## S4 method for signature 'SNPlocs'
snpsByOverlaps(x, ranges, drop.rs.prefix=FALSE, ...)

snpsById(x, ids, ...)
## S4 method for signature 'SNPlocs'
snpsById(x, ids, ifnotfound=c("error", "warning", "drop"))

Arguments

x

A SNPlocs object.

seqnames

The names of the sequences for which to get SNPs. Must be a subset of seqlevels(x). NAs and duplicates are not allowed.

...

Additional arguments, for use in specific methods.

Arguments passed to the snpsByOverlaps method for SNPlocs objects thru ... are used internally in the call to subsetByOverlaps(). See ?IRanges::subsetByOverlaps in the IRanges package and ?GenomicRanges::subsetByOverlaps in the GenomicRanges package for more information about the subsetByOverlaps() generic and its method for GenomicRanges objects.

drop.rs.prefix

Should the rs prefix be dropped from the returned RefSNP ids? (RefSNP ids are stored in the RefSNP_id metadata column of the returned object.)

ranges

One or more genomic regions of interest specified as a GRanges or GPos object. A single region of interest can be specified as a character string of the form "ch14:5201-5300".

ids

The RefSNP ids to look up (a.k.a. rs ids). Can be integer or character vector, with or without the "rs" prefix. NAs are not allowed.

ifnotfound

What to do if SNP ids are not found.

Value

snpcount returns a named integer vector containing the number of SNPs for each sequence in the reference genome.

snpsBySeqname, snpsByOverlaps, and snpsById return an unstranded GPos object with 1 element (genomic position) per SNP and the following metadata columns:

Note that this GPos object is unstranded i.e. all the SNPs in it have their strand set to "*".

If ifnotfound="error", the object returned by snpsById is guaranteed to be parallel to ids, that is, the i-th element in the GPos object corresponds to the i-th element in ids.

Author(s)

H. Pag<c3><a8>s

See Also

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
library(SNPlocs.Hsapiens.dbSNP144.GRCh38)
snps <- SNPlocs.Hsapiens.dbSNP144.GRCh38
snpcount(snps)

## ---------------------------------------------------------------------
## snpsBySeqname()
## ---------------------------------------------------------------------

## Get all SNPs located on chromosome 22 or MT:
snpsBySeqname(snps, c("22", "MT"))

## ---------------------------------------------------------------------
## snpsByOverlaps()
## ---------------------------------------------------------------------

## Get all SNPs overlapping some genomic region of interest:
snpsByOverlaps(snps, "22:33.63e6-33.64e6")

## With the regions of interest being all the known CDS for hg38
## located on chromosome 22 or MT (except for the chromosome naming
## convention, hg38 is the same as GRCh38):
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
my_cds <- cds(txdb)
seqlevels(my_cds, pruning.mode="coarse") <- c("chr22", "chrM")
seqlevelsStyle(my_cds)  # UCSC
seqlevelsStyle(snps)    # NCBI
seqlevelsStyle(my_cds) <- seqlevelsStyle(snps)
genome(my_cds) <- genome(snps)
my_snps <- snpsByOverlaps(snps, my_cds)
my_snps
table(my_snps %within% my_cds)

## ---------------------------------------------------------------------
## snpsById()
## ---------------------------------------------------------------------

## Lookup some RefSNP ids:
my_rsids <- c("rs10458597", "rs12565286", "rs7553394")
## Not run: 
  snpsById(snps, my_rsids)  # error, rs7553394 not found

## End(Not run)
## The following example uses more than 2GB of memory, which is more
## than what 32-bit Windows can handle:
is_32bit_windows <- .Platform$OS.type == "windows" &&
                    .Platform$r_arch == "i386"
if (!is_32bit_windows) {
    snpsById(snps, my_rsids, ifnotfound="drop")
}

BSgenome documentation built on May 6, 2019, 2:29 a.m.