XtraSNPlocs-class: XtraSNPlocs objects

Description Usage Arguments Value Author(s) See Also Examples

Description

The XtraSNPlocs class is a container for storing extra SNP locations and alleles for a given organism. While a SNPlocs object can store only molecular variations of class snp, an XtraSNPlocs object contains molecular variations of other classes (in-del, heterozygous, microsatellite, named-locus, no-variation, mixed, multinucleotide-polymorphism).

XtraSNPlocs objects are usually made in advance by a volunteer and made available to the Bioconductor community as XtraSNPlocs data packages. See ?available.SNPs for how to get the list of SNPlocs and XtraSNPlocs data packages curently available.

The main focus of this man page is on how to extract SNPs from an XtraSNPlocs object.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
## S4 method for signature 'XtraSNPlocs'
snpcount(x)

## S4 method for signature 'XtraSNPlocs'
snpsBySeqname(x, seqnames,
        columns=c("seqnames", "start", "end", "strand", "RefSNP_id"),
        drop.rs.prefix=FALSE, as.DataFrame=FALSE)

## S4 method for signature 'XtraSNPlocs'
snpsByOverlaps(x, ranges,
        columns=c("seqnames", "start", "end", "strand", "RefSNP_id"),
        drop.rs.prefix=FALSE, as.DataFrame=FALSE, ...)

## S4 method for signature 'XtraSNPlocs'
snpsById(x, ids,
        columns=c("seqnames", "start", "end", "strand", "RefSNP_id"),
        ifnotfound=c("error", "warning", "drop"), as.DataFrame=FALSE)

## S4 method for signature 'XtraSNPlocs'
colnames(x, do.NULL=TRUE, prefix="col")

Arguments

x

An XtraSNPlocs object.

seqnames

The names of the sequences for which to get SNPs. NAs and duplicates are not allowed. The supplied seqnames must be a subset of seqlevels(x).

columns

The names of the columns to return. Valid column names are: seqnames, start, end, width, strand, RefSNP_id, alleles, snpClass, loctype. See Details section below for a description of these columns.

drop.rs.prefix

Should the rs prefix be dropped from the returned RefSNP ids? (RefSNP ids are stored in the RefSNP_id metadata column of the returned object.)

as.DataFrame

Should the result be returned in a DataFrame instead of a GRanges object?

ranges

One or more regions of interest specified as a GRanges object. A single region of interest can be specified as a character string of the form "ch14:5201-5300".

...

Additional arguments, for use in specific methods.

Arguments passed to the snpsByOverlaps method for XtraSNPlocs objects thru ... are used internally in the call to subsetByOverlaps(). See ?IRanges::subsetByOverlaps in the IRanges package and ?GenomicRanges::subsetByOverlaps in the GenomicRanges package for more information about the subsetByOverlaps() generic and its method for GenomicRanges objects.

ids

The RefSNP ids to look up (a.k.a. rs ids). Can be integer or character vector, with or without the "rs" prefix. NAs are not allowed.

ifnotfound

What to do if SNP ids are not found.

do.NULL, prefix

These arguments are ignored.

Value

snpcount returns a named integer vector containing the number of SNPs for each chromosome in the reference genome.

snpsBySeqname and snpsById both return a GRanges object with 1 element per SNP, unless as.DataFrame is set to TRUE in which case they return a DataFrame with 1 row per SNP. When a GRanges object is returned, the columns requested via the columns argument are stored as metada columns of the object, except for the following columns: seqnames, start, end, width, and strand. These "spatial columns" (in the sense that they describe the genomic locations of the SNPs) can be accessed by calling the corresponding getter on the GRanges object.

Summary of available columns (my_snps being the returned object):

colnames(x) returns the names of the available columns.

Author(s)

H. Pag<c3><a8>s

See Also

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
library(XtraSNPlocs.Hsapiens.dbSNP144.GRCh38)
snps <- XtraSNPlocs.Hsapiens.dbSNP144.GRCh38
snpcount(snps)
colnames(snps)

## ---------------------------------------------------------------------
## snpsBySeqname()
## ---------------------------------------------------------------------

## Get the location, RefSNP id, and alleles for all "extra SNPs"
## located on chromosome 22 or MT:
snpsBySeqname(snps, c("ch22", "chMT"), columns=c("RefSNP_id", "alleles"))

## ---------------------------------------------------------------------
## snpsByOverlaps()
## ---------------------------------------------------------------------

## Get the location, RefSNP id, and alleles for all "extra SNPs"
## overlapping some regions of interest:
snpsByOverlaps(snps, "ch22:33.63e6-33.64e6",
               columns=c("RefSNP_id", "alleles"))

## With the regions of interest being all the known CDS for hg38
## (except for the chromosome naming convention, hg38 is the same
## as GRCh38):
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
hg38_cds <- cds(txdb)
seqlevelsStyle(hg38_cds)  # UCSC
seqlevelsStyle(snps)      # dbSNP
seqlevelsStyle(hg38_cds) <- seqlevelsStyle(snps)
genome(hg38_cds) <- genome(snps)
snpsByOverlaps(snps, hg38_cds, columns=c("RefSNP_id", "alleles"))

## ---------------------------------------------------------------------
## snpsById()
## ---------------------------------------------------------------------

## Get the location and alleles for some RefSNP ids:
my_rsids <- c("rs367617508", "rs398104919", "rs3831697", "rs372470289",
              "rs141568169", "rs34628976", "rs67551854")
snpsById(snps, my_rsids, c("RefSNP_id", "alleles"))

## See ?XtraSNPlocs.Hsapiens.dbSNP144.GRCh38 for more examples of using
## snpsBySeqname() and snpsById().

BSgenome documentation built on Nov. 8, 2020, 7:48 p.m.