Convert from build 36 to build 37 SNP coordinates

Description

Convert range or SNP coordinates between builds using a chain file. Any conversion is possible given a suitable chain file, but by default the hg18 to hg19 (build 36 to 37) chain file built into this package is used. The positions to convert can be entered as chr and pos vectors, or as a RangedData or GRanges object. This function is a wrapper for liftOver() from rtracklayer that gives more control over input and output, and 'defensively' preserves the order and length of the output relative to the input ranges/SNPs.

Usage

conv.36.37(ranges = NULL, chr = NULL, pos = NULL, ..., ids = NULL,
  chain.file = NULL, include.cols = TRUE)

Arguments

ranges

optional GRanges or RangedData object describing the positions to convert. There is no need to enter chr and pos when using ranges.

chr

character, an optional vector of chromosomes to combine with 'pos' to describe positions to convert to an alternative build

pos

integer, an optional vector of chromosome positions (for SNPs). There is no need to enter a ranges object if this is provided along with 'chr'.

...

additional arguments passed to makeGRanges(); for example, 'start' and 'end' can be used to specify ranges instead of 'pos'.

ids

if the ranges have ids (e.g., SNP ids, CNV ids), supplying them here when using chr, pos input gives the output object these ids as rownames. For ranges input the ids should already be the rownames of the GRanges or RangedData object, so this parameter should be unnecessary.

chain.file

character, the location of the liftOver chain file to use for the conversion. If this argument is left NULL, the default UCSC file that converts from hg18 to hg19 is used. A 'Chain' object from rtracklayer, created using import.chain(), can also be supplied. Chain files for other conversions are available from http://crossmap.sourceforge.net/, and you can customize these or create your own. With this argument the function can therefore convert between any pair of input and output builds, not just 36 to 37.

include.cols

logical, whether to include any extra columns (i.e., columns in addition to the positional information) in the output object.
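
The arguments above can be combined in a few ways. The sketch below shows two of them: specifying ranges through 'start' and 'end' instead of 'pos', and supplying an imported 'Chain' object through chain.file. It assumes the package providing conv.36.37() is attached and that rtracklayer is installed. The wrapper name convert_examples, the coordinates, and the chain file path are hypothetical placeholders for illustration only.

```r
# Sketch only: convert_examples() is a hypothetical helper, and all
# coordinates below are arbitrary illustrative values.
convert_examples <- function(chain.path) {
  # 1. Ranges given through '...' as start/end instead of 'pos',
  #    converted with the default (hg18 to hg19) chain file
  by.range <- conv.36.37(chr   = c("2", "10"),
                         start = c(204440000, 6090000),
                         end   = c(204450000, 6150000))

  # 2. A non-default conversion: import the chain file once, then pass
  #    the resulting 'Chain' object through chain.file
  chain <- rtracklayer::import.chain(chain.path)
  by.chain <- conv.36.37(chr = c("2", "10"),
                         pos = c(204445000, 6100000),
                         ids = c("snpA", "snpB"),
                         chain.file = chain)

  list(by.range = by.range, by.chain = by.chain)
}

# Usage (the path is a placeholder for a chain file you have downloaded):
# res <- convert_examples("/path/to/custom.over.chain")
```

Importing the chain once and reusing the object may be convenient when several conversions share the same chain file.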

Value

Returns positions converted from build 36 to 37 (or the equivalent for alternative chain files). If the 'ranges' parameter is used for position input, the object returned will be of the same format. If chr and pos are used for input, the object returned will be a data.frame with columns chr and pos, and with 'ids' as rownames. The output will be the same length as the input, which is not necessarily the case for liftOver(), which performs the core part of this conversion. Vector or GRanges input gives a data.frame or GRanges object, respectively, with rownames in the same order as the original input. RangedData input gives an output sorted in genome order, regardless of the original order. If ranges has no rownames, or if 'ids' is left blank when using chr, pos, ids of the form rngXXXX are generated in order to preserve the original ordering of locations.
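
The idea of preserving order and length through ids can be illustrated in base R. This is a minimal sketch, not the package's implementation: lossy_convert() is a hypothetical stand-in for a converter that drops and reorders positions, and filling unmapped positions with NA is an assumption of the sketch rather than documented behaviour of conv.36.37().

```r
# Hypothetical converter that loses one position and returns the
# remainder in a different order (stands in for a raw conversion step)
lossy_convert <- function(df) {
  kept <- df[df$pos != 200L, , drop = FALSE]      # pretend 200 has no mapping
  kept$pos <- kept$pos + 1000L                    # pretend coordinate shift
  kept[rev(seq_len(nrow(kept))), , drop = FALSE]  # scramble the order
}

# Input with generated ids as rownames, in the style of rngXXXX
input <- data.frame(chr = c("1", "1", "2"),
                    pos = c(100L, 200L, 300L),
                    row.names = c("rng1", "rng2", "rng3"),
                    stringsAsFactors = FALSE)

converted <- lossy_convert(input)

# Re-align the converted rows to the input by id; ids with no converted
# row become NA (an assumption of this sketch)
idx <- match(rownames(input), rownames(converted))
aligned <- data.frame(chr = converted$chr[idx],
                      pos = converted$pos[idx],
                      row.names = rownames(input),
                      stringsAsFactors = FALSE)

print(aligned)
```

Because every input location carries an id, the aligned result has one row per input row in the original order, even though the intermediate result was shorter and reordered.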

Author(s)

Nicholas Cooper nick.cooper@cimr.cam.ac.uk

References

http://crossmap.sourceforge.net/

See Also

conv.37.36, conv.37.38, conv.38.37, convTo37, convTo36

Examples

# various chain files are downloadable from http://crossmap.sourceforge.net/
options(ucsc="hg18")  # use build 36 (hg18) for the position lookups below
gene.labs <- c("CTLA4","IL2RA","HLA-C")
snp.ids <- c("rs3842724","rs9729550","rs1815606","rs114582555","rs1240708","rs6603785")
# chr, pos vector input: returns a data.frame with snp.ids as rownames
pp <- Pos(snp.ids)
cc <- Chr(snp.ids)
conv.36.37(chr=cc, pos=pp, ids=snp.ids)
# GRanges input: returns a GRanges object
pp <- Pos(gene.labs)
gg <- GRanges(ranges=IRanges(start=pp$start, end=pp$end), seqnames=pp$chr)
conv.36.37(gg) # order of output is preserved
# RangedData input: returns a RangedData object
rr <- as(gg, "RangedData")
conv.36.37(rr) # note the result is same as GRanges, but in genome order