Simulate a GRanges or RangedData object

Description

For testing purposes, this function generates an S4 ranged object based on the human genome. By default, ranges are drawn from chromosomes with the probability of selecting each chromosome proportional to its length relative to the whole genome. Positions allocated within each chromosome never exceed the length of that chromosome. You can simulate SNPs (i.e., start = end), but the default is random ranges. You can change the UCSC build on which the chromosome lengths are based, and you can specify whether chromosomes are labelled chr1, chr2, ... or 1, 2, ...

Usage

rranges(n = 10, SNP = FALSE, chr.range = 1:26, chr.pref = FALSE,
  order = TRUE, equal.prob = FALSE, GRanges = TRUE, build = NULL,
  fakeids = FALSE)

Arguments

n

integer, number of rows to simulate

SNP

logical, whether to simulate SNPs (width 1, when SNP=TRUE) or general ranges (when SNP=FALSE)

chr.range

integer vector of values from 1 to 26, to specify which chromosomes to include in the simulated object. 23-26 are X,Y,XY,MT respectively.

chr.pref

logical, if TRUE chromosomes will be coded as chr1,chr2,..., versus 1,2,.. when chr.pref=FALSE
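
The chromosome coding described for chr.range and chr.pref can be sketched in base R. This is only an illustration: chr_labels() is a hypothetical helper, not a function of the package, and the labels it produces for codes 25 and 26 with the prefix (chrXY, chrMT) are an assumption about how the prefix is applied.

```r
# Hypothetical helper (not part of the documented package): translate the
# integer codes accepted by chr.range into chromosome labels.
chr_labels <- function(chr.range = 1:26, chr.pref = FALSE) {
  stopifnot(all(chr.range %in% 1:26))
  # Codes 1-22 are the autosomes; 23-26 are X, Y, XY and MT respectively.
  labels <- c(as.character(1:22), "X", "Y", "XY", "MT")[chr.range]
  # Assumption: the "chr" prefix is simply pasted in front of every label.
  if (chr.pref) paste0("chr", labels) else labels
}

chr_labels(c(1, 22:26))
chr_labels(c(1, 23), chr.pref = TRUE)
```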

order

logical, if TRUE the object returned will be in genomic order, otherwise the order will be randomized

equal.prob

logical, when FALSE (default), random positions will be selected on chromosomes chosen randomly in proportion to their length (i.e., assuming every point on the genome has equal probability of being chosen). If equal.prob=TRUE, then chromosomes will be selected with equal probability, so you could expect just as many MT (mitochondrial) entries as Chr1 entries.
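
The difference between the two sampling schemes can be illustrated with a minimal base R sketch. It is not the package's implementation: simulate_positions() is a hypothetical helper, and the chromosome lengths are made-up round numbers rather than real hg18 or hg19 values.

```r
# Illustrative chromosome lengths in base pairs (made-up round numbers,
# NOT the real hg18/hg19 lengths).
chr_len <- c("1" = 250e6, "2" = 240e6, "21" = 48e6, "MT" = 16e3)

# Hypothetical helper: pick a chromosome for each of n points, then a
# position within that chromosome.
simulate_positions <- function(n, chr_len, equal.prob = FALSE) {
  k <- length(chr_len)
  # Length-weighted by default; uniform over chromosomes if equal.prob = TRUE.
  prob <- if (equal.prob) rep(1 / k, k) else chr_len / sum(chr_len)
  chr <- sample(names(chr_len), size = n, replace = TRUE, prob = prob)
  # Draw each position uniformly from 1..length of its chromosome, so it
  # can never exceed the chromosome's length.
  pos <- ceiling(runif(n) * unname(chr_len[chr]))
  data.frame(chr = chr, pos = pos, stringsAsFactors = FALSE)
}

set.seed(1)
weighted <- simulate_positions(2000, chr_len)
uniform  <- simulate_positions(2000, chr_len, equal.prob = TRUE)
print(table(weighted$chr) / nrow(weighted)) # MT is almost never drawn
print(table(uniform$chr) / nrow(uniform))   # each chromosome near 0.25
```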

GRanges

logical, if TRUE the returned object will be in GRanges format; if FALSE, in RangedData format

build

character, to specify the UCSC version to use, which has a small effect on the chromosome lengths. Use either "hg18" or "hg19". Will also accept a build number, e.g., 36 or 37.
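
One way to accept both naming styles is to map them onto a single label, as in the sketch below. normalize_build() is a hypothetical helper written for illustration; it relies only on the fact that hg18 corresponds to build 36 and hg19 to build 37. How the function itself treats build=NULL is not stated here, so the sketch simply passes NULL through.

```r
# Hypothetical helper (not part of the documented package): map a UCSC
# label or a build number onto "hg18" / "hg19".
normalize_build <- function(build = NULL) {
  if (is.null(build)) return(NULL) # default handling left to the caller
  stopifnot(length(build) == 1)
  key <- tolower(as.character(build))
  lookup <- c("hg18" = "hg18", "36" = "hg18",
              "hg19" = "hg19", "37" = "hg19")
  if (!key %in% names(lookup)) {
    stop("build must be \"hg18\", \"hg19\", 36 or 37")
  }
  unname(lookup[key])
}

normalize_build(36)
normalize_build("hg19")
```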

fakeids

logical, whether to add rownames with random IDs (TRUE) or leave rownames blank (FALSE). If SNP=TRUE, then ids will be fake rs-ids.
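
As a rough idea of what random IDs might look like, the sketch below generates unique labels in base R. fake_ids() is a hypothetical helper; the numeric range and the "id" prefix used for non-SNP ranges are assumptions, and the exact format produced by fakeids=TRUE may differ.

```r
# Hypothetical helper (not part of the documented package): build n unique
# random identifiers, using an "rs" prefix when simulating SNPs.
fake_ids <- function(n, SNP = FALSE) {
  # Sampling without replacement guarantees unique numeric suffixes.
  num <- sample.int(1e7, n)
  # sprintf("%d") avoids scientific notation such as "rs1e+05".
  if (SNP) sprintf("rs%d", num) else sprintf("id%d", num)
}

set.seed(1)
fake_ids(5, SNP = TRUE)
```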

Value

Returns a ranged object (GRanges or RangedData) containing 'n' simulated genomic ranges, such as SNPs or CNVs, across the chromosomes in 'chr.range', with chromosome lengths taken from the UCSC 'build'.

Examples

rranges()
rr <- rranges(SNP = TRUE, chr.pref = TRUE, fakeids = TRUE)
width(rr) # note all have width 1
rr
tt <- table(chrm(rranges(1000)))
print(tt / sum(tt)) # sampling frequencies, proportional to chromosome length
tt <- table(chrm(rranges(1000, equal.prob = TRUE)))
print(tt / sum(tt)) # sampling frequencies, roughly equal across chromosomes
