randomizeRegions: Randomize Regions

View source: R/randomizeRegions.R

randomizeRegions    R Documentation

Randomize Regions

Description

Given a set of regions A and a genome, this function returns a new set of regions randomly distributed in the genome.

Usage

randomizeRegions(A, genome="hg19", mask=NULL, allow.overlaps=TRUE, per.chromosome=FALSE, ...)

Arguments

A

The set of regions to randomize. A region set in any of the formats accepted by toGRanges (GenomicRanges, data.frame, etc.).

genome

The reference genome to use. A valid genome object: either a GenomicRanges or data.frame containing one region per whole chromosome, or a character string uniquely identifying a genome in BSgenome (e.g. "hg19" or "mm10", but not "hg"). Internally it uses getGenomeAndMask.

mask

The set of regions where a random region cannot be placed (centromeres, repetitive regions, unmappable regions...). A region set in any of the formats accepted by toGRanges (GenomicRanges, data.frame, ...). If NULL, the function will try to derive a mask from the genome (currently this only works if the genome is a character string). If NA, an empty mask is used explicitly.

allow.overlaps

A boolean stating whether the random regions can overlap (TRUE) or not (FALSE).

per.chromosome

Boolean. If TRUE, the regions will be created in a per-chromosome manner: every region in A will be moved to a random position on the same chromosome where it was originally.

...

Further arguments to be passed to or from methods.
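The mask settings described above can be sketched as follows. This is only an illustration: the region coordinates are made up, and it assumes the regioneR package is installed. The overlap check uses countOverlaps from GenomicRanges, which is loaded together with regioneR.

```r
library(regioneR)

# Illustrative inputs (made-up coordinates): chr, start, end
A <- data.frame(chr = "chr1", start = c(1, 10, 20, 30), end = c(12, 13, 28, 40))
genome <- data.frame(chr = c("chr1", "chr2"), start = c(1, 1),
                     end = c(180000000, 20000000))
mask <- data.frame(chr = "chr1", start = c(20000000, 100000000),
                   end = c(22000000, 130000000))

# Explicit mask: random regions should avoid the masked intervals
rr.masked <- randomizeRegions(A, genome = genome, mask = mask)

# mask = NA: explicitly empty mask, so the whole custom genome is available
rr.unmasked <- randomizeRegions(A, genome = genome, mask = NA)

# Count how many masked intervals each random region overlaps
countOverlaps(rr.masked, toGRanges(mask))
```

Leaving mask = NULL is mainly useful when genome is a character string such as "hg19", since that is the case in which a mask can be derived automatically.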

Details

The new set of regions will be created with the same sizes as the original ones, and optionally placed on the same chromosomes.

In addition, they can be made explicitly non-overlapping, and a mask can be provided so that no regions fall in an undesirable part of the genome.

Value

It returns a GenomicRanges object with the regions resulting from the randomization process.
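As a minimal sketch of how the returned object could be inspected, the following checks the properties described in Details on a small made-up genome. The coordinates and the seed are illustrative assumptions, and the checks rely on standard GenomicRanges accessors (width, seqnames, isDisjoint) rather than on anything specific to randomizeRegions.

```r
library(regioneR)

set.seed(42)  # only to make this illustration repeatable

# Illustrative regions on two chromosomes (widths 100, 1000 and 50)
A <- toGRanges(data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100, 5000, 200),
  end   = c(199, 5999, 249)
))

# Illustrative custom genome: one region per whole chromosome
genome <- data.frame(
  chr   = c("chr1", "chr2"),
  start = c(1, 1),
  end   = c(1000000, 500000)
)

rr <- randomizeRegions(A, genome = genome, mask = NA,
                       per.chromosome = TRUE, allow.overlaps = FALSE)

is(rr, "GRanges")                    # class of the returned object
sort(width(rr))                      # sizes, to compare with sort(width(A))
isDisjoint(rr)                       # TRUE when no two regions overlap
table(as.character(seqnames(rr)))    # number of regions per chromosome
```

The widths are compared after sorting because the order of the returned regions is not assumed to match the order of A.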

Note

randomizeRegions assumes that chromosomes start at base 1. If a chromosome starts at another base number, for example at base 1000, random regions might appear in the [1:1000] interval. This should not affect most uses of randomizeRegions, but might be important in some advanced analyses involving artificially constructed genomes.

See Also

toDataframe, toGRanges, getGenome, getMask, getGenomeAndMask, characterToBSGenome, maskFromBSGenome, resampleRegions, createRandomRegions, circularRandomizeRegions

Examples

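library(regioneR)

# Regions to randomize, a mask and a custom genome (columns: chr, start, end)
A <- data.frame("chr1", c(1, 10, 20, 30), c(12, 13, 28, 40))
mask <- data.frame("chr1", c(20000000, 100000000), c(22000000, 130000000))
genome <- data.frame(c("chr1", "chr2"), c(1, 1), c(180000000, 20000000))

# Default genome ("hg19"); requires the corresponding BSgenome package
randomizeRegions(A)

# Custom genome and mask, same chromosome as the original, no overlaps
randomizeRegions(A, genome=genome, mask=mask, per.chromosome=TRUE, allow.overlaps=FALSE)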


bernatgel/regioneR documentation built on Sept. 10, 2023, 12:03 a.m.