simChr: Generate a simulated chromosome based on a reference sample

View source: R/CNVMetricsSimulations.R

simChr R Documentation

Generate a simulated chromosome based on a reference sample

Description

The function generates a list of simulated segments that represent a simulated chromosome, based on a reference sample specified by the user. Only the positions where a segment is assigned are taken into account, and the total number of segments is preserved. A Dirichlet distribution is used to assign new sizes to the segments in proportion to their initial relative sizes. The resized segments are then shuffled without replacement. Positions are expressed as values between zero and one that represent the relative position on a chromosome from which the positions without a segment have been removed. To obtain meaningful results, the reference sample should have segments covering a good proportion of the chromosome, including NEUTRAL segments.
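The resize-then-shuffle idea described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name simulateSegments() and its concentration parameter are hypothetical, and the Dirichlet draw is obtained by normalising gamma variates rather than by calling a dedicated Dirichlet sampler.

```r
## Hypothetical helper (not part of CNVMetrics): resize segments with a
## Dirichlet draw, shuffle them, and lay them out on the [0, 1] interval.
simulateSegments <- function(widths, states, log2ratios, concentration = 100) {
    stopifnot(length(widths) == length(states),
              length(widths) == length(log2ratios),
              all(widths > 0), concentration > 0)
    n <- length(widths)

    ## Dirichlet draw via normalised gamma variates. The shape parameters
    ## are proportional to the original relative segment sizes; the
    ## 'concentration' scaling is an assumption of this sketch.
    g <- rgamma(n, shape = widths / sum(widths) * concentration)
    stopifnot(sum(g) > 0)
    newSizes <- g / sum(g)

    ## Shuffle the resized segments without replacement
    ord <- sample.int(n)
    sizes <- newSizes[ord]

    ## Consecutive relative positions; gaps between segments are dropped
    end <- cumsum(sizes)
    start <- c(0, end[-n])
    end[n] <- 1   # guard against floating-point drift

    data.frame(start = start, end = end,
               log2ratio = log2ratios[ord], state = states[ord],
               stringsAsFactors = FALSE)
}

## Segment widths of chr2 in the Examples section (end - start + 1)
set.seed(121)
sim <- simulateSegments(widths = c(117121, 205225, 500052),
                        states = c("DELETION", "NEUTRAL", "NEUTRAL"),
                        log2ratios = c(-0.87777, 0, 0))
sim
```

Each call returns the same number of segments as the input, with each state and log2 ratio kept attached to its segment, which mirrors the properties stated in the description.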

Usage

simChr(curSample, chrCur, nbSim)

Arguments

curSample

a GRanges that contains a collection of genomic ranges representing copy number events, including amplified/deleted status, from exactly one sample. The sample must have a metadata column called 'state' with a state, in a character string format, specified for each region (e.g., DELETION, LOH, AMPLIFICATION, NEUTRAL, etc.) and a metadata column called 'CN' that contains the log2 copy number ratios.

chrCur

a character string representing the name of the chromosome that is used as reference for the simulation.

nbSim

a single positive integer corresponding to the number of simulations to generate.

Details

TODO

Value

a list containing one entry per simulation. Each entry is a data.frame containing shuffled segments with 6 columns:

  • ID The name of the simulation.

  • chr The name of the chromosome.

  • start The starting position of the segment; the positions are between zero and one. The segment width represents the proportional size of the segment relative to the total size of all segments.

  • end The ending position of the segment; the positions are between zero and one. The segment width represents the proportional size of the segment relative to the total size of all segments.

  • log2ratio The log2 copy number ratio assigned to the segment.

  • state The state of the region (ex: DELETION, LOH, AMPLIFICATION, NEUTRAL, etc.).
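As a rough illustration of the structure listed above, the following sketch builds one entry by hand and checks it against the documented columns. The values, the "S1" identifier, and the helper checkSimEntry() are all made up for illustration and are not part of CNVMetrics; the column order is assumed to follow the list above.

```r
## Hypothetical validator for one simulated entry (not part of CNVMetrics)
checkSimEntry <- function(entry) {
    expected <- c("ID", "chr", "start", "end", "log2ratio", "state")
    is.data.frame(entry) &&
        identical(colnames(entry), expected) &&
        all(entry$start >= 0 & entry$end <= 1) &&
        all(entry$start <= entry$end)
}

## Made-up entry, for illustration only
entry <- data.frame(ID = rep("S1", 3),
                    chr = rep("chr2", 3),
                    start = c(0, 0.25, 0.85),
                    end = c(0.25, 0.85, 1),
                    log2ratio = c(0, -0.87777, 0),
                    state = c("NEUTRAL", "DELETION", "NEUTRAL"),
                    stringsAsFactors = FALSE)

checkSimEntry(entry)
```

The same check could be applied to every element of a returned list with vapply(), assuming the entries follow the layout described above.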

Author(s)

Astrid Deschênes, Pascal Belleau

Examples


## Load required package to generate the samples
require(GenomicRanges)

## Create one 'demo' genome with 2 chromosomes
## in a GRanges object
## The strand of the regions doesn't affect the calculation of the metric
sample01 <- GRanges(seqnames=c(rep("chr1", 4), rep("chr2", 3)),
    ranges=IRanges(start=c(1905048, 4554832, 31686841, 32686222,
        1, 120331, 725531),
    end=c(2004603, 4577608, 31695808, 32689222, 117121,
        325555, 1225582)),
    strand="*",
    state=c("AMPLIFICATION", "NEUTRAL", "DELETION", "LOH",
        "DELETION", "NEUTRAL", "NEUTRAL"),
    log2ratio=(c(0.5849625, 0, -1, -1, -0.87777, 0, 0)))


## Generates 10 simulated chromosomes (one chromosome per simulated sample)
## based on chromosome 2 from the input sample.
## The shuffled chromosomes have a start and an end between 0 and 1
CNVMetrics:::simChr(curSample=sample01, chrCur="chr2", nbSim=10)

## Generates 4 simulated chromosomes (one chromosome per simulated sample)
## based on chromosome 1 from the input sample.
## The shuffled chromosomes have a start and an end between 0 and 1
CNVMetrics:::simChr(curSample=sample01, chrCur="chr1", nbSim=4)


adeschen/CNVMetrics documentation built on July 19, 2023, 10:24 p.m.