View source: R/readsimulator.R

SimulateReadsForSample | R Documentation |

This function takes a vector of *true* relative abundances and NTRs, and then simulates
(i) read counts per gene and (ii) 4sU incorporation and conversion events. Subsequently, it
uses the same approach as implemented in the GRAND-SLAM 2.0 software (Juerges et al., Bioinformatics 2018)
to estimate the NTR from these simulated data.

```
SimulateReadsForSample(
num.reads = 2e+07,
rel.abundance = setNames(rlnorm(10000, meanlog = 4.5, sdlog = 1), paste0("Gene",
1:10000)),
ntr = setNames(rbeta(10000, 1.5, 3), paste0("Gene", 1:10000)),
dispersion = 0.05,
beta.approx = FALSE,
conversion.reads = FALSE,
u.content = 0.25,
u.content.sd = 0.05,
read.length = 75,
p.old = 1e-04,
p.new = 0.04,
p.new.fit = p.new,
seed = NULL
)
```

`num.reads` |
the total amount of reads for simulation |

`rel.abundance` |
named (according to genes) vector of the true relative abundances. Is divided by its sum. |

`ntr` |
vector of true NTRs |

`dispersion` |
vector of dispersion parameters (should best be estimated by DESeq2) |

`beta.approx` |
should the beta approximation of the NTR posterior be computed? |

`conversion.reads` |
also output the number of reads with conversion |

`u.content` |
the relative frequency of uridines in the reads |

`u.content.sd` |
the standard deviation of the u content |

`read.length` |
the read length for simulation |

`p.old` |
the probability for a conversion in reads originating from old RNA |

`p.new` |
the probability for a conversion in reads originating from new RNA |

`p.new.fit` |
the probability for a conversion in reads originating from new RNA that is used for fitting (to simulate bias in the estimation of p.new) |

`seed` |
seed value for the random number generator (set to make it deterministic!) |

The simulation proceeds as follows:

Draw for each gene the number of reads from a negative binomial distribution parametrized with the relative abundances x read number and the dispersion parameter

For each gene: Draw for each read the number of uridines according to a beta binomial distribution for the given read length (the beta prior is parametrized to match the u.content and u.content.sd parameters)

For each read: Draw the number of conversions according to the binomial mixture model of GRAND-SLAM (parametrized with p_old, p_new, the gene specific NTR and the read specific number of uridines)

Estimate the NTR by using the GRAND-SLAM approach

a matrix containing, per column, the simulated counts, the simulated NTRs, (potentially the shape parameters of the beta distribution approximation,) and the true relative frequencies and ntrs

SimulateTimeCourse

```
SimulateReadsForSample(num.reads = 10000,rel.abundance = rep(1,5),ntr=0.9)
SimulateReadsForSample(num.reads = 10000,rel.abundance = rep(1,5),ntr=0.9,seed=1337)
SimulateReadsForSample(num.reads = 10000,rel.abundance = rep(1,5),ntr=0.9,seed=1337)
# the second and third matrix should be equal, the first should be distinct
```

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.