do_baits: Generates random baits


View source: R/do_baits.R

Description

Generates random baits from a database of sequences. The approach was inspired by this StackOverflow thread: https://stackoverflow.com/questions/49149839/simulate-random-positions-from-a-list-of-intervals
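
The idea discussed in that thread, drawing random positions from a list of intervals, can be sketched in base R as follows. This is only an illustration and not the code used by do_baits: the helper sample_bait_starts, the intervals data frame, and its start/end column names are all hypothetical. The sketch assumes inclusive, 1-based coordinates and only places baits that fit entirely inside an interval.

```r
# Hypothetical helper, NOT part of supeRbaits: draw n bait start positions
# uniformly from all positions where a bait of `size` fits inside an interval.
sample_bait_starts <- function(intervals, n, size) {
  # Number of valid start positions per interval (0 if the bait does not fit)
  valid <- pmax(intervals$end - intervals$start - size + 2, 0)
  if (sum(valid) == 0) stop("No interval can hold a bait of this size")
  # Choose an interval for each bait, weighted by its number of valid starts
  idx <- sample.int(nrow(intervals), n, replace = TRUE, prob = valid)
  # Choose a uniform offset within the chosen interval
  offset <- floor(runif(n) * valid[idx])
  start <- intervals$start[idx] + offset
  data.frame(interval = idx, start = start, end = start + size - 1)
}

# Example intervals; the third one is too short for a 120 bp bait
intervals <- data.frame(start = c(1, 500, 2000), end = c(200, 650, 2050))
set.seed(42)
baits <- sample_bait_starts(intervals, n = 10, size = 120)
```

Weighting the interval choice by the number of valid start positions makes every valid position equally likely, so short intervals are not oversampled.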

Usage

do_baits(
  n,
  n.per.seq,
  size,
  database,
  lengths,
  exclusions = NULL,
  regions = NULL,
  regions.prop = 0,
  regions.tiling = 1,
  targets = NULL,
  targets.prop = 0,
  targets.tiling = 1,
  seed = NULL,
  restrict,
  gc = c(0, 1),
  min.per.seq = 1,
  verbose = FALSE,
  force = FALSE
)

Arguments

n

Number of baits to generate (distributed across the various sequences).

n.per.seq

Number of baits to generate per sequence. Ignored if n is set.

size

The size of each bait.

database

A database of sequences.

lengths

Optional: the lengths of the sequences, compiled through load_lengths. If missing, lengths will be compiled from the database on-the-fly.

exclusions

A file containing regions to exclude.

regions

A file containing regions of interest.

regions.prop

The proportion of baits that should overlap the regions of interest.

regions.tiling

The minimum number of baits to distribute per region of interest.

targets

A file containing the base-pair positions to target.

targets.prop

The proportion of baits that should overlap the targets.

targets.tiling

The minimum number of baits to distribute per target.

seed

A number used to seed the randomization process, for reproducibility.

restrict

A vector of chromosome names or position numbers to which the analysis should be restricted.

gc

A vector of two values between 0 and 1, specifying the minimum and maximum GC content (as a proportion) allowed in the output baits.

min.per.seq

Minimum number of baits per sequence. Defaults to 1.

verbose

Logical: Should detailed bait processing messages be displayed per sequence?

force

Logical: Proceed even if the number of baits requested is very large?
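
As a rough illustration of what the seed and gc arguments control, the base R sketch below shows (1) that fixing a seed makes random draws repeatable and (2) how a GC range given as two proportions can filter candidate sequences. It does not call do_baits and does not reflect its internals: the helper gc_content, the candidates vector, and the assumption that the range bounds are inclusive are all made up for this example.

```r
# (1) Reproducibility: the same seed yields the same random draws
set.seed(1)
first_draw <- sample.int(1000, 5)
set.seed(1)
second_draw <- sample.int(1000, 5)
identical(first_draw, second_draw)

# (2) GC filtering with a range expressed as proportions between 0 and 1
# Hypothetical helper, NOT part of supeRbaits
gc_content <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

candidates <- c("ATGCGC", "AAAATT", "GGGCCC", "ATATGC")
gc_range <- c(0.3, 0.7)  # same shape as the gc argument: c(min, max)
gc_values <- vapply(candidates, gc_content, numeric(1))
# Keep candidates whose GC proportion lies within the range (bounds assumed inclusive)
kept <- candidates[gc_values >= gc_range[1] & gc_values <= gc_range[2]]
```

With the default gc = c(0, 1), a filter of this kind would keep every candidate, since any GC proportion lies between 0 and 1.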

Value

A data frame of baits.


BelenJM/supeRbaits documentation built on Jan. 28, 2022, 1:44 a.m.