View source: R/extract_random_seqs_from_multiple_genomes.R
extract_random_seqs_from_multiple_genomes | R Documentation |
In some cases, users may wish to extract sequences from randomly sampled loci of a particular length from a set of genomes. This function lets users specify how many sequences of a given length shall be randomly sampled from the genome. Each locus is sampled independently according to the following rule:
1) Randomly choose (with equal probability; see sample.int for details) the chromosome from which the locus shall be sampled (replace = TRUE).
2) Randomly choose (with equal probability; see sample.int for details) the strand (plus or minus) from which the locus shall be sampled (replace = TRUE).
3) Randomly choose (with equal probability; see sample.int for details) the starting position of the locus on the sampled chromosome and strand (replace = TRUE).
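The three-step sampling rule can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper `sample_random_loci()`, the `chr_lengths` vector, and its values are hypothetical, and the sketch assumes that start positions are restricted so that the whole locus fits on the chromosome.

```r
# Hypothetical chromosome lengths in bp (illustrative values only).
chr_lengths <- c(chr1 = 1000L, chr2 = 500L, chr3 = 2000L)
interval_width <- 100L
sample_size <- 5L

# Hypothetical helper, not part of the documented package.
sample_random_loci <- function(chr_lengths, interval_width, sample_size) {
  # Assumption: every chromosome is at least as long as one locus.
  stopifnot(all(chr_lengths >= interval_width))

  # 1) Chromosome: equal probability, with replacement.
  chr_idx <- sample.int(length(chr_lengths), size = sample_size, replace = TRUE)

  # 2) Strand: equal probability, with replacement.
  strand <- c("plus", "minus")[sample.int(2L, size = sample_size, replace = TRUE)]

  # 3) Start position: drawn independently per locus, so the same start
  #    can occur more than once. The upper bound keeps the locus inside
  #    the chromosome (an assumption of this sketch).
  max_start <- unname(chr_lengths[chr_idx]) - interval_width + 1L
  start <- vapply(max_start, function(m) sample.int(m, size = 1L), integer(1))

  data.frame(
    chr    = names(chr_lengths)[chr_idx],
    strand = strand,
    start  = start,
    end    = start + interval_width - 1L,
    stringsAsFactors = FALSE
  )
}

set.seed(42)  # only to make the illustration reproducible
loci <- sample_random_loci(chr_lengths, interval_width, sample_size)
print(loci)
```

Each row of `loci` describes one sampled locus. Because chromosomes are drawn with equal probability regardless of their length, shorter chromosomes are sampled more densely per base pair than longer ones under this rule.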
extract_random_seqs_from_multiple_genomes(
  sample_size,
  replace = TRUE,
  prob = NULL,
  interval_width,
  subject_genomes,
  file_name = NULL,
  separated_by_genome = FALSE,
  update = TRUE,
  path = NULL
)
sample_size |
a non-negative integer giving the number of loci that shall be sampled. |
replace |
logical value indicating whether sampling should be with replacement. Default: replace = TRUE. |
prob |
a vector of probability weights for obtaining the elements of the vector being sampled. Default is prob = NULL. |
interval_width |
the length of the locus that shall be sampled. |
subject_genomes |
a vector containing file paths to the reference genomes that shall be queried (e.g. file paths returned by |
file_name |
name of the fasta file that stores the BLAST hit sequences. This name will only be used when |
separated_by_genome |
a logical value indicating whether or not hit sequences from different genomes should be stored in the same output |
update |
shall an existing |
path |
a folder path in which corresponding |
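The descriptions of replace and prob match the arguments of the same name in base R's sample.int. The sketch below shows how these two arguments behave in sample.int itself. How this function forwards them internally is an assumption and is not stated on this page; the weights used are illustrative.

```r
set.seed(1)  # only to make the illustration reproducible
n_items <- 4L

# With replacement: more draws than items are allowed, duplicates can occur.
with_repl <- sample.int(n_items, size = 10L, replace = TRUE)

# Without replacement: each item is drawn at most once,
# so size must not exceed n_items.
without_repl <- sample.int(n_items, size = 4L, replace = FALSE)

# Asking for more draws than items without replacement raises an error.
oversized <- try(sample.int(n_items, size = 5L, replace = FALSE), silent = TRUE)

# prob = NULL means equal weights; a weight vector skews the draws.
weights <- c(0.7, 0.1, 0.1, 0.1)  # illustrative weights
weighted <- sample.int(n_items, size = 1000L, replace = TRUE, prob = weights)

# Observed frequencies should be close to the weights.
print(table(weighted) / length(weighted))
```

With replace = TRUE, sample_size may exceed the number of candidates being sampled. With replace = FALSE, sample.int stops with an error as soon as more draws are requested than candidates exist.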
Hajk-Georg Drost