rank2map | R Documentation |
This function estimates positions of ordered single nucleotide polymorphisms (SNPs) that correspond
to a window spanning a user-defined distance in the SNP positions mapped to a reference.
Each window is centered at the SNP mapped position.
Conversion of a SNP rank position metric to a mapped position metric is useful for
kernel smoothing of the diem output state along a genomic sequence.
rank2map(includedSites, ChosenSites = "all", windowSize = 1e+07, nCores = 1)
includedSites | A character path to a file with columns
ChosenSites | A logical vector indicating which sites are to be included in the analysis.
windowSize | A numeric window size for metric conversion in base-pairs.
nCores | A numeric number of cores to be used for parallelisation. Must be
Single nucleotide polymorphisms (SNPs) tend to be spread across a genome randomly.
To facilitate interpretation of the diem output, the marker states should be
assessed on the metric of their position along chromosomes (contigs). The windows for
kernel smoothing might contain a variable number of markers. This function estimates
which markers should be assessed together given their proximity on a chromosome.
Values in includedSites are in essence SNP positions in BED format with a header.
The includedSites file should ideally be generated by vcf2diem to ensure congruence
across all analyses.
The function reads SNP positions from the specified BED-like file and divides the genome into segments based on chromosomes. Each segment is then processed to identify genomic windows encompassing each SNP, considering the specified window size. This process is parallelized to enhance performance, and each SNP is considered within its chromosomal context to ensure accurate window placement.
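The per-chromosome window search described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the diemr implementation: the helper window_indices() and the column names chrom and pos are hypothetical, positions are assumed to be sorted within each chromosome, and the rows of each chromosome are assumed to be contiguous.

```r
# Hypothetical helper (not part of diemr): for sorted positions on ONE chromosome,
# return the first and last marker index within windowSize / 2 of each marker.
window_indices <- function(pos, windowSize) {
  half <- windowSize / 2
  # number of markers strictly left of the window, plus one = first marker inside
  start <- findInterval(pos - half, pos, left.open = TRUE) + 1L
  # number of markers at or before the right window edge = last marker inside
  end <- findInterval(pos + half, pos)
  cbind(start = start, end = end)
}

# Toy data; column names are assumptions made for this sketch
sites <- data.frame(
  chrom = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
  pos   = c(100, 150, 400, 420, 50, 500)
)
windowSize <- 100

# Row indices of each chromosome, kept in order of first appearance
by_chrom <- split(seq_len(nrow(sites)),
                  factor(sites$chrom, levels = unique(sites$chrom)))

# Process each chromosome on its own, then shift local indices to global ranks.
# On Unix-alikes, lapply() could be swapped for parallel::mclapply().
windows <- do.call(rbind, lapply(unname(by_chrom), function(idx) {
  window_indices(sites$pos[idx], windowSize) + idx[1] - 1L
}))
windows
```

Because each chromosome is handled separately, a window never reaches across a chromosome boundary, even when positions on neighbouring chromosomes are numerically close.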
The minimum value of windowSize is 3, but for genomic data the window
size should be at least two orders of magnitude larger. A good approximation of a
useful minimum window size is $(genome size) / ((number of SNPs) / 2)$.
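As a worked example of this rule of thumb, the sketch below uses made-up values; genomeSize and nSNPs are hypothetical and should be replaced with figures for the data set at hand.

```r
# Hypothetical values for illustration only
genomeSize <- 2e9  # genome size in base pairs
nSNPs <- 4e5       # number of SNPs listed in the includedSites file

# (genome size) / ((number of SNPs) / 2), i.e. twice the mean SNP spacing
minWindowSize <- genomeSize / (nSNPs / 2)
minWindowSize
```

With these numbers the suggested minimum is 10,000 bp, so a window would be expected to hold about two markers on average.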
A two-column matrix with the number of rows corresponding to the number of
ChosenSites, indicating start and end indices of adjacent markers that are
within an interval of length windowSize centered on the specific marker.
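The start and end indices can be used to summarise marker values per window. The sketch below is an assumption-laden illustration: the windows matrix and the states vector are invented, the column names are added only for readability, and the unweighted mean stands in for whatever smoothing kernel is actually applied downstream.

```r
# Hypothetical start/end indices for six markers (shape of the returned matrix)
windows <- cbind(start = c(1, 1, 3, 3, 5, 6),
                 end   = c(2, 2, 4, 4, 5, 6))

# Hypothetical marker states, one value per marker
states <- c(0, 2, 1, 1, 2, 0)

# Number of markers falling into each window
markersPerWindow <- windows[, "end"] - windows[, "start"] + 1

# Unweighted mean of the states within each window (a stand-in for a kernel)
windowMean <- vapply(seq_len(nrow(windows)), function(i) {
  mean(states[windows[i, "start"]:windows[i, "end"]])
}, numeric(1))

cbind(windows, markersPerWindow, windowMean)
```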
The unit of parallelization when using nCores > 1 is set per chromosome.
This may differ from the parallelization approach used in diem, where processing
of compartment files is parallelized. Note that while compartment files can correspond
to chromosomes, this is not necessarily the case.
Natalia Martinkova
Filip Jagos 521160@mail.muni.cz
## Not run:
# Run this example in a working directory with write permissions
myo <- system.file("extdata", "myotis.vcf", package = "diemr")
vcf2diem(myo, "myo")
rank2map("myo-includedSites.txt", windowSize = 50)
## End(Not run)