bitwise.dist | R Documentation |
This function calculates both dissimilarity and Euclidean distances for genlight or snpclone objects.
bitwise.dist(
  x,
  percent = TRUE,
  mat = FALSE,
  missing_match = TRUE,
  scale_missing = FALSE,
  euclidean = FALSE,
  differences_only = FALSE,
  threads = 0L
)
x |
a genlight or snpclone object. |
percent |
|
mat |
|
missing_match |
|
scale_missing |
A logical. If |
euclidean |
|
differences_only |
|
threads |
The maximum number of parallel threads to be used within this function. A value of 0 (default) will attempt to use as many threads as there are available cores/CPUs. In most cases this is ideal. A value of 1 will force the function to run serially, which may increase stability on some systems. Other values may be specified, but should be used with caution. |
The default distance calculated here is quite simple and goes by many names depending on its application. The most familiar name might be the Hamming distance, or the number of differences between two strings.
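As a rough illustration of the idea only (not poppr's bit-level implementation), the Hamming distance between two equal-length genotype vectors can be sketched in base R. The helper name `hamming`, its `as_fraction` argument, and the toy vectors are hypothetical and are not part of poppr.

```r
# Hypothetical helper, for illustration only: count the positions at which two
# equal-length genotype vectors differ, or return that count as a fraction.
hamming <- function(a, b, as_fraction = TRUE) {
  stopifnot(length(a) == length(b))
  n_diff <- sum(a != b)
  if (as_fraction) n_diff / length(a) else n_diff
}

# Two toy haploid samples scored at five bi-allelic SNPs
s1 <- c(0L, 1L, 1L, 0L, 1L)
s2 <- c(0L, 0L, 1L, 1L, 1L)

hamming(s1, s2, as_fraction = FALSE)  # number of differing positions
hamming(s1, s2)                       # fraction of differing positions
```

This sketch assumes no missing data; how missing calls are handled in the real function is governed by its own arguments.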
As of poppr version 2.8.0, this function also calculates Euclidean distance and is considerably faster and more memory-efficient than the standard dist() function.
A dist object containing pairwise distances between samples.
This function is optimized for genlight and snpclone objects. This does not mean that it is a catch-all optimization for SNP data. Three assumptions must be met for this function to work:
SNPs are bi-allelic
Samples are haploid or diploid
All samples have the same ploidy
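The three assumptions above can be checked informally before computing distances. The following is a minimal sketch in base R, assuming the genotypes are available as a samples-by-SNPs matrix of second-allele counts with NA for missing calls. The helper `check_assumptions` and its inputs are hypothetical and are not part of poppr or adegenet.

```r
# Hypothetical helper, for illustration only.
# geno:   samples x SNPs matrix of second-allele counts (NA = missing)
# ploidy: one ploidy value per sample (per row of geno)
check_assumptions <- function(geno, ploidy) {
  stopifnot(is.matrix(geno), length(ploidy) == nrow(geno))
  same_ploidy <- length(unique(ploidy)) == 1L
  hap_or_dip  <- all(ploidy %in% c(1L, 2L))
  # For bi-allelic SNPs, each count lies between 0 and the sample's ploidy.
  # `ploidy` is recycled down the rows, so row i is compared with ploidy[i].
  counts_ok   <- all(geno >= 0 & geno <= ploidy, na.rm = TRUE)
  c(same_ploidy = same_ploidy, hap_or_dip = hap_or_dip, counts_ok = counts_ok)
}

# Two diploid samples scored at three SNPs, with one missing call
geno <- matrix(c(0L, 1L, 2L, NA, 2L, 0L), nrow = 2)
check_assumptions(geno, ploidy = c(2L, 2L))
```

A count outside the 0-to-ploidy range is only a symptom of a violated assumption; the sketch cannot confirm that a marker is truly bi-allelic.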
If the user supplies a genind or genclone object, prevosti.dist() will be used for calculation.
Zhian N. Kamvar, Jonah C. Brooks
diss.dist(), snpclone, genlight, win.ia(), samp.ia()
set.seed(999)
x <- glSim(n.ind = 10, n.snp.nonstruc = 5e2, n.snp.struc = 5e2, ploidy = 2)
x
# Assess fraction of different alleles
system.time(xd <- bitwise.dist(x, threads = 1L))
xd
# Calculate Euclidean distance
system.time(xdt <- bitwise.dist(x, euclidean = TRUE, scale_missing = TRUE, threads = 1L))
xdt
## Not run:
# This function is more efficient in both memory and speed than dist() for
# calculating Euclidean distance on genlight objects. For example, we can
# observe a clear speed increase when we attempt a calculation on 100k SNPs
# with 10% missing data:
set.seed(999)
mat <- matrix(sample(c(0:2, NA),
                     100000 * 50,
                     replace = TRUE,
                     prob = c(0.3, 0.3, 0.3, 0.1)),
              nrow = 50)
glite <- new("genlight", mat, ploidy = 2)
# Default Euclidean distance
system.time(dist(glite))
# Bitwise dist
system.time(bitwise.dist(glite, euclidean = TRUE, scale_missing = TRUE))
## End(Not run)