Functions for calculating concordance between variant sets and deciding whether two samples have identical genomes.
The two tally
Character vector of paths to files representing tally
A matrix of concordance fractions between sample pairs, as returned by
The concordance fraction above which edges are generated between samples when forming the graph.
Arguments to pass to the loading function, e.g.,
calculateVariantConcordance calculates the fraction of
concordant variants between two samples. Concordance is defined as
having the same position and alt allele.
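The definition above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names (variant_keys, concordance_fraction), the data frame columns (chrom, pos, alt), and the choice of the union of both variant sets as the denominator are all assumptions made for illustration, since the text does not state which denominator is used.

```r
# Hypothetical helper: build one key per variant from position and alt allele.
variant_keys <- function(variants) {
  unique(paste(variants$chrom, variants$pos, variants$alt, sep = ":"))
}

# Hypothetical helper: fraction of variants shared by two samples.
# Assumption: the denominator is the union of both variant sets.
concordance_fraction <- function(a, b) {
  keys_a <- variant_keys(a)
  keys_b <- variant_keys(b)
  n_union <- length(union(keys_a, keys_b))
  if (n_union == 0L) {
    return(NA_real_)  # no variants in either sample
  }
  length(intersect(keys_a, keys_b)) / n_union
}

# Toy data: the samples agree at two sites and differ in alt allele at one.
s1 <- data.frame(chrom = c("chr1", "chr1", "chr2"),
                 pos = c(100L, 200L, 50L),
                 alt = c("A", "T", "G"),
                 stringsAsFactors = FALSE)
s2 <- data.frame(chrom = c("chr1", "chr1", "chr2"),
                 pos = c(100L, 200L, 50L),
                 alt = c("A", "C", "G"),
                 stringsAsFactors = FALSE)

fraction <- concordance_fraction(s1, s2)
```

A variant at the same position with a different alt allele counts as two distinct variants here, so it lowers the fraction twice under the union denominator.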
The calculateConcordanceMatrix function generates a numeric
matrix with the concordance for each pair of samples. It accepts paths
to serialized objects so that the variant calls for all samples are not
loaded into memory at once. VCF files are not yet supported, though they
probably should be eventually.
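The memory-saving idea of working from file paths can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the names key_set and concordance_matrix are hypothetical, the files are assumed to be RDS files holding data frames with chrom, pos, and alt columns, and the union of both variant sets is assumed as the denominator.

```r
# Hypothetical helper: load one serialized sample and reduce it to variant keys.
key_set <- function(path) {
  v <- readRDS(path)
  unique(paste(v$chrom, v$pos, v$alt, sep = ":"))
}

# Hypothetical helper: pairwise concordance, holding at most two samples
# in memory at any time.
concordance_matrix <- function(paths) {
  n <- length(paths)
  m <- matrix(NA_real_, n, n, dimnames = list(names(paths), names(paths)))
  for (i in seq_len(n)) {
    keys_i <- key_set(paths[[i]])
    for (j in seq(i, n)) {
      keys_j <- if (i == j) keys_i else key_set(paths[[j]])
      n_union <- length(union(keys_i, keys_j))
      frac <- if (n_union == 0L) {
        NA_real_
      } else {
        length(intersect(keys_i, keys_j)) / n_union
      }
      m[i, j] <- frac
      m[j, i] <- frac  # the matrix is symmetric
    }
  }
  m
}

# Toy data written to temporary files; s1 and s2 are identical, s3 differs.
make_sample <- function(alts) {
  data.frame(chrom = "chr1", pos = c(100L, 200L, 300L), alt = alts,
             stringsAsFactors = FALSE)
}
samples <- list(s1 = make_sample(c("A", "T", "G")),
                s2 = make_sample(c("A", "T", "G")),
                s3 = make_sample(c("C", "C", "C")))
paths <- vapply(samples, function(s) {
  f <- tempfile(fileext = ".rds")
  saveRDS(s, f)
  f
}, character(1))

m <- concordance_matrix(paths)
```

The trade-off in this sketch is repeated disk reads: each file is loaded once per pair instead of once overall, in exchange for a bounded memory footprint.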
The callVariantConcordance function assigns a
concordant/non-concordant/undecidable status to each sample (all samples
are assumed to originate from the same individual), given the output of
calculateConcordanceMatrix. The status is decided as follows. A
graph is formed from the concordance matrix, using the concordance
threshold to generate the edges. If there are multiple cliques in the
graph that each have more than one sample, every sample is declared
undecidable. Otherwise, the samples in the clique with more than one
sample, if any, are marked as concordant, and the others (in singleton
cliques) are marked as non-concordant.
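The decision rule described above can be sketched in base R. The sketch makes several assumptions that the text does not confirm: "cliques" is read as maximal cliques, which are enumerated with a plain Bron-Kerbosch recursion (the package may well rely on a graph library instead); an edge requires a concordance strictly above the threshold; and the status labels and the names bron_kerbosch and call_status are hypothetical.

```r
# Enumerate maximal cliques of an undirected graph given as a logical
# adjacency matrix with a FALSE diagonal (basic Bron-Kerbosch, no pivoting).
bron_kerbosch <- function(adj, r = integer(0), p = seq_len(nrow(adj)),
                          x = integer(0)) {
  if (length(p) == 0L && length(x) == 0L) {
    return(list(r))  # r cannot be extended, so it is maximal
  }
  out <- list()
  for (v in p) {
    nb <- which(adj[v, ])
    out <- c(out, bron_kerbosch(adj, c(r, v), intersect(p, nb),
                                intersect(x, nb)))
    p <- setdiff(p, v)  # v has been fully explored
    x <- c(x, v)
  }
  out
}

# Hypothetical helper: assign a status to each sample from a concordance matrix.
call_status <- function(m, threshold) {
  adj <- m > threshold          # assumption: strictly above the threshold
  adj[is.na(adj)] <- FALSE
  diag(adj) <- FALSE
  cliques <- bron_kerbosch(adj)
  big <- Filter(function(cl) length(cl) > 1L, cliques)
  status <- rep("non-concordant", nrow(m))
  names(status) <- rownames(m)
  if (length(big) > 1L) {
    status[] <- "undecidable"   # several multi-sample cliques
  } else if (length(big) == 1L) {
    status[big[[1]]] <- "concordant"
  }
  status
}

# Toy matrix: a, b, and c agree with each other; d agrees with no one.
m <- matrix(0.1, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
m[1:3, 1:3] <- 0.95
diag(m) <- 1
status <- call_status(m, threshold = 0.8)

# Toy matrix with two separate pairs: a-b and c-d.
m2 <- matrix(0.1, 4, 4, dimnames = list(letters[1:4], letters[1:4]))
m2[1:2, 1:2] <- 0.95
m2[3:4, 3:4] <- 0.95
diag(m2) <- 1
status2 <- call_status(m2, threshold = 0.8)
```

Under the maximal-clique reading, a chain such as a-b and b-c without an a-c edge yields two multi-sample cliques, so every sample would be undecidable; a connected-components reading would instead call all three concordant.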
Fraction of concordant variants for
numeric matrix of concordances for
or a character vector of status codes, named by sample, for
Cory Barr (code), Michael Lawrence (inferred documentation)