semicontinuousWrapper: Semicontinuous LRs


View source: R/semicontinuousMixtureInterpretation.R

Description

This is an omnibus wrapper for semicontinuous likelihood estimation. It implements the method of Ge, Jianye, Bruce Budowle, and Ranajit Chakraborty, "Comments on 'Interpreting Y chromosome STR haplotype mixture'," Legal Medicine 13.1 (2011): 52-53, as applied to variant graphs (citation coming).

Usage

semicontinuousWrapper(
  genomes,
  genCount,
  rcrs,
  pos0,
  pos1,
  alleles,
  knownHaps = c(),
  nInMix = 2,
  clopperQuantile = 0.95,
  tolerance = 0,
  giveExplainy = FALSE
)

Arguments

genomes

the first data frame from MMDIT::preprocessMitoGenomes

genCount

the second data frame from MMDIT::preprocessMitoGenomes

rcrs

character string; the complete mitochondrial genome sequence

pos0

0-based coordinate of alleles

pos1

1-based coordinate of alleles

alleles

the alleles present in the interval specified

knownHaps

a vector of haplotypes hypothesized to be in the mixture

nInMix

integer; the number of distinct haploid sequences present in the mixture

clopperQuantile

the quantile used for the upper confidence bound of Clopper and Pearson

tolerance

permits fuzzy matching between the haplotypes and the mixture; should be 0, which means no fuzzy matching

giveExplainy

logical; if TRUE, the individuals that explain the mixture are also returned

Details

In short, this creates a variant graph (makeVariantGraph, using pos0, pos1 and alleles) and takes genomes from the database. genomes is stratified by population, while genCount holds every unique haplotype regardless of population. A possibly empty set of known haplotypes (knownHaps) is then appended to the set of unique database-derived haplotypes.

Then every way of explaining the mixture is computed (at the level of every known haplotype). For 2-person mixtures, the procedure is equivalent to taking every pair of haplotypes and computing the fraction of those pairs that explain the mixture. To be conservative, the method of Clopper and Pearson (1934) is used to map the ratio (number that explain / number considered) to a conservative estimate of that ratio.
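As a rough illustration of the pair-counting idea and the Clopper and Pearson adjustment, here is a minimal sketch in base R. It does not use the variant graph or any MMDIT function. The toy haplotype matrix, the `explains` and `clopperUpper` helpers, the use of unordered pairs, and the reading of `clopperQuantile` as a one-sided upper bound are all assumptions made for this example.

```r
# Toy data (assumed for illustration): rows are haplotypes, columns are sites.
haps <- rbind(
  h1 = c("A", "C", "G"),
  h2 = c("A", "T", "G"),
  h3 = c("G", "C", "G"),
  h4 = c("A", "C", "T")
)

# Alleles observed in the mixture at each site (also assumed).
mixture <- list(c("A"), c("C", "T"), c("G"))

# Hypothetical helper: a pair explains the mixture when, at every site,
# the pair's alleles are exactly the alleles observed in the mixture.
explains <- function(i, j) {
  all(vapply(seq_along(mixture), function(s) {
    setequal(unique(c(haps[i, s], haps[j, s])), mixture[[s]])
  }, logical(1)))
}

# Enumerate every unordered pair of haplotypes and count the explaining ones.
pairIdx  <- combn(nrow(haps), 2)
hits     <- apply(pairIdx, 2, function(p) explains(p[1], p[2]))
nExplain <- sum(hits)
nPairs   <- ncol(pairIdx)

# Hypothetical helper: one-sided Clopper-Pearson upper bound for
# x successes out of n trials, via the Beta quantile function.
clopperUpper <- function(x, n, quantile = 0.95) {
  stopifnot(n > 0, x >= 0, x <= n)
  if (x == n) return(1)
  qbeta(quantile, x + 1, n - x)
}

rawFraction  <- nExplain / nPairs
conservative <- clopperUpper(nExplain, nPairs, quantile = 0.95)
```

In this toy setup only the pair (h1, h2) explains the mixture, so the raw fraction is 1/6. The upper bound is larger than the raw fraction, which is the sense in which the estimate is conservative.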

The likelihood is estimated for every population and for every possible subset of knowns. For example, if 1 known haplotype is given, then the likelihood is considered both with that 1 known and with 0 knowns. If 2 knowns are hypothesized, then the LR is computed for both knowns together, for the first known alone, for the second known alone, and for 0 knowns.
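The enumeration of subsets of knowns can be sketched in base R as follows. `knownSubsets` and the haplotype labels are hypothetical names used only for this illustration; how the package itself iterates over subsets is not shown in this page.

```r
# Hypothetical helper: list every subset of the hypothesized known haplotypes,
# from the empty set (0 knowns) up to the full set.
knownSubsets <- function(knownHaps) {
  n <- length(knownHaps)
  unlist(lapply(0:n, function(k) {
    if (k == 0) return(list(knownHaps[0]))   # the "0 knowns" case
    combn(knownHaps, k, simplify = FALSE)    # all subsets of size k
  }), recursive = FALSE)
}

# Two hypothesized knowns (labels are placeholders).
subsets <- knownSubsets(c("knownA", "knownB"))
subsetSizes <- lengths(subsets)
```

With 2 knowns this yields 4 subsets (0 knowns, each known alone, and both together), matching the cases listed above. With 1 known it yields 2.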

The RMNE (random man not excluded) statistic is also computed: the number of haplotypes that explain the mixture divided by the total, adjusted by Clopper and Pearson.
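A toy version of the RMNE calculation might look like the sketch below. It assumes that a single haplotype "explains" (is not excluded from) the mixture when each of its alleles is among those observed in the mixture at that site, and that the adjustment is a one-sided Clopper and Pearson upper bound at the 0.95 quantile. The data and the `notExcluded` helper are invented for this example and are not part of MMDIT.

```r
# Toy data (assumed for illustration): rows are haplotypes, columns are sites.
haps <- rbind(
  h1 = c("A", "C", "G"),
  h2 = c("A", "T", "G"),
  h3 = c("G", "C", "G"),
  h4 = c("A", "C", "T")
)

# Alleles observed in the mixture at each site (also assumed).
mixture <- list(c("A"), c("C", "T"), c("G"))

# Hypothetical helper: TRUE when every allele of haplotype i is present
# in the mixture at the corresponding site.
notExcluded <- function(i) {
  all(vapply(seq_along(mixture), function(s) {
    haps[i, s] %in% mixture[[s]]
  }, logical(1)))
}

included  <- vapply(seq_len(nrow(haps)), notExcluded, logical(1))
nIncluded <- sum(included)
nTotal    <- nrow(haps)

# Raw RMNE and its conservative upper bound (Beta quantile form of
# the Clopper-Pearson interval).
rmneRaw   <- nIncluded / nTotal
rmneUpper <- if (nIncluded == nTotal) 1 else qbeta(0.95, nIncluded + 1, nTotal - nIncluded)
```

Here h1 and h2 are not excluded while h3 and h4 are, so the raw RMNE is 2/4 and the adjusted value is somewhat higher.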


Ahhgust/MMDIT documentation built on Jan. 27, 2021, 11:48 a.m.