View source: R/CalcParentProbs.R
CalcParentProbs | R Documentation |
For each assigned offspring-parent pair, calculate the probability they are parent-offspring vs otherwise related. Probabilities are scaled to sum to one across all possible* relationships between the pair or trio; see Details.
CalcParentProbs(Pedigree = NULL, GenoM = NULL, quiet = FALSE, nCores = 1, ...)
Pedigree |
dataframe with columns id-dam-sire. By default, any
non-genotyped individuals are 'dummified'; use |
GenoM |
numeric matrix with genotype data: One row per individual,
one column per SNP, coded as 0, 1, 2, missing values as a negative number
or NA. You can reformat data with |
quiet |
logical, suppress messages. No progress is printed when >1 core is used. |
nCores |
number of computer cores to use. If |
... |
Additional arguments passed to |
The returned probabilities are calculated from the likelihoods used
throughout the rest of this package, by scaling them to sum to one across
all possible relationships. For Complex='simp' these are
PO=parent-offspring, FS=full siblings, HS=half siblings, GP=grand-parental,
FA=full avuncular, HA=third degree relatives (incl. half avuncular), and
U=unrelated. For Complex='full' numerous double relationships are also
considered (PO & HS, HS & HA, etc.), making both the numerator and the
denominator in the scaling step more ambiguous, and the returned
probabilities an approximation.
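As a rough illustration of the scaling step, the sketch below converts the likelihoods for a single pair into probabilities that sum to one. It assumes the likelihoods are on a log10 scale. The helper name scale_ll_to_prob and all numbers are made up for illustration; this is not the package's own LLtoProb implementation.

```r
# Hypothetical log10-likelihoods for one pair under Complex='simp'
LL <- c(PO = -120.3, FS = -125.1, HS = -124.0, GP = -124.2,
        FA = -124.5, HA = -127.9, U = -135.6)

# Hypothetical helper: scale log10-likelihoods to probabilities.
# Subtracting the maximum first avoids numerical underflow in 10^LL.
scale_ll_to_prob <- function(LL) {
  lik <- 10^(LL - max(LL, na.rm = TRUE))
  lik / sum(lik, na.rm = TRUE)
}

prob <- scale_ll_to_prob(LL)
round(prob, 3)
```

Only the differences between the likelihoods matter here, so adding a constant to every element of LL leaves the probabilities unchanged.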
The likelihoods are calculated by calling CalcPairLL once or
twice for each id-dam and id-sire pair: once not conditioning on the
co-parent, and once conditional on the co-parent, if any. For genotyped
individuals this is done with focal='PO', and for dummy individuals
with focal='GP'.
For relationships between a genotyped and a dummy individual, it may only be possible to determine that the genotyped individual is a second degree relative (GP, HS, or FA) to the dummy's offspring. This then results in a probability of at most 0.33, even when the two are indeed parent and offspring.
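The upper bound of 0.33 follows directly from the scaling: when three relationships are equally likely and all others are negligible, each receives a third of the total. A minimal sketch with made-up log10-likelihoods (the values are purely illustrative):

```r
# Hypothetical log10-likelihoods: GP, HS and FA cannot be told apart,
# while all other relationships are far less likely
LL <- c(PO = -140, FS = -140, HS = -100, GP = -100, FA = -100,
        HA = -120, U = -130)

lik  <- 10^(LL - max(LL))   # rescale before exponentiating
prob <- lik / sum(lik)
round(prob, 2)
```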
See CalcPairLL and the vignettes for further details.
Note that for large pedigrees this function can be fairly slow, especially
when using CalcPairLL's default Module='ped' and Complex='full'.
Subsetting the genotype data may give different results, as the likelihoods and thus the probabilities depend on the allele frequencies in the sample.
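To see why subsetting matters, the sketch below estimates allele frequencies from a simulated genotype matrix and from a subset of its rows. It assumes the 0/1/2 coding counts copies of one allele, so that the frequency is the column mean divided by two. The helper allele_freq and the simulated data are hypothetical and not part of the package.

```r
set.seed(1)

# Simulated genotype matrix: 50 individuals x 4 SNPs, coded 0/1/2
G <- matrix(rbinom(50 * 4, size = 2, prob = 0.3), nrow = 50)
G[sample(length(G), 10)] <- -9   # missing values coded as a negative number

# Hypothetical helper: per-SNP allele frequency, ignoring missing values
allele_freq <- function(GenoM) {
  GenoM[!is.na(GenoM) & GenoM < 0] <- NA   # treat negative codes as missing
  colMeans(GenoM, na.rm = TRUE) / 2
}

af_all <- allele_freq(G)          # all 50 individuals
af_sub <- allele_freq(G[1:20, ])  # first 20 individuals only
rbind(all = af_all, subset = af_sub)
```

The two rows will generally differ, and any likelihood that depends on these frequencies will differ with them.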
The Pedigree dataframe with the three applicable columns
renamed to id-dam-sire, and 7 additional columns:
Probdam |
Probability that individual in dam column is the maternal parent, rather than otherwise related (LL(PO)/sum(LL)) |
Probsire |
Analogous for sire |
Probpair |
Probability for id-dam-sire trio. Approximated as the minimum of dam conditional on sire and sire conditional on dam. It therefore does not include e.g. both being siblings; such configurations are considered by sequoia during pedigree reconstruction, but cannot currently be accessed directly |
dam_alt , sire_alt |
Most likely alternative (not PO) relationship between id-dam and id-sire, respectively |
Probdam_alt , Probsire_alt |
Probability of most likely alternative relationship |
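The approximation described for Probpair amounts to an element-wise minimum of two conditional probabilities. A minimal sketch, in which the variable names and values are hypothetical and not taken from the package:

```r
# Hypothetical conditional probabilities for three id-dam-sire trios
prob_dam_given_sire <- c(0.99, 0.80, 0.10)
prob_sire_given_dam <- c(0.95, 0.99, 0.97)

# Trio probability approximated as the smaller of the two, per trio
prob_pair <- pmin(prob_dam_given_sire, prob_sire_given_dam)
prob_pair
```

Taking the minimum means a trio scores highly only if both parents are well supported given the other.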
CalcPairLL, LLtoProb
test_ped <- Ped_griffin[21:25,]
# add an incorrect sire to illustrate
test_ped$sire <- as.character(test_ped$sire)
test_ped$sire[5] <- 'i057_2003_M'
Ped_with_probs <- CalcParentProbs(test_ped, Geno_griffin)
print(Ped_with_probs, digits=2)
# Any non-genotyped non-'dummifiable' individuals are automatically skipped
# To get likelihoods for 'all' relationships, not just probabilities for
# PO & (next-)most-likely:
LL_sire_single <- CalcPairLL(
  Pairs = data.frame(id1 = test_ped$id,
                     id2 = test_ped$sire,
                     dropPar1 = 'both',  # drop both -> id2 as single parent
                     focal = 'PO'),
  Pedigree = Ped_griffin,  # pedigree to condition on
  GenoM = Geno_griffin, Plot = FALSE)