R/DNAmixtures-package.R
In DNAmixturesLite: Statistical Inference for Mixed Traces of DNA (Lite-Version)

##' @importFrom grDevices dev.interactive devAskNewPage
##' @importFrom graphics abline axis box clip curve lines mtext par plot.new points
##' @importFrom methods is
##' @importFrom stats dbinom dgamma pgamma ppoints predict qnorm rgamma simulate
##' @importFrom utils data head tail
##' @importFrom Rsolnp solnp
##' @importFrom Matrix bdiag
##' @importFrom numDeriv hessian
##' @importFrom gRaven add.edge add.node get.belief get.normalization.constant get.table hugin.domain initialize.domain map.configurations retract set.finding set.table propagate.gRaven
##' @importFrom gRbase propagate
NULL


##' Statistical Inference for Mixed Samples of DNA (Lite-Version)
##'
##' @description Tools for statistical inference for one or multiple DNA mixtures.
##'
##' \emph{IMPORTANT: This is the \pkg{DNAmixturesLite} package, which is intended as a service to enable users to try \pkg{DNAmixtures} without purchasing a commercial licence for Hugin. When at all possible, we strongly recommend the use of \pkg{DNAmixtures} rather than this lite-version. See \url{https://dnamixtures.r-forge.r-project.org/} for details on both packages.}
##'
##'  \emph{While the lite-version seeks to provide the full functionality of \pkg{DNAmixtures}, note that computations are much less efficient and that there are some differences in available functionality. Be aware that the present documentation is copied from \pkg{DNAmixtures} and thus may not accurately describe the implementation of this lite-version.}
##'
##' @details The package implements a statistical model for analysis
##' of one or more mixed samples of DNA in the possible presence of
##' dropout and stutter. Details of the model can be found in Cowell
##' et al. (2015), and details on the model checking tools and
##' Bayesian network structure can be found in Graversen and Lauritzen
##' (2014).
##'
##' Any hypothesis involving unknown contributors relies on
##' computations in a Bayesian network. For performing such
##' computations, the \pkg{DNAmixtures} package relies on Hugin
##' (\url{https://www.hugin.com}) through the \R-package \pkg{RHugin}. For an
##' installation guide, see the package webpage
##' \url{https://dnamixtures.r-forge.r-project.org}.
##'
##' Although \pkg{DNAmixtures} can be installed with only the free
##' version of Hugin, the size of the networks will in practice
##' require the full licence. In theory, the implementation allows
##' analysis with an arbitrary number of unknown
##' contributors. However, in practice, depending on hardware and
##' time constraints, working with up to 5 or 6 unknown contributors
##' seems realistic.
##'
##' @section Summary of the statistical model:
##'
##' The statistical model jointly models the observed peak heights and
##' the set of contributors to the DNA sample(s). In the event of
##' analysing multiple DNA mixtures, the union of the contributors is
##' used as the contributor set for each mixture. By allowing a
##' contribution of zero, we cover the case of a contributor not
##' having contributed to a particular mixture.
##'
##' Genotypes for unknown contributors are modelled using
##' allele-frequencies from a database specified by the user. The
##' database is also used to define the range of alleles at each
##' marker. A genotype for an unknown contributor \eqn{i} is
##' represented by a vector of allele counts \eqn{n_{ia}}, giving for
##' each allele \eqn{a} the number of copies of that allele that the
##' contributor possesses; in the network for a marker, the allele
##' count \eqn{n_{ia}} is represented by a variable \code{n_i_a}. The
##' vector of allele counts follows a multinomial distribution with
##' \eqn{\sum_a n_{ia} = 2} and the specified allele frequencies. It
##' is assumed that genotypes are independent across markers and
##' between contributors. If desired, the database of allele
##' frequencies may be corrected for F_st or given a sampling
##' adjustment before use.
##'
##' Peak heights are assumed mutually independent and their
##' distributions for a fixed set of DNA profiles are modelled using
##' gamma distributions.  The peak height for allele \eqn{a} in EPG \eqn{r}
##' is assumed to follow a gamma distribution with scale parameter
##' \eqn{\eta_r} and shape parameter
##'
##' \deqn{\rho_r \sum_i ((1-\xi_{ra})n_{ia} + \xi_{r,a+1} n_{i,a+1})\phi_{ri}.}
##'
##' A detection threshold \eqn{C_r\ge 0} is applied, and any peak
##' height falling below the threshold is considered to be 0.  The peak
##' heights are denoted by \code{height1, \ldots, heightR}.
##'
##' For each DNA mixture, the model parameters are
##' \describe{
##' \item{\eqn{\phi}}{The proportions of DNA from each contributor.}
##' \item{\eqn{\rho}}{Amplification parameter, which will be larger for larger amounts of DNA amplified.}
##' \item{\eqn{\eta}}{Scale parameter for the gamma distribution.}
##' \item{\eqn{\xi}}{Mean stutter percentage. Allele \eqn{a} uses stutter parameter \eqn{\xi_a = \xi} if the allele \eqn{a-1} is included in the model, and \eqn{\xi_a = 0} otherwise.}
##' }
##'
##' An alternative parametrisation uses \eqn{\mu = \rho \eta} and
##' \eqn{\sigma = 1/\sqrt{\rho}}, which can be interpreted as the mean
##' peak height and the coefficient of variation respectively. Besides
##' being interpretable, an advantage of this reparametrisation is
##' that the parameters are fairly orthogonal.
##'
##' The model parameters are assumed to be the same across
##' markers.  Relaxations of this assumption are not implemented
##' here.
##'
##'
##' @section Computation by auxiliary variables:
##'
##' The computational approach of the implementation of this package
##' is discussed in Graversen and Lauritzen (2014).
##'
##' The Bayesian networks include three types of auxiliary variables
##' \code{O}, \code{D}, and \code{Q}; these can be thought of as
##' representing the observed peak heights, the absence/presence of
##' peaks, and the peak height distribution function. Note that if
##' invalid tables are set -- for instance if very extreme parameter
##' values are used, or if the vector of mixture proportions is
##' mis-labeled -- then any subsequent propagation will fail. No
##' roll-back functionality has so far been implemented to fix this,
##' and the easiest solution is to re-fit the mixture model.
##'
##' The workhorses of this package are the functions
##' \code{\link{setCPT.O}}, \code{\link{setCPT.D}} and
##' \code{\link{setCPT.Q}} for setting the conditional probability
##' tables for the three types of auxiliary variables according to
##' specified peak heights and model parameters.
##'
##' @section Amelogenin: As an experiment, it is possible to add the
##' marker Amelogenin, provided that the marker is named "AMEL" and
##' that the coding of alleles X and Y is of a particular form. One
##' example of a suitable form is the coding X = 0 and Y = 1. The
##' allele frequencies used should then also contain a marker "AMEL",
##' and here frequencies have a slightly different interpretation than
##' for the rest of the markers; as all people possess one X, the
##' frequencies of X and Y denote the presence of an additional X or Y
##' respectively, and thus the frequencies correspond directly to the
##' proportions of the two genders.
##'
##' @name DNAmixturesLite-package
##' @aliases DNAmixturesLite DNAmixtures
##' @docType package
##'
##' @author Therese Graversen \email{theg@@itu.dk}
##'
##' @references Details on the implemented model may be found in
##'
##' Cowell, R. G., Graversen, T., Lauritzen, S., and Mortera, J. (2015).
##' \emph{Analysis of Forensic DNA Mixtures with Artefacts}. With supplementary material documenting the analyses using \pkg{DNAmixtures}.
##' Journal of the Royal Statistical Society: Series C (Applied Statistics).
##' Volume 64, Issue 1, pages 1-48.
##'
##' Graversen, T. (2014).
##' \emph{Statistical and Computational Methodology for the Analysis of Forensic DNA Mixtures with Artefacts}.
##' DPhil. University of Oxford.
##' \url{https://ora.ox.ac.uk/objects/uuid:4c3bfc88-25e7-4c5b-968f-10a35f5b82b0}.
##'
##' Graversen, T. and Lauritzen, S. (2014).
##' \emph{Computational aspects of {DNA} mixture analysis}.
##' Statistics and Computing, DOI: 10.1007/s11222-014-9451-7.
##'
##' @example inst/examples/main.R
##'
NULL
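
## Illustrative sketch only; not part of the original package source.
## It restates, for one allele in one EPG, the peak height model that is
## summarised in the package documentation above, using only base R and
## stats::pgamma. The function name `peak_model_sketch` and all argument
## names are hypothetical. It also assumes that alleles a-1 and a+1 are
## both in the model, so that a single stutter parameter `xi` applies to
## both terms. The package itself carries out these computations on the
## Bayesian networks via setCPT.O, setCPT.D and setCPT.Q.
peak_model_sketch <- function(n_a, n_a1, phi, rho, eta, xi, threshold) {
  ## n_a, n_a1: allele counts of alleles a and a+1, one entry per contributor
  ## phi:       proportions of DNA from each contributor
  stopifnot(length(n_a) == length(phi), length(n_a1) == length(phi))
  ## Shape parameter: rho * sum_i ((1 - xi) n_ia + xi n_{i,a+1}) phi_i
  shape <- rho * sum(((1 - xi) * n_a + xi * n_a1) * phi)
  ## Probability that the peak falls below the detection threshold and is
  ## therefore recorded as 0; with no contributing alleles this is 1.
  p_below <- if (shape > 0) {
    stats::pgamma(threshold, shape = shape, scale = eta)
  } else {
    1
  }
  list(shape = shape,
       mean_height = shape * eta,   # mean of a gamma(shape, scale = eta)
       p_below_threshold = p_below,
       mu = rho * eta,              # alternative parametrisation
       sigma = 1 / sqrt(rho))
}
## Possible usage, with made-up values for two contributors:
## peak_model_sketch(n_a = c(1, 0), n_a1 = c(0, 2), phi = c(0.7, 0.3),
##                   rho = 30, eta = 30, xi = 0.1, threshold = 50)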

DNAmixturesLite documentation built on March 31, 2023, 8:21 p.m.
