Description Usage Arguments Details Value Author(s) References Examples
Marker data are simulated on pedigrees,
conditional on existing genotypes and likelihoods are calculated
using FamiliasConditional
or
paramlinkConditional
.
transferMarkerdata2
transfers markerdata from a linkdat
object to
a linkdat
object or a list thereby
generalising transferMarkerdata
.
1 2 3 4 5 6 7 8 | conditionalLR(Nsim=5, datamatrix, loci, pedigrees, file = NULL , program = "Familias",
prior=NULL, available=NULL, seed=NULL, ref=NULL, truePeds = NULL,
verbose = TRUE, simplify = FALSE)
# Remove persons, e.g., singletons for all hypotheses
removePersons(pedigrees, datamatrix, ids=NULL)
label2num(label, familiasped)
|
Nsim |
Integer. Number of simulations. |
datamatrix |
A data frame. The row names must be the names of the persons you have data for. The columns contain the alleles, two columns for each marker, in the same order used in the loci list. |
loci |
A FamiliasLocus object or a list of such objects. |
pedigrees |
An object of type 'FamiliasPedigree' or 'pedigree', or a list of such objects. |
familiasped |
An object of type 'FamiliasPedigree' or 'pedigree', or a list of such objects. |
file |
Character. First part of name for output file. If NULL nothing is written to files. |
program |
Character. Specifies program used for likelihood calculation. Either 'Familias' (default) or 'paramlink'. |
prior |
Double vector. Not currently relevant as only LRs are reported. |
available |
Character or integer identifying person to be simulated. |
seed |
Integer. |
ref |
Integer. Index of pedigree in numerator of LR. Set to last if |
truePeds |
Indices of pedigrees to be simulated from. If |
simplify |
Logical. |
verbose |
logical. If |
label |
A character vector, 'Familias' ID label. |
ids |
A character or integer vector. |
If truePeds
is a subset of all pedigrees, only the files corresponding to
truePeds
are written. In this case, LR[,,i]
contains missing values if i
is not in truePeds
The number of hypotheses corresponds to the number of pedigrees.
In many cases, there will only be two hypotheses. The output is then simplified
if one specifies
simplify = TRUE
. By default the reference hypothesis is number 2. i.e., ref=2
)
and the likelihood ratio is LR = Pr(data|H1)/Pr(data|H2)
.
The output will then be a matrix with columns. The first column
consists of the simulated LR-s when H1 is assumed true, the second one when H2 is true.
When there are more than two hypothes, an array is returned.
LR[,,i]
are the LR values.
when simulations are conditioned on pedigree i
.
There is one row for each simulation and one column for each pedigree.
In other words LR[k,l,i]
is the k
-th simulated value
of Pr(data|Hl)/Pr(dat|Href)
when hypothesis Hi
is true, the one simulated from.
File(s) with simulated values are written (if variable file
is not NULL
).
If simplify=TRUE
one file is written, otherwise there will be one
one for each pedigree i
.
Thore Egeland <Thore.Egeland@gmail.com>
Kling et al. (2017).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 | # Example
# Computational details for (currently) Example 1 of Kling et al (2017) are provided.
# The purpose is to explain the code and also check results against exact formulae and compare
# to a previous implementation, i.e., library(famr). # There is one marker with alleles 1, 2
# and 3 having frequencies p1, p2, and p3. # We diregard complicating factors like mutation
# (see next example), # and silent alleles. # One person, the grandmother GM, is genotyped as 1/1.
# The grandson GS is to be simulated.
# The dataset 'grandmother' loaded below is a list with three components explained
# in the documentation of FamiliasPosterior, i.e, pedigrees, datamatrix and loci.
# The paramlink function 'Familias2linkdat' converts from 'Familias' format to a format
# suitable for plotting and conditional simulation, likelihood calculations etc,
# using 'paramlink'; below 'plotPedList' is used to plot.
data(grandmother)
pedigrees = grandmother$pedigrees
datamatrix = grandmother$datamatrix
loci = grandmother$loci
persons = rownames(datamatrix)
## Not run:
# Plot with newdev=TRUE, resize plot window and then plot with newed=FALSE
x = Familias2linkdat(pedigrees, datamatrix, loci)
plotPedList(x, newdev=TRUE, frametitles=c("H1", "H2"),
available ="shaded", marker=1, skip.empty.genotypes = TRUE)
## End(Not run)
# In this case there are two (=length(pedigrees))
# hypotheses H1 and H2 or equivalently two pedigrees.
# We will be interested in the likelihood ratio (LR) defined by Pr(data|H1)/Pr(data|H2)
# or rather the random variables LR(H1) and LR(H2) where H1 and H2 indicate the true hypotheses.
# In other words, we estimate the distribution of LR(H1) by simulating assuming H1 to be true
# and similarly for H2. Obviously, we also condition on genotyped individuals, GM in this case.
# Assume first H1 to be true. Then, as explained in the Kling et al. (2017),
# there are three possible values for the likelihood ratios,
# namely y1=1/2, y2=0.5+1/(4*p1), and y3=0.5+0.5/p1
# occuring with the probabilities py1, py2 and py3 calculated below
p = loci[[1]]$alleles
p1 = p[1]; p2=p[2]; p3=p[3]
py2 = (1-p1)*(p1+0.5)
py3 = 0.5*p1*(1+p1)
py1 =1-py2-py3
y1 = 0.5; y2 = 0.5+1/(4*p1); y3 =0.5+0.5/p1; LRs = c(y1, y2, y3)
LR.H1.exact = c(py1,py2,py3)
names(LR.H1.exact) = paste(LRs)
# The above probability distribution, LR.H1.exact, can be approximated by simulation,
# using 'markerSim' followed by likelihood calculation in 'Familias' or 'paramlink'.
Nsim = 5; seed = 17; avail = "GS"
res1 = conditionalLR(Nsim=Nsim, datamatrix, loci, pedigrees, available=avail,
seed=seed, program = "Familias", simplify=TRUE)
res2 = conditionalLR(Nsim=Nsim, datamatrix, loci, pedigrees, available=avail,
seed=seed, program = "paramlink", simplify=TRUE) #Change
LR.H1.Familias = table(res1[,1])/Nsim
LR.H1.paramlink = table(res2[,1])/Nsim
stopifnot(round(LR.H1.Familias-LR.H1.paramlink,12)==0)
# We see that the two implementations give the same result and agree well
# with the theoretical result. The previous implementation also agrees well.
## Not run:
#Try old code
install.packages("http::/familias.name/famr_1.0-zip")
library(famr)
res3 <- conditionalSimulationWrite(nsim = Nsim, datamatrix, persons,
loci, pedigrees, available = 3, seed = 1482659436, ref = 2, file = NULL)
LR.H1.old = table(res3[,1])/Nsim
LR.H2.old = table(res3[,2])/Nsim
# The above code is limited in some respects: It assumes
# that there are two pedigrees and that there
# are no mutations. With ref=2, H2 is the denominator of the LR. Column 1 of the output,
# res3[,1] above is simulated assumed H1 to be true, res[,2] assuming H2 to be true.
## End(Not run)
# Consider next simulation under H2 calculated for H2
LR.H2.Familias = table(res1[,2])/Nsim
LR.H2.paramlink = table(res2[,2])/Nsim
stopifnot(round(LR.H2.Familias-LR.H2.paramlink,12)==0)
# Obviously, the possible values for LR are the same for H1 and H2.
# Note that the largest value of LR occurs with probability py3=p1^2= 1e-04
# This value may not be reached in the simulations. Again results agree well and
# also with the previous implementation if Nsim=1000. The probability distribution of LR
# conditionally on H2 is calculated as
py3 = p1^2
py1 = (1-p1)^2
py2 = 2*p1*(1-p1)
LR.H2.exact = c(py1,py2,py3)
names(LR.H2.exact) = paste(LRs)
# The previous example continues, but we will now
# model mutations and for simplicity assume a SNP marker.
## Not run:
# Let the mutation rate be 0.05
# The mutation rate is chosen (too) high to see some impact.
# in a 'proportional mutation model, i.e.,
p = c(0.2, 0.8); R=0.05
loci = list(FamiliasLocus(p, 1:2, "L1", MutationModel = "Proportional",
MutationRate=R))
x = Familias2linkdat(pedigrees[[1]], datamatrix, loci)
m = marker(x, 1, c(1,1), alleles=1:2, afreq=p)
x = addMarker(x,m)
p.GS = oneMarkerDistribution(x, 3, partialmarker=1, verbose = FALSE)
# For instance
p22.one = p.GS["2/2"]
# is the probability of the grandson being 2/2 when
# the grandmother is 1/1; without mutation this would be
p22.ind = 0.5*p[2]^2
# We next check the exact result by an exact formula
# Egeland, Pinto and Amorim (2017, submitted) and also
# using simulation. Let
H = 1-sum(p^2); k = R/H
p22 = 0.5*p[2]^2*(2-(1-k)^2) # LR = 0.5+0.5*(1-(1-k)^2) with
stopifnot(round(p22.one-p22, 10)==0)
# probability p22. Below we simulate to check
Nsim = 1000; seed=177
res = conditionalLR(Nsim=Nsim, datamatrix, loci, pedigrees, available="GS",
seed=seed, program = "Familias", verbose = FALSE, simplify=TRUE)
LR.H1.mut = table(res[,1])/Nsim
(LR.H1.mut[1]-p22)/p22 #relative difference
## End(Not run)
# Example Missing grandchild example
## Not run: #Takes 3-4 minutes
data(F21)
pedigrees = F21$pedigrees
datamatrix = F21$datamatrix
loci = F21$loci
persons = rownames(datamatrix)
x = Familias2linkdat(pedigrees, datamatrix, loci)
Nsim = 1000
res1 = conditionalLR(Nsim = Nsim, datamatrix, loci, pedigrees, program ="Familias",
available = "Missing Person", seed=17, verbose = FALSE, simplify=TRUE)
LR = data.frame(LR.H1=res1[,1], LR.H2=res1[,2])
length(LR[,1][LR[,1]>10^5])/Nsim
length(LR[,2][LR[,2]==0])/Nsim #PE estimate
res1 = FamiliasConditional(Nsim = Nsim, datamatrix, loci, pedigrees,
available = "Missing Person", seed=17)
res2 = conditionalLR(Nsim = Nsim, datamatrix, loci, pedigrees, program ="Familias",
available = "Missing Person", seed=17, verbose = FALSE)
res = cbind(res1[[1]][,1], res2[,,1][,1])
boxplot(log(res)); title ("log LR(H1), Familias and paramlink (right)")
## End(Not run)
## Not run:
# Example
data(Demo3Markers)
pedigrees = Demo3Markers$pedigrees
datamatrix = Demo3Markers$datamatrix
loci = Demo3Markers$loci
persons = rownames(datamatrix)
Nsim = 5
res.Familias = conditionalLR(Nsim = Nsim, datamatrix, loci, pedigrees,
file = NULL, program = "Familias", truePed = NULL,
available = "Mother", ref=NULL, seed=177, simplify = TRUE)
res.paramlink = conditionalLR(Nsim = Nsim, datamatrix, loci, pedigrees,
file = NULL, program = "paramlink", truePed = NULL,
available = "Mother", ref=NULL, seed=177, simplify=TRUE)
# Always the same LR as the probabilities of the genotypes of the person simulated ("Mother")
# is the same for both hypotheses:
stopifnot(round(res.Familias[,1]-res.paramlink[1,1],12)==0)
stopifnot(round(res.Familias[,2]-res.paramlink[1,2],12)==0)
# Example
data(symmetric)
pedigrees = symmetric$pedigrees
datamatrix = symmetric$datamatrix
loci = symmetric$loci
persons = rownames(datamatrix)
truePeds = 1:3
res.Familias = conditionalLR(Nsim = 5, datamatrix, loci, pedigrees,
file = NULL, program = "Familias", truePeds = truePeds,
available = NULL, ref=2, seed=177, verbose = FALSE)
res.paramlink = conditionalLR(Nsim = 5, datamatrix, loci, pedigrees,
file = NULL, program = "paramlink", truePeds = truePeds, #Change
available = NULL, ref=2, seed=177, verbose = FALSE)
stopifnot(round(res.Familias[,,truePeds]-res.paramlink[,,truePeds],12)==0)
# Example
ped = list(singleton(5,1), nuclearPed(2))
x = markerSim(ped, N=5, alleles=1:5, verbose=FALSE, available=5)
y = nuclearPed(3)
y2 = transferMarkerdata(x, y)
y2
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.