Probability of encountering a genotype more than once by chance
gid 
a genind or genclone object. 
pop 
either a formula to set the population factor from the

by_pop 
When this is 
freq 
a vector or matrix of allele frequencies. This defaults to

G 
an integer specifying the number of observed genets. If NULL, this will be the number of original multilocus genotypes. 
method 
which method of calculating psex should be used? Using

... 
options from correcting rare alleles. The default is to correct allele frequencies to 1/n 
Psex is the probability of encountering a given genotype more than once by chance. The basic equation is
psex = 1 - (1 - pgen)^G
where G is the number of multilocus genotypes and pgen is the probability of the genotype (see pgen for its calculation). For a given value of alpha (e.g. alpha = 0.05), genotypes with psex < alpha can be thought of as a single genet, whereas genotypes with psex > alpha do not have strong evidence that their members belong to the same genet (Parks and Werth, 1993).
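As a rough illustration of the basic equation, the sketch below applies it in base R to a few made-up pgen values. The values in pgen_values, the choice of G = 45, and all variable names are assumptions for illustration only; they are not output from poppr.

```r
# Hypothetical genotype probabilities (pgen) for three multilocus genotypes.
# These numbers are invented for illustration.
pgen_values <- c(mlg1 = 1e-6, mlg2 = 1e-3, mlg3 = 0.05)

# Assumed number of multilocus genotypes
G <- 45

# Basic equation: psex = 1 - (1 - pgen)^G
psex_values <- 1 - (1 - pgen_values)^G

# Flag genotypes whose psex falls below the chosen alpha
alpha <- 0.05
below_alpha <- psex_values < alpha

print(psex_values)
print(below_alpha)
```

With these inputs, the two rarest genotypes fall below alpha and the most common one does not, which shows how strongly psex depends on pgen for a fixed G.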
When method = "multiple", the method from Arnaud-Haond et al. (2007) is used, where the sum of the binomial density is taken:
psex = sum(dbinom(1:N, N, pgen))
where N is the number of samples with the same genotype, i is the ith sample (the index running over 1:N), and pgen is the value of pgen for that genotype.
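The binomial sum can be checked directly in base R. The sketch below uses a made-up pgen value and sample count (both are assumptions, not values from any data set) and compares the sum over 1:N with the complement of the probability of zero encounters. The two should agree, because the binomial density over 0:N sums to one.

```r
# Hypothetical inputs, chosen only for illustration
pgen_value <- 0.001  # assumed pgen for one genotype
N <- 10              # assumed number of samples sharing that genotype

# Sum of the binomial density over 1 to N encounters
psex_multiple <- sum(dbinom(1:N, N, pgen_value))

# Equivalent form: one minus the probability of zero encounters
psex_complement <- 1 - dbinom(0, N, pgen_value)

print(c(sum_form = psex_multiple, complement_form = psex_complement))
```

This only illustrates the formula as written above; how psex() evaluates it internally is not shown here.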
The function will automatically calculate the round-robin allele frequencies with rraf and G with nmll.
a vector of Psex for each sample.
The values of Psex represent the value for each multilocus genotype. Additionally, when the argument pop is not NULL, by_pop is automatically TRUE.
Zhian N. Kamvar, Jonah Brooks, Stacy A. Krueger-Hadfield, Erik Sotka
Arnaud-Haond, S., Duarte, C. M., Alberto, F., & Serrão, E. A. 2007. Standardizing methods to address clonality in population studies. Molecular Ecology, 16(24), 5115-5139.
Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 537-544.
pgen, rraf, rrmlg, correcting rare alleles
library(poppr)
data(Pram)
Pram_psex <- psex(Pram, by_pop = FALSE)
plot(Pram_psex, log = "y", col = ifelse(Pram_psex > 0.05, "red", "blue"))
abline(h = 0.05, lty = 2)
## Not run:
# With multiple encounters
Pram_psex <- psex(Pram, by_pop = FALSE, method = "multiple")
plot(Pram_psex, log = "y", col = ifelse(Pram_psex > 0.05, "red", "blue"))
abline(h = 0.05, lty = 2)
# This can also be done assuming population structure
Pram_psex <- psex(Pram, by_pop = TRUE)
plot(Pram_psex, log = "y", col = ifelse(Pram_psex > 0.05, "red", "blue"))
abline(h = 0.05, lty = 2)
# The above, but correcting zero-value alleles by 1/(2*rrmlg) with no
# population structure assumed
# See the documentation for rare_allele_correction for details.
Pram_psex2 <- psex(Pram, by_pop = FALSE, d = "rrmlg", mul = 1/2)
plot(Pram_psex2, log = "y", col = ifelse(Pram_psex2 > 0.05, "red", "blue"))
abline(h = 0.05, lty = 2)
## An example of supplying previously calculated frequencies and G
# From Parks and Werth, 1993, using the first three genotypes.
# The row names indicate the number of samples found with that genotype
x <- "
Hk Lap Mdh2 Pgm1 Pgm2 X6Pgd2
54 12 12 12 23 22 11
36 22 22 11 22 33 11
10 23 22 11 33 13 13"
# Since we aren't representing the whole data set here, we are defining the
# allele frequencies before the analysis.
afreq <- c(Hk.1 = 0.167, Hk.2 = 0.795, Hk.3 = 0.038,
           Lap.1 = 0.190, Lap.2 = 0.798, Lap.3 = 0.012,
           Mdh2.0 = 0.011, Mdh2.1 = 0.967, Mdh2.2 = 0.022,
           Pgm1.2 = 0.279, Pgm1.3 = 0.529, Pgm1.4 = 0.162, Pgm1.5 = 0.029,
           Pgm2.1 = 0.128, Pgm2.2 = 0.385, Pgm2.3 = 0.487,
           X6Pgd2.1 = 0.526, X6Pgd2.2 = 0.051, X6Pgd2.3 = 0.423)
xtab <- read.table(text = x, header = TRUE, row.names = 1)
# Here we are expanding the number of samples to their observed values.
# Since we have already defined the allele frequencies, this step is actually
# not necessary.
all_samples <- rep(rownames(xtab), as.integer(rownames(xtab)))
xgid <- df2genind(xtab[all_samples, ], ncode = 1)
freqs <- afreq[colnames(tab(xgid))] # only used alleles in the sample
pSex <- psex(xgid, by_pop = FALSE, freq = freqs, G = 45)
# Note, pgen returns log values for each locus, here we take the sum across
# all loci and take the exponent to give us the value of pgen for each sample
pGen <- exp(rowSums(pgen(xgid, by_pop = FALSE, freq = freqs)))
res <- matrix(c(unique(pGen), unique(pSex)), ncol = 2)
colnames(res) <- c("Pgen", "Psex")
res <- cbind(xtab, nRamet = rownames(xtab), round(res, 5))
rownames(res) <- 1:3
res # Compare to the first three rows of Table 2 in Parks & Werth, 1993
## End(Not run)
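The last example sums per-locus log values and exponentiates the result to obtain pgen for each sample. The base R sketch below illustrates that step on a small made-up matrix of per-locus log probabilities. The matrix log_pgen, its values, and its row and column names are all hypothetical and are not produced by pgen().

```r
# Hypothetical per-locus genotype probabilities for two samples at three loci,
# stored on the log scale (invented values for illustration)
log_pgen <- matrix(
  log(c(0.5, 0.2, 0.1,
        0.4, 0.3, 0.25)),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("sample1", "sample2"), c("locA", "locB", "locC"))
)

# Sum the logs across loci, then exponentiate
pgen_total <- exp(rowSums(log_pgen))

# Same quantity computed as a direct product of the per-locus probabilities
pgen_product <- apply(exp(log_pgen), 1, prod)

print(pgen_total)
print(pgen_product)
```

Working on the log scale avoids numerical underflow when many small per-locus probabilities are multiplied together.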
