opt81.md

\index{Null alleles} This sub-option allows estimation of gene frequencies when a null allele is present. Three methods are available: maximum likelihood, maximum likelihood with genotyping failure, and Brookfield’s (1996) estimator. The differences between them are explained in Section \@ref(null-alleles).[^22]

Genepop takes the allele with the highest number for a given locus across all populations as the null allele.[^23] For example, if you have 4 alleles plus a null allele, a null homozygote individual should be indicated as e.g. 0505 or 9999 in the input file.
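The coding convention above can be checked before running the analysis. The following is a minimal sketch, not part of Genepop: the function name `highest_allele_code` and the list-of-strings input are hypothetical, and it assumes genotypes are written as two concatenated allele codes of equal width (e.g. `0505` in the two-digit format).

```python
def highest_allele_code(genotypes, digits=2):
    """Return the largest allele code found in a list of genotype strings.

    Hypothetical helper: `genotypes` holds the genotypes of ONE locus,
    pooled over all populations, each written as two allele codes of
    `digits` characters (e.g. "0105" for alleles 1 and 5).
    """
    highest = 0
    for genotype in genotypes:
        first = int(genotype[:digits])
        second = int(genotype[digits:2 * digits])
        highest = max(highest, first, second)
    return highest


# Four visible alleles (1-4) plus a null allele coded as 5.
locus = ["0101", "0102", "0304", "0505", "0203"]
null_code = highest_allele_code(locus)
print("Allele treated as null under the convention:", null_code)
```

If the value printed is one of the visible alleles, the null homozygotes have probably not been coded with the highest allele number.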

The default estimation method is maximum likelihood, using the EM algorithm of @DempsterLR77. Apparent null genotypes may also be due to nonspecific genotyping failures. Joint maximum likelihood estimation of the rate of such failures (“$\beta$”) and of allele frequencies is available through the setting `NullAlleleMethod=ApparentNulls`. Finally, the estimator of @Brookfield96 is available through the setting `NullAlleleMethod=B96`.\index{NullAlleleMethod setting} Confidence intervals for null allele frequencies are computed for each locus in each population. Their coverage probability can be modified through the setting `CIcoverage`, as in options 6.5 and 6.6.\index{CIcoverage setting}
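To illustrate the idea behind the default method, here is a minimal EM sketch for a single locus in a single population. It is not Genepop's implementation: the function name `em_null_allele` and its input layout are hypothetical, it assumes Hardy-Weinberg proportions, it treats every non-amplifying individual as a null homozygote, and it does not model the genotyping failure rate $\beta$.

```python
def em_null_allele(hom, het, n_null, tol=1e-10, max_iter=10000):
    """EM estimates of visible and null allele frequencies (illustrative).

    hom    : list, hom[i] = number of apparent homozygotes for allele i
    het    : dict, het[(i, j)] = number of visible heterozygotes i/j
    n_null : number of individuals with no visible allele
    Returns (p, r, true_hom, null_het) where p are visible allele
    frequencies, r is the null allele frequency, and the last two lists
    split each apparent homozygote class into expected true homozygotes
    and expected heterozygotes carrying the null allele.
    """
    k = len(hom)
    n_total = sum(hom) + sum(het.values()) + n_null

    # Copies of each visible allele carried by visible heterozygotes.
    het_copies = [0.0] * k
    for (i, j), count in het.items():
        het_copies[i] += count
        het_copies[j] += count

    # Start from equal frequencies for all k + 1 alleles.
    p = [1.0 / (k + 1)] * k
    r = 1.0 / (k + 1)

    for _ in range(max_iter):
        # E-step: an apparent homozygote i is a true homozygote with
        # probability p_i^2 / (p_i^2 + 2 p_i r) = p_i / (p_i + 2 r).
        true_hom = [h * pi / (pi + 2.0 * r) if h > 0 else 0.0
                    for h, pi in zip(hom, p)]
        null_het = [h - t for h, t in zip(hom, true_hom)]

        # M-step: gene counting with the completed data.
        new_p = [(2.0 * t + nh + hc) / (2.0 * n_total)
                 for t, nh, hc in zip(true_hom, null_het, het_copies)]
        new_r = (sum(null_het) + 2.0 * n_null) / (2.0 * n_total)

        change = max([abs(new_r - r)] + [abs(a - b) for a, b in zip(new_p, p)])
        p, r = new_p, new_r
        if change < tol:
            break

    return p, r, true_hom, null_het


# Made-up counts: 45 and 21 apparent homozygotes, 30 heterozygotes,
# 4 individuals without any visible allele.
p, r, true_hom, null_het = em_null_allele([45, 21], {(0, 1): 30}, 4)
print("visible allele frequencies:", [round(x, 4) for x in p])
print("null allele frequency:", round(r, 4))
```

The counts in the usage example were constructed to match Hardy-Weinberg expectations exactly for frequencies 0.5, 0.3 and a null frequency of 0.2 in 100 individuals, so the estimates should settle near those values.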

The output is saved in the file yourdata.NUL. This file may contain:

* For the maximum likelihood methods, estimated allelic frequencies and predicted numbers of homozygotes and of heterozygotes with a null allele. For example, in an output such as

    ```
     Allele   EM freq.  Homoz.    Null Heter.
      1      0.2762    2.7046     4.2954
      2      0.2576    1.8500     3.1500
      3      0.2251    1.3567     2.6433
      4      0.0217    0.0000     0.0000
     Null    0.2193
    ```

    of the seven (2.7046+4.2954) apparent homozygotes for allele 1, it is predicted that 4.2954 are actually heterozygotes for allele 1 and for the null allele. This predicted value is the expected, or average, number of such heterozygotes over different samples with the same number of apparent genotypes, under the assumptions of the model.
* A summary locus-by-population table of estimates of null allele frequencies.
* A summary locus-by-population table of estimates of genotyping failure frequencies (“beta”), if applicable.
* A table of bootstrap confidence intervals for estimates of null allele frequencies.
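The split of apparent homozygotes in the example output can be reproduced by hand. The sketch below assumes Hardy-Weinberg proportions, under which an apparent homozygote for an allele with frequency $p$ is a true homozygote with probability $p^2/(p^2+2pr) = p/(p+2r)$, where $r$ is the null allele frequency. The function name is hypothetical, and the apparent homozygote counts (7, 5 and 4) are inferred from the row sums of the table rather than read from an input file.

```python
def split_apparent_homozygotes(n_apparent, p_allele, p_null):
    """Expected true homozygotes and null heterozygotes (illustrative).

    Assumes Hardy-Weinberg proportions: the share of true homozygotes
    among apparent homozygotes is p / (p + 2 r).
    """
    share_true = p_allele / (p_allele + 2.0 * p_null)
    true_hom = n_apparent * share_true
    return true_hom, n_apparent - true_hom


p_null = 0.2193  # "Null" row of the example output
rows = [(1, 0.2762, 7), (2, 0.2576, 5), (3, 0.2251, 4)]
for allele, freq, n_apparent in rows:
    true_hom, null_het = split_apparent_homozygotes(n_apparent, freq, p_null)
    print(f"allele {allele}: homoz. {true_hom:.4f}, null heter. {null_het:.4f}")
```

Because the frequencies in the table are rounded to four decimals, the recomputed values agree with the `Homoz.` and `Null Heter.` columns only to about three decimal places.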

Note that there may be insufficient information to compute estimates and/or confidence intervals, for example when there are not enough alleles in the sample. Such cases are indicated by the message No information. Sometimes the point estimate can formally be computed but the computed CI is not meaningful. This happens, for example, in case of heterozygote excess, and generates a (No info for CI) warning: if all pseudo-samples generated by a resampling technique show a heterozygote excess, all pseudo-estimates of the null allele frequency are zero, and this distribution provides no information from which to construct a CI of non-zero width.

The confidence intervals for null allele frequencies are obtained by a bootstrap method, and are not suitable for testing for the presence of null alleles, because the null hypothesis is at the boundary of the parameter space [@Andrews00]. Instead, the exact score test for Hardy-Weinberg proportions can be used.