Convert a forensic microsatellite allele name into its length in base pairs
The conventions used to name forensic microsatellite (STR) alleles are described in Bar et al. (1994). For instance, the name "9.3" means that there are 9 complete repeats of the base oligomer followed by an incomplete repeat of 3 bp.
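The arithmetic behind this naming convention can be sketched in a few lines of R. The helper allele_to_bp and its repeat_bp parameter are hypothetical names chosen for illustration; this is not the package's implementation, and it assumes a well-formed name with a dot separator.

```r
# Hypothetical helper for illustration only (not the package's code).
allele_to_bp <- function(name, repeat_bp = 4) {
  # Split "9.3" into c("9", "3"); a plain "9" stays as c("9")
  parts <- strsplit(as.character(name), ".", fixed = TRUE)[[1]]
  whole <- as.numeric(parts[1])
  # Bases in the incomplete repeat, or 0 when there is none
  partial <- if (length(parts) > 1) as.numeric(parts[2]) else 0
  whole * repeat_bp + partial
}

allele_to_bp("9.3")   # 9 complete tetranucleotide repeats plus 3 bp
allele_to_bp("9", 5)  # 9 complete pentanucleotide repeats
```

Splitting with fixed = TRUE treats the dot as a literal character rather than a regular expression wildcard.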
The name of the allele, coerced to a string type.
The length in bp of the microsatellite base repeat. Most loci are tetranucleotide repeats, so this defaults to 4. Do not forget to change it to 5 for loci based on pentanucleotides, such as Penta D or Penta E.
The convention is to use a dot, as in "9.3", between the number of repeats
and the number of bases in the incomplete repeat. In locales where the
decimal separator is a comma, this can cause problems; try
"," instead for this argument, which is forwarded to
Warnings generated by faulty numeric conversions are suppressed here.
A single numeric value corresponding to the size in bp of the allele, or NA when characters flagging off-ladder alleles are encountered or when numeric conversion is impossible (e.g. with the "X" or "Y" allele names at the Amelogenin locus).
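The NA behaviour described above can be illustrated with a small sketch. The helper safe_allele_to_bp is hypothetical and is not the package's implementation; it assumes that "<" and ">" are the characters flagging off-ladder alleles, as in the names "<8" and ">19".

```r
# Hypothetical helper for illustration only (not the package's code).
# Assumption: "<" and ">" are the characters that flag off-ladder alleles.
safe_allele_to_bp <- function(name, repeat_bp = 4) {
  name <- as.character(name)
  # Off-ladder names such as "<8" or ">19" have no defined length
  if (grepl("[<>]", name)) return(NA_real_)
  parts <- strsplit(name, ".", fixed = TRUE)[[1]]
  # as.numeric("X") gives NA with a warning; the warning is silenced here
  nums <- suppressWarnings(as.numeric(parts))
  if (length(nums) == 0 || anyNA(nums)) return(NA_real_)
  partial <- if (length(nums) > 1) nums[2] else 0
  nums[1] * repeat_bp + partial
}

safe_allele_to_bp("<8")   # off-ladder name
safe_allele_to_bp("X")    # non-numeric name
safe_allele_to_bp("9.3")  # ordinary microvariant
```

Returning NA rather than stopping with an error lets the function be applied over a whole vector of allele names with sapply, with the unusable names simply dropped later through na.rm = TRUE.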
Bar, W. and Brinkmann, B. and Lincoln, P. and Mayr, W.R. and Rossi, U. (1994) DNA recommendations. 1994 report concerning further recommendations of the DNA Commission of the ISFH regarding PCR-based polymorphisms in STR (short tandem repeat) systems. Int. J. Leg. Med., 107:159-160.
See identifiler for examples of forensic microsatellite allele names.
#
# Quality check and examples:
#
stopifnot( al2bp("9") == 36 )       # 9 repeats of a tetranucleotide is 36 bp
stopifnot( al2bp(9) == 36 )         # also OK with numerical argument
stopifnot( al2bp(9, 5) == 45 )      # 9 repeats of a pentanucleotide is 45 bp
stopifnot( al2bp("9.3") == 39 )     # microvariant case
stopifnot( is.na(al2bp("<8")) )     # off ladder case
stopifnot( is.na(al2bp(">19")) )    # off ladder case
stopifnot( is.na(al2bp("X")) )      # non STR case
#
# Application to the alleles names in the identifiler data set where all loci are
# tetranucleotide repeats:
#
data(identifiler)
al.names <- unlist(identifiler)
al.length <- sapply(al.names, al2bp)
loc.names <- unlist(lapply(identifiler, names))
loc.nall <- unlist(lapply(identifiler, function(x) lapply(x, length)))
loc.fac <- factor(rep(loc.names, loc.nall))
par(lend = "butt", mar = c(5,6,4,1)+0.1)
boxplot(al.length~loc.fac, las = 1, col = "lightblue",
  horizontal = TRUE, main = "Range of allele lengths at forensic loci",
  xlab = "Length (bp)", ylim = c(0, max(al.length, na.rm = TRUE)))