al2bp: To Convert a forensic microsatellite allele name into its...


al2bp R Documentation

To Convert a forensic microsatellite allele name into its length in base pairs


Description

Conventions used to name forensic microsatellite (STR) alleles are described in Bar et al. (1994). For instance, the name "9.3" means that there are 9 complete repeats of the base oligomer followed by an incomplete repeat of 3 bp.
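The arithmetic behind this convention can be sketched in base R. The helper below is hypothetical and is not the seqinr implementation; it assumes a dot separator and a tetranucleotide repeat by default, and it skips the off-ladder and error handling that al2bp provides.

```r
# Hypothetical helper for illustration only (not the seqinr source):
# length = complete repeats * repeat length + bases in the incomplete repeat.
allele_length_sketch <- function(name, repeat_bp = 4) {
  parts <- strsplit(as.character(name), split = "\\.")[[1]]
  complete <- as.numeric(parts[1])
  # A name without a dot has no incomplete repeat.
  partial <- if (length(parts) > 1) as.numeric(parts[2]) else 0
  complete * repeat_bp + partial
}

allele_length_sketch("9.3")               # 9 * 4 + 3
allele_length_sketch("9", repeat_bp = 5)  # 9 * 5
```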


Usage

al2bp(allele.name, repeat.bp = 4, offLadderChars = "><", split = "\\.")


Arguments

allele.name: The name of the allele, coerced to a character string.


repeat.bp: The length in bp of the microsatellite base repeat. Most loci are tetranucleotide repeats, hence the default of 4. Remember to set this to 5 for loci based on pentanucleotides, such as Penta D or Penta E.


offLadderChars: NA is returned when at least one of these characters is found in the allele name. Off-ladder alleles are typically reported as "<8" or ">19".


split: The convention is to use a dot, as in "9.3", between the number of repeats and the number of bases in the incomplete repeat. In locales where the decimal separator is a comma this can cause problems; in that case try "," for this argument, which is forwarded to strsplit.
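The off-ladder check and the separator handling can both be illustrated with base R alone. This is a sketch under assumptions, not the seqinr source: the variable names are hypothetical, and building a regular-expression character class is only one plausible way to detect the off-ladder characters.

```r
# Illustrative sketch only; variable names are hypothetical and this is
# not taken from the seqinr source code.
alleles <- c("<8", "9.3", ">19", "12")

# Build a regular-expression character class from the off-ladder characters.
off_ladder_chars <- "><"
is_off_ladder <- grepl(paste0("[", off_ladder_chars, "]"), alleles)

# Split a microvariant name on a literal dot (escaped, because "." is a
# regular-expression metacharacter) or on a comma.
dot_parts   <- strsplit("9.3", split = "\\.")[[1]]
comma_parts <- strsplit("9,3", split = ",")[[1]]
```

The dot is escaped in the default value of split because strsplit treats its split argument as a regular expression.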


Details

Warnings generated by failed numeric conversions are suppressed.
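As a minimal base R illustration of what such a suppressed conversion looks like (an assumption about the mechanism, not a copy of the function body), a non-numeric name such as "X" converts to NA, and wrapping the call in suppressWarnings keeps the usual coercion warning from being shown:

```r
# "X" cannot be converted to a number: as.numeric() yields NA and would
# normally emit a coercion warning, which suppressWarnings() silences.
converted <- suppressWarnings(as.numeric(c("9", "X")))
is.na(converted)
```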


Value

A single numeric value corresponding to the size in bp of the allele, or NA when characters flagging off-ladder alleles are encountered or when numeric conversion is impossible (e.g. with the "X" or "Y" allele names at the Amelogenin locus).


Author(s)

J.R. Lobry


References

Bar, W., Brinkmann, B., Lincoln, P., Mayr, W.R. and Rossi, U. (1994) DNA recommendations. 1994 report concerning further recommendations of the DNA Commission of the ISFH regarding PCR-based polymorphisms in STR (short tandem repeat) systems. Int. J. Leg. Med., 107:159-160.


See Also

identifiler for forensic microsatellite allele name examples.


Examples

library(seqinr)  # provides al2bp() and the identifiler data set
# Quality checks and examples:
stopifnot( al2bp("9") == 36 )   # 9 repeats of a tetranucleotide is 36 bp
stopifnot( al2bp(9) == 36 )      # also OK with numerical argument
stopifnot( al2bp(9, 5) == 45 )  # 9 repeats of a pentanucleotide is 45 bp
stopifnot( al2bp("9.3") == 39 ) # microvariant case
stopifnot( is.na(al2bp("<8")) )  # off-ladder case
stopifnot( is.na(al2bp(">19")) ) # off-ladder case
stopifnot( is.na(al2bp("X")) )   # non-STR case
# Application to the allele names in the identifiler data set, where all loci
# are tetranucleotide repeats:
data(identifiler)  # load the example data set shipped with seqinr
al.names <- unlist(identifiler)
al.length <- sapply(al.names, al2bp)
loc.names <- unlist(lapply(identifiler, names))
loc.nall <- unlist(lapply(identifiler, function(x) lapply(x, length)))
loc.fac <- factor(rep(loc.names, loc.nall))
par(lend = "butt", mar = c(5,6,4,1)+0.1)
boxplot(al.length~loc.fac, las = 1, col = "lightblue",
  horizontal = TRUE, main = "Range of allele lengths at forensic loci",
  xlab = "Length (bp)", ylim = c(0, max(al.length, na.rm = TRUE)))

seqinr documentation built on May 20, 2022, 1:09 a.m.