al2bp | R Documentation |
The conventions used to name forensic microsatellite (STR) alleles are described in Bar et al. (1994). For instance, the name "9.3" means that there are 9 repetitions of the complete base oligomer plus an incomplete repeat of 3 bp.
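As a minimal sketch of the arithmetic behind this convention, assuming a tetranucleotide locus (the variable names below are illustrative only and are not part of the package):

```r
# Allele "9.3" at a tetranucleotide locus (illustrative variable names).
n.repeats <- 9   # complete repeats of the base oligomer
extra.bp  <- 3   # bases in the incomplete repeat
repeat.bp <- 4   # length of one repeat unit in bp

# Total allele size: complete repeats plus the partial repeat.
size.bp <- n.repeats * repeat.bp + extra.bp
print(size.bp)
```

This agrees with the microvariant check in the examples, where "9.3" is expected to give 39 bp.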
al2bp(allele.name, repeat.bp = 4, offLadderChars = "><", split = "\\.")
allele.name |
The name of the allele, coerced to a string type. |
repeat.bp |
The length in bp of the microsatellite base repeat. Most loci are tetranucleotide repeats, hence the default of 4. Remember to change this to 5 for loci based on pentanucleotides, such as Penta D or Penta E. |
offLadderChars |
Characters that flag off-ladder alleles, as in "<8" or ">19"; defaults to "><". |
split |
The convention is to use a dot, as in "9.3", between the number of repeats
and the number of bases in the incomplete repeat. In locales where the
decimal separator is a comma this can cause problems; in that case, try
"," for this argument, which is forwarded to |
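The default value of split is an escaped dot. A possible reason, sketched here under the assumption that the separator is interpreted as a regular expression in the way base R's strsplit does, is that an unescaped dot would match any character:

```r
# Assumption: the separator is used as a regular expression, as in strsplit().
# An escaped dot splits "9.3" into the repeat count and the extra bases.
dot.parts <- strsplit("9.3", split = "\\.")[[1]]

# With a comma as separator, no escaping is needed.
comma.parts <- strsplit("9,3", split = ",")[[1]]

# Both conventions yield the same two pieces.
print(identical(dot.parts, comma.parts))
```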
Warnings generated by faulty numeric conversions are suppressed here.
A single numeric value giving the size in bp of the allele, or NA when characters flagging off-ladder alleles are encountered or when numeric conversion is impossible (e.g. with the "X" or "Y" allele names at the Amelogenin locus).
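The behaviour described here can be mimicked with a short base R sketch. The function name allele_size_sketch is hypothetical, and this is not the seqinr source code; it only assumes the argument defaults shown in the usage line and the NA rules stated above.

```r
# Hypothetical re-implementation for illustration; not the seqinr source.
allele_size_sketch <- function(allele.name, repeat.bp = 4,
                               offLadderChars = "><", split = "\\.") {
  allele.name <- as.character(allele.name)
  # Off-ladder alleles such as "<8" or ">19" carry a flag character.
  flags <- strsplit(offLadderChars, "")[[1]]
  name.chars <- strsplit(allele.name, "")[[1]]
  if (any(flags %in% name.chars)) return(NA_real_)
  # Split "9.3" into complete repeats ("9") and extra bases ("3").
  parts <- strsplit(allele.name, split = split)[[1]]
  # Non-numeric names such as "X" become NA; the warning is silenced.
  parts <- suppressWarnings(as.numeric(parts))
  if (length(parts) == 0 || anyNA(parts)) return(NA_real_)
  extra <- if (length(parts) >= 2) parts[2] else 0
  parts[1] * repeat.bp + extra
}

print(allele_size_sketch("9.3"))  # microvariant at a tetranucleotide locus
print(allele_size_sketch("<8"))   # off-ladder allele
print(allele_size_sketch("X"))    # non-STR allele name
```

Returning NA rather than raising an error keeps the function convenient to apply over a whole vector of allele names, where a few off-ladder or sex-marker entries are expected.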
J.R. Lobry
Bar, W., Brinkmann, B., Lincoln, P., Mayr, W.R. and Rossi, U. (1994) DNA recommendations. 1994 report concerning further recommendations of the DNA Commission of the ISFH regarding PCR-based polymorphisms in STR (short tandem repeat) systems. Int. J. Leg. Med., 107:159-160.
citation("seqinR")
identifiler
for forensic microsatellite allele name examples.
#
# Quality check and examples:
#
stopifnot( al2bp("9") == 36 ) # 9 repeats of a tetranucleotide is 36 bp
stopifnot( al2bp(9) == 36 ) # also OK with numerical argument
stopifnot( al2bp(9, 5) == 45 ) # 9 repeats of a pentanucleotide is 45 bp
stopifnot( al2bp("9.3") == 39 ) # microvariant case
stopifnot( is.na(al2bp("<8")) ) # off ladder case
stopifnot( is.na(al2bp(">19")) ) # off ladder case
stopifnot( is.na(al2bp("X")) ) # non STR case
#
# Application to the allele names in the identifiler data set, where all loci
# are tetranucleotide repeats:
#
data(identifiler)
al.names <- unlist(identifiler)
al.length <- sapply(al.names, al2bp)
loc.names <- unlist(lapply(identifiler, names))
loc.nall <- unlist(lapply(identifiler, function(x) lapply(x, length)))
loc.fac <- factor(rep(loc.names, loc.nall))
par(lend = "butt", mar = c(5, 6, 4, 1) + 0.1)
boxplot(al.length ~ loc.fac, las = 1, col = "lightblue",
        horizontal = TRUE, main = "Range of allele lengths at forensic loci",
        xlab = "Length (bp)", ylim = c(0, max(al.length, na.rm = TRUE)))