| create.diffsmatrix | R Documentation | 
Generate an array of goodness-of-fit (or distance) between samples and knowns based on the sizes (in base pairs) of TRFLP peaks. For each sample/known combination, and for each enzyme/primer combination, this calculates the minimum distance between any peak in the sample and the single peak in the known.
create.diffsmatrix(samples, knowns)
| samples | A  | 
| knowns | A  | 
This function will rarely need to be called directly, but does most of
the calculations behind TRAMP, so it is useful to
understand how this works.
This function generates a three-dimensional s × k × n array
of the (smallest, see below) distance in base
pairs between peaks in a collection of unknowns (run data) and a
database of knowns for several enzyme/primer combinations.  s is
the number of different samples in the samples data
(length(labels(samples))), k is the number of different
types in the knowns database (length(labels(knowns))), and
n is the number of different enzyme/primer combinations.  The
enzyme/primer combinations used are all combinations present in the
knowns database; combinations present only in the samples will be
ignored.  Not all samples need contain all enzyme/primer combinations
present in the knowns.
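The shape of the result can be sketched in base R without the package. This is only an illustration under assumed inputs: the two toy data frames and their column names (sample.fk, knowns.fk, enzyme, primer, size) are made up for the sketch, and the loop is not the package's implementation.

```r
## Toy data; values and column names are assumptions for this sketch only.
knowns <- data.frame(knowns.fk = c(1, 1, 2),
                     enzyme = c("A", "B", "A"),
                     primer = c("f", "f", "f"),
                     size = c(100, 200, 120),
                     stringsAsFactors = FALSE)
samples <- data.frame(sample.fk = c(1, 1, 1, 2, 2),
                      enzyme = c("A", "A", "B", "A", "C"),
                      primer = c("f", "f", "f", "f", "f"),
                      size = c(99, 130, 205, 121, 50),
                      stringsAsFactors = FALSE)

## Combinations come from the knowns only, so enzyme "C" is ignored.
ep <- unique(knowns[c("enzyme", "primer")])
rownames(ep) <- NULL
s.ids <- sort(unique(samples$sample.fk))
k.ids <- sort(unique(knowns$knowns.fk))

## Start from NA so that missing combinations stay NA.
m <- array(NA_real_, c(length(s.ids), length(k.ids), nrow(ep)))
for (e in seq_len(nrow(ep))) {
  in.s <- samples$enzyme == ep$enzyme[e] & samples$primer == ep$primer[e]
  in.k <- knowns$enzyme == ep$enzyme[e] & knowns$primer == ep$primer[e]
  for (i in seq_along(s.ids)) {
    for (j in seq_along(k.ids)) {
      peaks <- samples$size[in.s & samples$sample.fk == s.ids[i]]
      known <- knowns$size[in.k & knowns$knowns.fk == k.ids[j]]
      if (length(peaks) > 0 && length(known) == 1) {
        d <- peaks - known
        m[i, j, e] <- d[which.min(abs(d))]  # signed, smallest in magnitude
      }
    }
  }
}
attr(m, "enzyme.primer") <- ep
dim(m)
```

Here the second sample has no peaks for enzyme "B" and the second known has no "B" peak, so the corresponding cells of the second slice remain NA.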
In the resulting array, m[i,j,k] is the difference (in base
pairs) between the ith sample and the jth known for the
kth enzyme/primer combination (here k is the index along the third
dimension, not the number of knowns).  The ordering of the n
enzyme/primer combinations is arbitrary, so a data.frame of
combinations is included as the attribute enzyme.primer, where
enzyme.primer$enzyme[k] and enzyme.primer$primer[k]
correspond to enzyme and primer used for the distances in
m[,,k].
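As a rough illustration of that layout, the sketch below builds a small array by hand and looks up the combination behind one slice. The numbers and the enzyme/primer names are invented, and create.diffsmatrix is not called; only base R is used.

```r
## Toy 2 samples x 3 knowns x 2 enzyme/primer combinations; all values
## are invented purely for illustration.
m <- array(c(-1.2, 0.4, 3.0, NA, 0.1, -2.5,
              0.8, -0.3, NA, 1.5, 2.2, 0.0),
           dim = c(2, 3, 2))
## Hypothetical combinations; in real output these come from the knowns.
attr(m, "enzyme.primer") <-
  data.frame(enzyme = c("HaeIII", "HpyCH4IV"),
             primer = c("ITS1F", "ITS4"),
             stringsAsFactors = FALSE)

ep <- attr(m, "enzyme.primer")
idx <- 2
## Distances of every sample to every known for combination idx:
slice <- m[, , idx]
## The enzyme and primer that slice refers to:
combo <- c(ep$enzyme[idx], ep$primer[idx])
```

Looking the combination up through the attribute, rather than relying on the position in the third dimension, avoids depending on the arbitrary ordering.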
Each case in the knowns database has a single (or no) peak for each
enzyme/primer combination, but each sample may contain multiple peaks
for an enzyme/primer combination; the difference is always the
smallest distance from the sample to the known peak.  Where a sample
and/or a known lacks an enzyme/primer combination, the value of the
difference is NA.  The smallest absolute distance is
taken between sample and known peaks, but the sign of the difference
is preserved (negative where the closest sample peak was less than the
known peak, positive where greater; see absolute.min).
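The sign convention can be sketched with a small stand-alone helper. The function signed_min_diff below is hypothetical, written for this illustration; it is not the package's absolute.min and assumes the known has at most one peak.

```r
## Hypothetical helper (not the package's absolute.min): return the
## signed difference with the smallest absolute value, or NA when
## either side has no peak.
signed_min_diff <- function(sample_peaks, known_peak) {
  if (length(sample_peaks) == 0 || length(known_peak) == 0 ||
      is.na(known_peak))
    return(NA_real_)
  d <- sample_peaks - known_peak
  d <- d[!is.na(d)]
  if (length(d) == 0)
    return(NA_real_)
  d[which.min(abs(d))]
}

signed_min_diff(c(98.5, 150.2, 301.0), 152)  # closest peak is below the known
signed_min_diff(c(98.5, 150.2, 301.0), 300)  # closest peak is above the known
signed_min_diff(numeric(0), 300)             # no peaks for this combination
```

The first call gives a negative value, the second a positive one, and the third NA, matching the three cases described above.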
A three-dimensional array, with an attribute enzyme.primer,
described above.
TRAMP, which uses output from
create.diffsmatrix.
data(demo.samples)
data(demo.knowns)
s <- length(labels(demo.samples))
k <- length(labels(demo.knowns))
n <- nrow(unique(demo.knowns$data[c("enzyme", "primer")]))
m <- create.diffsmatrix(demo.samples, demo.knowns)
dim(m)
identical(dim(m), c(s, k, n))
## Maximum error for each sample/known (i.e. across all enzyme/primer
## combinations), similar to how it is calculated by TRAMP
error <- apply(abs(m), 1:2, max, na.rm=TRUE)
dim(error)
## Euclidean error (see ?TRAMP)
error.euclid <- sqrt(rowSums(m^2, na.rm=TRUE, dims=2))/rowSums(!is.na(m), dims=2)
## Euclidean and maximum error will require different values of
## accept.error in TRAMP:
plot(error, error.euclid, pch=".")