Description Usage Arguments Details Value Dependencies Author(s) Examples
dpseg
splits a curve (x,y data) into linear segments by a
straight forward dynamic programming recursion:
S_j = max(S_{i-jumps} + score(i,j) -P)
where score is a measure of the
goodnes of the fit of a linear regression (equiv. to
lm(y~x)
) between data points i<j. The default scoring
function is simply the negative variance of residuals of the linear
regression (see arguments type
and scoref
). P
is a break-point penality that implicitly regulates the number of
segments (higher P: longer segments), and jumps==1
allows
for disjoint segments. The arguments minl
and maxl
specify minimal (i≤ j-minl) and maximal (i≥ j-maxl)
segment lengths, which allows to significantly decrease memory
usage when expected segment lengths are known.
1 2 3 |
x |
x-values, not used if |
y |
y-values, or a pre-calculated scoring function matrix
SCR_{i,j} (eg. from a previous run of |
maxl |
maximal segment length, i≥ j-maxl |
jumps |
allow for jumps between segments, if |
P |
break-point penalty, increase to get longer segments with lower scores (eg. higher residual variance) |
minl |
minimal segment length, i≤ j-minl |
S0 |
initialization of S_0, choose high enough to avoid length 1 cutoffs at start |
type |
type of scoring function: available are "var" for
"variance of residuals", "cor" for Pearson correlation, or "r2"
for r-squared; see the package |
scoref |
alternative scoring function |
verb |
print progress messages |
move |
logical indicating whether move is required in backtracing, required for the alternative recursion S_i + score(i+1,j) |
store.values |
store scoring values (linear regression results) |
store.matrix |
store the fitscore matrix |
add.lm |
add a linear fit using R base |
recursion |
internal recursion function to be used for
segmentation; used for debugging, benchmarking and development,
and required for putative novel scoring functions |
backtrace |
internal function to be used for back-tracing;
used for debugging, benchmarking and development, and may be
required to test novel scoring functions |
... |
further arguments to |
See the vignette("dpseg")
for the theory and details on the
choice of scoring functions and selection of the penalty parameter
P.
Returns a list object of class dpseg
(with
print.dpseg
plot.dpseg
and predict.dpseg
methods). The main result of the algorithm is a table
(data.frame
) of predicted segments in list object
segments
. The original data, run parameters and
(optionally) additional data calculated and used by the
algorithm are also returned.
main result table: a data.frame
that lists
the start and end x-values of the segments, the start and end
indices (i,j) in the data vectors, the linear regression
coefficients and goodness-of-fit measures for the segments
(intercept, slope, r-squared, variance of residuals). If
dpseg
was called with a pre-calculated scoring matrix,
the table only contains start and end indices i,j. If option
add.lm=TRUE
or the result object was sent through
function addLm
the table additionally contains
results from R's lm
, indicated by an ".lm" suffix.
results of the recursion, ie. S_j in above equation.
vector j=1,…,n, storing the i_{max} that yielded S_j, ie., the sole input for the backtracing function.
linear regression coefficients and measures for the
segment ending at j and starting at
i_{max}(j). Only present if store.valus=TRUE
.
scoring function matrix SCR_{i,j} = score(i,j)
where positions j are the columns and i the rows; a banded
matrix with non-NA values between i≤ j-minl and
i≥ j-maxl. Note, that this matrix can be re-used in
subsequent calls as dpseg(y=previous$SCR)
which runs
much faster and allows to efficiently scan for alternative
parameters. Only present if store.matrix=TRUE
.
result objects from lm
. Only present if
add.lm=TRUE
.
result of the call to the backtracing function: ends of the segments.
original x/y data (xy.coords).
index of NA/Inf values that were removed before running the alorithm.
used parameters P
, jumps
,
maxl
and minl
.
The package strictly depends only on RcppEigen
.
All other dependencies are usually present in a
basic installation (stats
, graphics
, grDevices
).
Rainer Machne machne@hhu.de, Peter F. Stadler studla@bioinf.uni-leipzig.de
1 2 3 4 5 6 7 8 9 10 11 12 | ## calculate linear segments in semi-log bacterial growth data
## NOTE: library loads bacterial growth curve data as data.frame oddata
segs <- dpseg(x=oddata$Time, y=log(oddata$A3), minl=5, P=0.0001, verb=1)
## inspect resulting segments
print(segs)
## plot results (also see the movie method)
plot(segs, delog=TRUE, log="y")
## predict method
plot(predict(segs), type="l")
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.