learnPattern: Learn Local Auto-Patterns for Time Series Representation and...


Description

learnPattern implements an ensemble of regression trees (based on Breiman and Cutler's original Fortran code) to learn local auto-patterns for time series representation. The ensemble of regression trees learns an autoregressive model; in particular, the ensemble captures local, time-varying autoregressive behavior.
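As a rough illustration of the idea (hypothetical data, not the package's implementation), the sketch below fits a single depth-1 regression split that predicts a target observation from a predictor observation by minimizing SSE, in the spirit of random.split=0:

```r
## Rough sketch only: one SSE-minimizing split, as with random.split=0.
set.seed(1)
x <- matrix(rnorm(20 * 50), nrow = 20)  # 20 series, 50 observations each
pred.vals   <- x[, 10]                  # values from a predictor segment
target.vals <- x[, 30]                  # values from a target segment

best <- list(sse = Inf, split = NA)
for (s in sort(unique(pred.vals))) {
  left  <- target.vals[pred.vals <= s]
  right <- target.vals[pred.vals >  s]
  if (length(right) == 0) next          # skip the degenerate split
  sse <- sum((left - mean(left))^2) + sum((right - mean(right))^2)
  if (sse < best$sse) best <- list(sse = sse, split = s)
}
best$split  # split value on the predictor segment minimizing target SSE
```

The actual trees recurse on such splits down to maxdepth, over randomly chosen segments.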

Usage

## Default S3 method:
learnPattern(x,
   segment.factor=c(0.05,0.95),
   random.seg=TRUE, target.diff=TRUE, segment.diff=TRUE, 
   random.split=0,
   ntree=200,
   mtry=1,
   replace=FALSE,
   sampsize=if (replace) ceiling(0.632*nrow(x)) else nrow(x),
   maxdepth=6,
   nodesize=5,
   do.trace=FALSE,
   keep.forest=TRUE,
   oob.pred=FALSE,
   keep.errors=FALSE, 
   keep.inbag=FALSE, ...)
## S3 method for class 'learnPattern'
print(x, ...)

Arguments

x

a time series database as a matrix in UCR format: rows are univariate time series, columns are observations. (For the print method, an object of class learnPattern.)

segment.factor

The proportion of the time series length to be used for both predictor and target segments. If random.seg is TRUE (the default), minimum and maximum factors should be provided as an array of length two.

random.seg

TRUE if the segment length is chosen at random between the thresholds defined by segment.factor.

target.diff

Can the target segment be a difference feature?

segment.diff

Can predictor segments be difference features?

random.split

Type of the split. If set to zero (0), splits are generated based on the decrease in SSE of the target segment. A setting of one (1) generates the split value randomly between the minimum and maximum values. A setting of two (2) generates a kd-tree type of split (i.e. the median of the values at each node is chosen as the split).

ntree

Number of trees to grow. A larger number of trees is preferred if computation time is not a concern.

mtry

Number of predictor segments randomly sampled as candidates at each split. Note that it is currently preset to 1.

replace

Should bagging of time series be done with replacement? All training time series are used if FALSE (default).

sampsize

Size(s) of sample to draw with replacement if replace is set to TRUE.

maxdepth

The maximum depth of the trees in the ensemble.

nodesize

Minimum size of terminal nodes. Setting this number larger causes smaller trees to be grown (and thus take less time).

do.trace

If set to TRUE, give a more verbose output as learnPattern is run. If set to some integer, then running output is printed for every do.trace trees.

keep.forest

If set to FALSE, the forest will not be retained in the output object.

oob.pred

If replace is set to TRUE, out-of-bag (OOB) predictions for the time series observations are returned.

keep.errors

If set to TRUE, the mean square error (MSE) of target prediction over target segments is evaluated for each tree. If oob.pred=TRUE, this information is evaluated on “out-of-bag” samples at each tree.

keep.inbag

Should an n by ntree matrix be returned that keeps track of which samples are “in-bag” in which trees?

...

optional parameters to be passed to the low-level function learnPattern.default.
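Two of the defaults above reduce to simple arithmetic. A hypothetical sketch (only the sampsize formula comes from the usage above; the rounding used for segment lengths is an assumption here):

```r
## Hypothetical sizes, not tied to any particular data set
n.obs <- 150                         # series length
segment.factor <- c(0.05, 0.95)
## with random.seg=TRUE, each tree draws a segment length in this range
## (assuming lengths are rounded up)
seg.range <- ceiling(segment.factor * n.obs)
seg.range                            # 8 to 143 observations

n.series <- 50                       # number of training series
replace <- TRUE
sampsize <- if (replace) ceiling(0.632 * n.series) else n.series
sampsize                             # 32 series drawn per tree when bagging
```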

Value

An object of class learnPattern, which is a list with the following components:

call

the original call to learnPattern.

type

regression

segment.factor

the proportion of the time series length to be used for both predictors and targets.

segment.length

segment length settings used by the trees of the ensemble

nobs

number of observations in a segment

ntree

number of trees grown

maxdepth

maximum depth level for each tree

mtry

number of predictor segments sampled for splitting at each node.

target

starting time of the target segment for each tree.

target.type

type of the target segment; 1 if observed series, 2 if difference series.

forest

a list that contains the entire forest; NULL if keep.forest=FALSE.

oobpredictions

predicted observations based on “out-of-bag” time series; returned if oob.pred=TRUE

ooberrors

Mean square error (MSE) over the trees, evaluated using the predicted observations on “out-of-bag” time series; returned if oob.pred=TRUE.

inbag

an n by ntree matrix that keeps track of which samples are “in-bag” in which trees; returned if keep.inbag=TRUE

errors

Mean square error (MSE) of target prediction over the target segments for each tree. If oob.pred=TRUE, the MSE is reported based on “out-of-bag” samples for each tree.
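The per-tree error above is an ordinary mean squared error; a toy computation with made-up values:

```r
## Made-up target values and predictions for a single tree
target.obs  <- c(0.2, -0.1, 0.4, 0.0)
target.pred <- c(0.1,  0.0, 0.5, 0.1)
mse <- mean((target.pred - target.obs)^2)
mse  # 0.01
```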

Note

OOB predictions may contain missing values (i.e. NA) if a time series is never left out-of-bag during the computations. Even when a series is left out-of-bag, some of its observations (i.e. time frames) may never be selected as part of a target segment; for those observations there will be no OOB predictions.
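The first situation can be simulated directly: under bootstrap sampling, a series may be in-bag in every tree and therefore never receive an OOB prediction. A small sketch with hypothetical sizes:

```r
## Hypothetical sizes; not tied to any particular data set
set.seed(7)
n <- 10       # number of training series
ntree <- 5    # number of trees
## in-bag counts per series for each tree under bootstrap sampling
inbag <- sapply(1:ntree, function(t) tabulate(sample(n, n, replace = TRUE), n))
## series that are in-bag in every tree are never OOB, hence NA predictions
never.oob <- which(rowSums(inbag == 0) == 0)
never.oob
```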

Author(s)

Mustafa Gokce Baydogan baydoganmustafa@gmail.com, based on original Fortran code by Leo Breiman and Adele Cutler, R port by Andy Liaw and Matthew Wiener.

References

Baydogan, M. G. (2013), “Learned Pattern Similarity”, Homepage: http://www.mustafabaydogan.com/learned-pattern-similarity-lps.html.

Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.

See Also

predict.learnPattern, computeSimilarity, tunelearnPattern

Examples

data(GunPoint)
set.seed(71)

## Learn patterns on GunPoint training series with default parameters
ensemble <- learnPattern(GunPoint$trainseries)
print(ensemble)

## Find the similarity between test and training series based on the learned model
similarity <- computeSimilarity(ensemble, GunPoint$testseries, GunPoint$trainseries)

## Find the index of the 1-nearest-neighbor (1NN) training series for each test series
NearestNeighbor <- apply(similarity, 1, which.min)

## Predicted class for each test series
predicted <- GunPoint$trainclass[NearestNeighbor]

## Compute the percentage of accurate predictions
accuracy <- sum(predicted == GunPoint$testclass) / nrow(GunPoint$testseries)
print(100 * accuracy)

## Learn patterns with random split values on GunPoint training series
ensemble <- learnPattern(GunPoint$trainseries, random.split = 1)

## Find the similarity between test and training series and classify test series
similarity <- computeSimilarity(ensemble, GunPoint$testseries, GunPoint$trainseries)
NearestNeighbor <- apply(similarity, 1, which.min)
predicted <- GunPoint$trainclass[NearestNeighbor]
accuracy <- sum(predicted == GunPoint$testclass) / nrow(GunPoint$testseries)
print(100 * accuracy)

## Learn patterns by training each tree on a random subsample
## and classify the test time series
ensemble <- learnPattern(GunPoint$trainseries, replace = TRUE)
similarity <- computeSimilarity(ensemble, GunPoint$testseries, GunPoint$trainseries)
NearestNeighbor <- apply(similarity, 1, which.min)
predicted <- GunPoint$trainclass[NearestNeighbor]
print(predicted)

## Learn patterns and do predictions on OOB time series
ensemble <- learnPattern(GunPoint$trainseries, replace = TRUE,
    target.diff = FALSE, oob.pred = TRUE)
## Plot the first series and its OOB approximation
plot(GunPoint$trainseries[1, ], xlab = 'Time', ylab = 'Observation',
    type = 'l', lty = 1, lwd = 2)
points(1:ncol(GunPoint$trainseries), ensemble$oobpredictions[1, ],
    type = 'l', col = 2, lty = 2, lwd = 2)
legend('topleft', c('Original series', 'Approximation'),
    col = c(1, 2), lty = c(1, 2), lwd = 2)
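The classification steps repeated in the examples above can be collected into a small helper; this is an illustration only, not a function provided by the package:

```r
## Illustration only: 1NN classification accuracy (in percent) from a
## similarity matrix whose rows are test series and columns are
## training series, with smaller values meaning more similar.
classify1NN <- function(similarity, trainclass, testclass) {
  NearestNeighbor <- apply(similarity, 1, which.min)
  predicted <- trainclass[NearestNeighbor]
  100 * sum(predicted == testclass) / length(testclass)
}

## e.g. classify1NN(similarity, GunPoint$trainclass, GunPoint$testclass)
```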

LPStimeSeries documentation built on May 2, 2019, 8:25 a.m.