Oversample a univariate, multi-modal time series sequence of imbalanced classified data.
sample: Univariate sequence data samples
label: Labels corresponding to samples
class: The number of classes to be oversampled, starting from the class with the fewest observations. By default, as many classes as possible are oversampled.
ratio: The oversampling ratio number (>=1) (default = 1)
per: Ratio of weighting between ESPO and ADASYN (default = 0.8)
r: A scalar ratio specifying how far (towards the boundary) the synthetic data are pushed in ESPO (default = 1)
k: Number of nearest neighbours in the k-NN (for ADASYN) algorithm (default = 5)
m: Seeds from the positive class in the m-NN (for ADASYN) algorithm (default = 15)
parallel: Whether to execute in parallel mode (default = TRUE). Recommended for datasets with over 30,000 records.
progBar: Whether to include progress bars (default = TRUE). For the ESPO approach, the bar character is |--------|100%. For the ADASYN approach, the bar character is |========|100%.
This function balances univariate imbalanced time series data based on structure-preserving oversampling.
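To make the arguments concrete, the sketch below spells out every tuning argument at its documented default instead of relying on the defaults implicitly. It assumes the OSTSC package is installed and uses the Dataset_Synthetic_Control data from the examples; the list name ostsc.args is only illustrative. The class argument is left out so that its default behaviour applies.

```r
# Tuning arguments, set to the defaults listed in the argument descriptions.
# The name ostsc.args is illustrative, not part of the package.
ostsc.args <- list(
  ratio = 1,         # oversampling ratio (>= 1)
  per = 0.8,         # weighting between ESPO and ADASYN
  r = 1,             # how far ESPO pushes synthetic data towards the boundary
  k = 5,             # nearest neighbours in k-NN (ADASYN)
  m = 15,            # seeds from the positive class in m-NN (ADASYN)
  parallel = FALSE,  # run serially here; the documented default is TRUE
  progBar = FALSE    # hide progress bars here; the documented default is TRUE
)

# Only run the call when the package is available.
if (requireNamespace("OSTSC", quietly = TRUE)) {
  library(OSTSC)
  data(Dataset_Synthetic_Control)
  res <- do.call(OSTSC, c(list(sample = Dataset_Synthetic_Control$train.x,
                               label = Dataset_Synthetic_Control$train.y),
                          ostsc.args))
  print(table(res$label))
}
```

Keeping the tuning values in one list makes it easy to change a single setting, such as per or ratio, and rerun the same call.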
sample: the oversampled time series sequence data
label: the labels corresponding to each row of sample
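A quick way to confirm that a returned object has this shape is sketched below. The helper check_oversampled is hypothetical (it is not part of OSTSC), and the mock list only stands in for a real result so that the snippet runs without the package installed.

```r
# Hypothetical helper: sanity-check a list shaped like the documented return
# value (components "sample" and "label").
check_oversampled <- function(result) {
  stopifnot(all(c("sample", "label") %in% names(result)))
  n.rows <- NROW(result$sample)
  n.labels <- NROW(result$label)
  if (n.rows != n.labels) {
    stop("sample has ", n.rows, " rows but label has ", n.labels, " entries")
  }
  counts <- table(unlist(result$label))
  list(n = n.rows,
       counts = counts,
       # balanced means every class has the same number of observations
       balanced = length(unique(as.vector(counts))) == 1)
}

# Mock object standing in for the output of OSTSC(); the values are made up.
mock <- list(sample = matrix(0, nrow = 6, ncol = 4),
             label = c(0, 0, 0, 1, 1, 1))
info <- check_oversampled(mock)
print(info$counts)
print(info$balanced)
```

With a real result, pass the object returned by OSTSC() in place of mock.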
H. Cao, X.-L. Li, Y.-K. Woon and S.-K. Ng, "Integrated Oversampling for Imbalanced Time Series Classification," IEEE Trans. on Knowledge and Data Engineering (TKDE), vol. 25(12), pp. 2809-2822, 2013
H. Cao, V. Y. F. Tan and J. Z. F. Pang, "A Parsimonious Mixture of Gaussian Trees Model for Oversampling in Imbalanced and Multi-Modal Time-Series Classification," IEEE Trans. on Neural Networks and Learning Systems (TNNLS), vol. 25(12), pp. 2226-2239, 2014
H. Cao, X.-L. Li, Y.-K. Woon and S.-K. Ng, "SPO: Structure Preserving Oversampling for Imbalanced Time Series Classification," Proc. IEEE Int. Conf. on Data Mining (ICDM), pp. 1008-1013, 2011
# This is a simple example to show the usage of OSTSC. See the vignette for a tutorial
# demonstrating more complex examples.
# Example one
# loading data
data(Dataset_Synthetic_Control)
# get split feature and label data
train.label <- Dataset_Synthetic_Control$train.y
train.sample <- Dataset_Synthetic_Control$train.x
# the first dimension of the feature set and labels must be the same
# the second dimension of the feature set is the sequence length
dim(train.sample)
dim(train.label)
# check the imbalance ratio of the data
table(train.label)
# oversample class 1 to the same number of observations as class 0
MyData <- OSTSC(train.sample, train.label, parallel = FALSE)
# store the feature data after oversampling
x <- MyData$sample
# store the label data after oversampling
y <- MyData$label
# check the imbalance of the data
table(y)
# Example two
# loading data
ecg <- Dataset_ECG()
# get split feature and label data
train.label <- ecg$train.y
train.sample <- ecg$train.x
# the first dimension of the feature set and labels must be the same
# the second dimension of the feature set is the sequence length
dim(train.sample)
dim(train.label)
# check the imbalance ratio of the data
table(train.label)
# oversample minority class to the same number of observations as majority classes
MyData <- OSTSC(train.sample, train.label, parallel = FALSE)
# store the feature data after oversampling
x <- MyData$sample
# store the label data after oversampling
y <- MyData$label
# check the imbalance of the data
table(y)
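The checks made by hand in both examples (matching first dimensions, then the class counts) can be bundled into a small helper that runs before oversampling. This is a base R sketch: imbalance_summary is a hypothetical name, the toy data is made up, and the deficit column assumes the default ratio = 1, where each minority class is raised to the size of the largest class. The ESPO/ADASYN split is an assumption about how per is applied (per taken as the share generated by ESPO, rounded up), not confirmed package behaviour.

```r
# Hypothetical helper: validate inputs and summarise the class imbalance.
imbalance_summary <- function(sample, label, per = 0.8) {
  label <- unlist(label)
  # the first dimension of the feature set and the labels must be the same
  if (NROW(sample) != length(label)) {
    stop("sample has ", NROW(sample), " rows but label has ",
         length(label), " entries")
  }
  counts <- table(label)
  # synthetic samples needed to match the largest class (assumes ratio = 1)
  deficit <- max(counts) - counts
  # ASSUMPTION: per is the share of synthetic samples produced by ESPO
  n.espo <- ceiling(deficit * per)
  data.frame(class = names(counts),
             count = as.vector(counts),
             deficit = as.vector(deficit),
             espo = as.vector(n.espo),
             adasyn = as.vector(deficit - n.espo))
}

# Made-up toy data: 10 sequences of length 5, 8 in class 0 and 2 in class 1.
toy.sample <- matrix(0, nrow = 10, ncol = 5)
toy.label <- c(rep(0, 8), rep(1, 2))
summary.df <- imbalance_summary(toy.sample, toy.label)
print(summary.df)
```

For real data, call the helper on train.sample and train.label before calling OSTSC; a dimension mismatch then fails early with a readable message.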