extract_event_ftrs: Extracts events from a data stream and computes event...

View source: R/extract_event_ftrs.R

extract_event_ftrs    R Documentation

Extracts events from a data stream and computes event features.

Description

This function extracts events from a 2D or 3D data stream using a moving window, and computes 22 features for each event in a 2D stream and 13 features for each event in a 3D stream. 2D data streams with class labels can be generated with the function gen_stream. In the supervised setting, the class labels of the extracted events are obtained by matching each event's position against the event details, which are part of the output of the gen_stream function.
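
As a rough illustration of the moving window, the sketch below lists the row ranges that successive windows would cover for a given win_size and step_size. The helper window_bounds is hypothetical and not part of eventstream. It assumes the window advances along the first dimension of the stream; the package's actual windowing, including the effect of the rolling argument, may differ.

```r
# Hypothetical helper (not part of eventstream): first and last row of each
# window, assuming the window slides along the first dimension of the stream.
window_bounds <- function(n_rows, win_size = 200, step_size = 20) {
  stopifnot(n_rows >= win_size)
  starts <- seq(1, n_rows - win_size + 1, by = step_size)
  data.frame(start = starts, end = starts + win_size - 1)
}

# A stream with 300 rows and the default window settings
bounds <- window_bounds(n_rows = 300)
print(bounds)
```

With a smaller step_size, consecutive windows overlap more, so the same event is likely to be seen in more windows.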

Usage

extract_event_ftrs(
  stream,
  supervised = FALSE,
  details = NULL,
  win_size = 200,
  step_size = 20,
  thres = 0.95,
  folder = NULL,
  vis = FALSE,
  tt = 10,
  epsilon = 5,
  miniPts = 10,
  rolling = TRUE
)

Arguments

stream

A data stream. This can be the output of either the gen_stream function or the stream_from_files function.

supervised

If TRUE, event class labels need to be given in details.

details

Event details. This is also an output of the gen_stream function. Event details are used to get the class labels of the extracted events, by matching the position.

win_size

The length of the moving window. The default is 200.

step_size

The amount by which the window is moved at each step. The default is 20.

thres

The cut-off quantile. The default is 0.95. Values above this quantile are clustered; the remaining values are not.

folder

If set to a local folder, this is where the jpegs of window data and extracted events are saved for a 2D data stream.

vis

If TRUE, the window data and the extracted events are plotted for a 2D data stream.

tt

Related to event ages. For example, if tt=10, the event ages are 10, 20, 30 and 40, that is tt, 2tt, 3tt and 4tt.

epsilon

The eps parameter of the dbscan function in the dbscan package.

miniPts

The minPts parameter of the dbscan function in the dbscan package.

rolling

If TRUE, rolling windows are used.
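
The roles of thres, epsilon and miniPts can be sketched on a toy window as follows. This is only an illustration under stated assumptions, not the package's internal code: the toy matrix win and the planted patch are made up, and the clustering step runs only if the dbscan package is installed.

```r
set.seed(42)
win <- matrix(rnorm(200 * 50), nrow = 200)  # toy window of pure noise
win[80:90, 20:25] <- 8                      # planted high-valued patch

# Cut-off at the thres quantile of the window values
thres <- 0.95
cutoff <- unname(quantile(win, probs = thres))

# Row/column positions of the cells above the cut-off
high <- which(win > cutoff, arr.ind = TRUE)
storage.mode(high) <- "double"

# Cluster those positions; eps and minPts play the roles of epsilon and miniPts
if (requireNamespace("dbscan", quietly = TRUE)) {
  cl <- dbscan::dbscan(high, eps = 5, minPts = 10)
  print(table(cl$cluster))  # cluster 0 holds the points labelled as noise
}
```

Raising thres leaves fewer cells to cluster, while a larger miniPts or a smaller epsilon makes it harder for scattered high cells to form a cluster.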

Value

An Nx22x4 array is returned for 2D data streams and an Nx13x4 array for 3D data streams. Here N is the total number of events extracted from all windows. The second dimension holds the features and, in the supervised setting, the class label. The third dimension corresponds to 4 different event ages: tt, 2tt, 3tt and 4tt. For example, the element at [10,6,3] is the 6th feature of the 10th extracted event when the age of the event is 3tt. The features for 2D streams are listed below. For 3D streams the features cluster_id, pixels, length, width, height, total_value, l2w_ratio, centroid_x, centroid_y, centroid_z, mean, std_dev and sd_from_global_mean are computed.

cluster_id

An identification number for each event.

pixels

The number of pixels of each event.

length

The length of the event.

width

The width of the event.

total_value

The total value of the pixels.

l2w_ratio

Length to width ratio of event.

centroid_x

x coordinate of event centroid.

centroid_y

y coordinate of event centroid.

mean

Mean value of event pixels.

std_dev

Standard deviation of event pixels.

avg_slope

The slope of a linear model (lm) fitted to the event pixels.

quad_1

The linear coefficient of a second order polynomial fitted to event pixels using lm.

quad_2

The quadratic coefficient of a second order polynomial fitted to event pixels using lm.

2sd_from_mean

The proportion of event pixels/cells that have values greater than 2 global standard deviations from the global mean of the window.

3sd_from_mean

The proportion of event pixels/cells that have values greater than 3 global standard deviations from the global mean of the window.

4sd_from_mean

The proportion of event pixels/cells that have values greater than 4 global standard deviations from the global mean of the window.

5iqr_from_median

A small portion of each window and its column medians and column IQRs are used to construct two smoothing splines: a median spline and an IQR spline. The value of the median smoothing spline at each event centroid is used as the local median for that event. Similarly, the value of the IQR smoothing spline at each event centroid is used as the local IQR for that event. This feature gives the proportion of event pixels/cells that have values greater than 5 local IQRs from the local median.

6iqr_from_median

The proportion of event pixels/cells that have values greater than 6 local IQRs from the local median computed using splines.

7iqr_from_median

The proportion of event pixels/cells that have values greater than 7 local IQRs from the local median computed using splines.

8iqr_from_median

The proportion of event pixels/cells that have values greater than 8 local IQRs from the local median computed using splines.

iqr_from_median

Let x denote the 75th percentile of the event pixel values. This feature gives the number of local IQRs by which x lies away from the local median. Both the local IQR and the local median are computed using splines.

sd_from_mean

Let x denote the 80th percentile of the event pixel values. This feature gives the number of global standard deviations by which x lies away from the global mean. Both global values are computed from the window data.
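
The threshold-based features above can be sketched in base R on a toy window. This is a minimal sketch under stated assumptions, not the package's implementation: the window win and the planted event are made up, the splines are fitted to the whole window rather than to a small portion of it, and they are evaluated at the column coordinate of the event centroid.

```r
set.seed(7)
win <- matrix(rnorm(200 * 50), nrow = 200)  # toy window: 200 rows, 50 columns
win[80:90, 20:25] <- 10                     # planted patch standing in for one event
event_vals <- as.vector(win[80:90, 20:25])

# Global statistics of the window
g_mean <- mean(win)
g_sd <- sd(as.vector(win))

# 2sd_from_mean, 3sd_from_mean, 4sd_from_mean: proportion of event cells
# more than k global standard deviations above the global mean
prop_sd <- sapply(2:4, function(k) mean(event_vals > g_mean + k * g_sd))

# sd_from_mean: distance of the 80th percentile from the global mean, in SDs
sd_from_mean <- unname((quantile(event_vals, 0.80) - g_mean) / g_sd)

# Local median and IQR: smoothing splines through the column medians and
# column IQRs, evaluated at the event centroid (assumed: column coordinate)
cols <- seq_len(ncol(win))
med_spline <- smooth.spline(cols, apply(win, 2, median))
iqr_spline <- smooth.spline(cols, apply(win, 2, IQR))
centroid_col <- mean(20:25)
local_med <- predict(med_spline, centroid_col)$y
local_iqr <- predict(iqr_spline, centroid_col)$y

# 5iqr_from_median ... 8iqr_from_median
prop_iqr <- sapply(5:8, function(k) mean(event_vals > local_med + k * local_iqr))

# iqr_from_median: distance of the 75th percentile from the local median, in IQRs
iqr_from_median <- unname((quantile(event_vals, 0.75) - local_med) / local_iqr)

print(prop_sd)
print(prop_iqr)
```

The global features compare an event with the whole window, whereas the spline-based features compare it with the level and spread of the data near the event.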

Examples

# 2D data stream example
out <- gen_stream(1, sd=15)
zz <- as.matrix(out$data)
features <- extract_event_ftrs(zz, supervised=TRUE, details = out$details)
features

# 3D data stream example
set.seed(1)
# one random value per cell, so that no values are recycled to fill the array
arr <- array(rnorm(40 * 25 * 30), dim = c(40, 25, 30))
arr[25:33,12:20, 20:23] <- 10
# getting events
ftrs <- extract_event_ftrs(arr, supervised=FALSE, win_size=10, step_size = 2, tt=2, thres=0.985)
ftrs
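
To illustrate how the returned array is indexed, the sketch below builds a mock 12x22x4 array with the same layout as the output for a 2D stream. The values are placeholders, and the dimension names (including the age_ labels) are added here only for readability; whether the actual output carries such names, and whether the features appear in exactly this order, is an assumption.

```r
# Event ages for tt = 10
tt <- 10
ages <- tt * 1:4

# Feature names in the order in which they are listed in the Value section
ftr_names <- c("cluster_id", "pixels", "length", "width", "total_value",
               "l2w_ratio", "centroid_x", "centroid_y", "mean", "std_dev",
               "avg_slope", "quad_1", "quad_2", "2sd_from_mean",
               "3sd_from_mean", "4sd_from_mean", "5iqr_from_median",
               "6iqr_from_median", "7iqr_from_median", "8iqr_from_median",
               "iqr_from_median", "sd_from_mean")

# Mock array: 12 events x 22 features x 4 ages, filled with placeholder values
mock <- array(seq_len(12 * 22 * 4), dim = c(12, 22, 4),
              dimnames = list(NULL, ftr_names, paste0("age_", ages)))

# 6th feature of the 10th event at age 3tt, by position and by name
mock[10, 6, 3]
mock[10, "l2w_ratio", "age_30"]

# All features of the 10th event at the youngest age
mock[10, , 1]
```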


eventstream documentation built on May 16, 2022, 9:06 a.m.