flow_knn: Make prediction using KNN method.
In ahorawzy/TFTSA: Traffic Flow Time Series Analysis


This function makes flexible predictions of traffic flow using the K-Nearest Neighbours (KNN) method.

flow_knn(obj, base, start, k, lag_duration, fore_duration, cat = FALSE,
  save_detail = FALSE, imbalance = FALSE)

`obj`	The object flow to be predicted, a row from a dataframe.
`base`	The flow database from which the nearest neighbours are selected.
`start`	Start point to make prediction.
`k`	The number of nearest neighbours.
`lag_duration`	The time window to determine the similarity between flows.
`fore_duration`	The time window to make prediction.
`cat`	Default is FALSE. If set to TRUE (or 1), detailed information about the selected neighbours is printed at each step.
`save_detail`	Default is FALSE. If set to a string giving a file name with its path, the neighbours selected at each step are saved to a .csv file at that path.
`imbalance`	Default is FALSE. If set to TRUE, dist_imbalance() is used to calculate the distance.

The K-Nearest Neighbours (KNN) method is a non-parametric method that can be used for both classification and prediction. This function uses it for traffic flow prediction.

This function needs two components: the object flow and the flow database. The object flow is the flow to be predicted; it is refreshed with real data as time goes on and serves as the criterion for the prediction. The flow database is the set of flows from which the k nearest neighbours are selected to predict the object flow over a given time window.

Unlike most KNN prediction functions, which are static, this function is designed to be dynamic: the process does not stop once a prediction for one future time window has been made. As time goes on, the function uses the newly observed flow data to refresh the object flow, which is sometimes critical to determining the similarity between flows. In other words, the prediction is rolling.
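The rolling behaviour can be sketched in a few lines of R. This is a minimal illustration and not the package's implementation: the helper name `rolling_knn_sketch`, the plain average over neighbours, the requirement that `start` be larger than `lag_duration`, and the toy data are all assumptions made for the example. Here `obj` is taken to be a numeric vector and `base` a numeric matrix with one flow per row.

```r
# Hypothetical sketch of a rolling KNN forecast (not the TFTSA source code).
# obj:  numeric vector, the observed object flow
# base: numeric matrix, one historical flow per row, same length as obj
rolling_knn_sketch <- function(obj, base, start, k, lag_duration, fore_duration) {
  stopifnot(start > lag_duration, k < nrow(base))
  n <- length(obj)
  pred <- obj                                   # before `start`, prediction equals real flow
  t <- start
  while (t <= n) {
    lag_idx  <- (t - lag_duration):(t - 1)      # past window, taken from real data
    fore_idx <- t:min(t + fore_duration - 1, n) # future window to be predicted
    # Euclidean distance between the object flow and every flow in the database
    target <- matrix(obj[lag_idx], nrow(base), length(lag_idx), byrow = TRUE)
    d <- sqrt(rowSums((base[, lag_idx, drop = FALSE] - target)^2))
    nn <- order(d)[seq_len(k)]                  # indices of the k nearest flows
    # Forecast: average of the neighbours over the future window (an assumption)
    pred[fore_idx] <- colMeans(base[nn, fore_idx, drop = FALSE])
    t <- t + fore_duration                      # roll forward; next pass uses newer real data
  }
  pred
}

# Toy data: 5 flows of 48 observations each (made up for illustration)
set.seed(1)
base <- matrix(rnorm(5 * 48, mean = 100, sd = 10), nrow = 5)
obj  <- base[1, ] + rnorm(48)
pred <- rolling_knn_sketch(obj, base, start = 25, k = 2,
                           lag_duration = 12, fore_duration = 12)
```

Each pass of the loop re-selects neighbours from the most recent real observations, so the neighbours used for one window need not be those used for the next.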

How similarity between flows is defined is critical to the KNN method. This version of the function uses Euclidean distance, which has been shown to be useful for defining similarity between traffic flows.
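As a small illustration of this similarity measure, the sketch below computes the Euclidean distance between an object flow and each flow in a database over a lag window, then ranks the neighbours. The function name `nearest_flows` and the toy numbers are hypothetical and chosen only for the example; the package's own distance routine may differ in detail.

```r
# Hypothetical helper: rank database flows by Euclidean distance to the object flow.
# obj_window:  numeric vector, the object flow over the lag window
# base_window: numeric matrix, one database flow per row over the same window
nearest_flows <- function(obj_window, base_window, k) {
  # sweep() subtracts obj_window[j] from column j of base_window
  diffs <- sweep(base_window, 2, obj_window)
  d <- sqrt(rowSums(diffs^2))
  idx <- order(d)[seq_len(k)]                # the k smallest distances
  data.frame(neighbour = idx, distance = d[idx])
}

base_window <- rbind(c(10, 12, 14),
                     c(20, 22, 24),
                     c(11, 12, 13))
obj_window <- c(10, 12, 14.5)
res <- nearest_flows(obj_window, base_window, k = 2)
print(res)
```

With these numbers the first and third flows are the two nearest neighbours, at distances 0.5 and sqrt(3.25).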

The function is flexible in the following ways.

When to start prediction is arbitrary. Before the start point, the predicted flow equals the real flow value.
The number of nearest neighbours is arbitrary, but it must be lower than the number of flows in the flow database.
The time window in the past is arbitrary. It is used to determine the similarity between two flows, and a multiple of 12 is advised.
The time window in the future is arbitrary. It determines how far ahead each prediction extends, and a multiple of 12 is advised.
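
The constraints listed here could be checked before a call along the following lines. The helper `check_knn_args` is hypothetical and not part of TFTSA; it only encodes the rules stated in this section, and the requirement that `start` leave room for a full past window is an added assumption.

```r
# Hypothetical argument check (not part of the TFTSA package).
check_knn_args <- function(base, start, k, lag_duration, fore_duration) {
  if (k >= nrow(base)) {
    stop("k must be lower than the number of flows in the flow database")
  }
  if (start <= lag_duration) {
    # Assumption: a full past window must exist before the start point
    stop("start must be larger than lag_duration")
  }
  if (any(c(lag_duration, fore_duration) %% 12 != 0)) {
    # Advice only, so a warning rather than an error
    warning("lag_duration and fore_duration are advised to be multiples of 12")
  }
  invisible(TRUE)
}

toy_base <- matrix(0, nrow = 5, ncol = 48)   # made-up database of 5 flows
ok <- check_knn_args(toy_base, start = 25, k = 2,
                     lag_duration = 12, fore_duration = 12)
```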

See also: dist

ahorawzy/TFTSA documentation built on May 13, 2019, 12:18 p.m.