NYCdata: NYC Taxi Network Dataset

NYCdata R Documentation

NYC Taxi Network Dataset


A graph dataset derived from NYC taxi trip records. It contains an adjacency matrix A and a signal f that represents the total amount with added noise.




A list with 2 elements:

  • A: NYC adjacency matrix, constructed using Gaussian weights based on mean distances between locations.

  • f: Signal representing the "total amount" with added artificial noise.
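As a minimal sketch of how the two elements might be accessed, the snippet below loads the dataset and inspects its top-level structure. It assumes the gasper package is installed and that the object is available through data() under the name NYCdata; the guard skips the example otherwise.

```r
# Only run when gasper is available (assumption: the dataset ships with it).
if (requireNamespace("gasper", quietly = TRUE)) {
  data("NYCdata", package = "gasper")

  # Top-level structure: a list with elements A and f.
  str(NYCdata, max.level = 1)

  A <- NYCdata$A  # adjacency matrix
  f <- NYCdata$f  # noisy "total amount" signal
}
```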


The constructed graph represents connectivity between locations based on taxi trips. The edge weights reflect the frequency and distances of trips between locations.

The data follows the methodology of the referenced paper and is constructed from real-world data fetched from NYC taxi databases. The graph has 265 vertices, each corresponding to a distinct LocationID (both pick-up and drop-off points). Gaussian weights are defined by

w_{ij} = \exp(-\tau d_{ij}^2),

where d_{ij} is the mean distance over all trips between locations i and j, in either direction.
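The weight construction can be illustrated on a toy example. This is only a sketch under stated assumptions, not the code used to build the dataset: the trips data frame, its column names (pu, do, dist), the helper gaussian_adjacency() and the value of tau are all hypothetical. Pairs with no trips get weight 0, and trips within a single location are dropped; both choices are assumptions.

```r
# Hypothetical trip records: pick-up zone, drop-off zone, trip distance.
trips <- data.frame(
  pu   = c(1, 2, 1, 3, 3),
  do   = c(2, 1, 3, 1, 2),
  dist = c(2.0, 4.0, 1.0, 3.0, 5.0)
)

# Build a symmetric Gaussian-weighted adjacency matrix on n vertices.
gaussian_adjacency <- function(trips, n, tau) {
  # Treat trips i -> j and j -> i as the same unordered pair.
  i <- pmin(trips$pu, trips$do)
  j <- pmax(trips$pu, trips$do)
  keep <- i != j  # assumption: ignore trips within a single zone
  pairs <- data.frame(i = i[keep], j = j[keep], dist = trips$dist[keep])

  # Mean distance over all trips for each unordered pair.
  d <- aggregate(dist ~ i + j, data = pairs, FUN = mean)

  # Gaussian weight w_ij = exp(-tau * d_ij^2), filled in symmetrically.
  w <- exp(-tau * d$dist^2)
  A <- matrix(0, n, n)
  A[cbind(d$i, d$j)] <- w
  A[cbind(d$j, d$i)] <- w
  A
}

A_toy <- gaussian_adjacency(trips, n = 3, tau = 0.1)
```

Here the pair (1, 2) has trips of length 2 and 4 in opposite directions, so its mean distance is 3 and its weight is exp(-0.1 * 3^2). A larger tau makes the weights decay faster with distance.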

The signal f is constructed based on the "total amount" variable from the taxi dataset, with added artificial noise.
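A noisy signal of this kind can be sketched as follows. The noise model is not stated here, so additive Gaussian noise is an assumption, as are the per-location values in total_amount and the noise level sigma; none of these are taken from the dataset.

```r
set.seed(1)  # reproducible noise

# Hypothetical per-location "total amount" values (not from the dataset).
total_amount <- c(12.5, 30.0, 8.75, 21.4)

# Hypothetical noise level; Gaussian noise is an assumption.
sigma <- 2

# Noisy signal: clean values plus independent zero-mean Gaussian noise.
f_noisy <- total_amount + rnorm(length(total_amount), mean = 0, sd = sigma)
```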


de Loynes, B., Navarro, F., Olivier, B. (2021). Data-driven thresholding in denoising with Spectral Graph Wavelet Transform. Journal of Computational and Applied Mathematics, Vol. 389.

gasper documentation built on Oct. 27, 2023, 1:07 a.m.