ValleyBike.org data package.
For the reproducible data curation process, check out the Data Import Workflow Documentation. For more specific information on the datasets and utility functions included, see the package manual.
Install the development version from GitHub:
devtools::install_github("Amherst-Statistics/valleybikeData")
library(valleybikeData)
Due to issues on certain platforms, the valleybikeData package does not use lazy data. As such, you need to load datasets manually with utils::data() whenever you want to use them, for example:
data("september2019", envir = environment())
Please see ?data for more details and best practices.
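Because datasets must be loaded explicitly, it can be convenient to wrap the call in a small helper that loads a dataset into a throwaway environment and returns it, leaving the global workspace untouched. The following is a minimal sketch, not part of the package: `load_dataset` is a hypothetical name, and the helper assumes that each data file defines an object with the same name as the dataset.

```r
# Hypothetical helper (not provided by valleybikeData): load one dataset
# from a package into a temporary environment and return the object.
load_dataset <- function(name, package = "valleybikeData") {
  env <- new.env()
  # `list =` accepts the dataset name as a character string
  utils::data(list = name, package = package, envir = env)
  # Assumes the loaded object is named after the dataset
  get(name, envir = env)
}

# Guarded so the snippet still runs where valleybikeData is not installed
if (requireNamespace("valleybikeData", quietly = TRUE)) {
  september2019 <- load_dataset("september2019")
  str(september2019, max.level = 1)
}
```

Returning the object rather than attaching it makes it clear at the call site where each dataset comes from.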
For more information on the datasets and utility functions included in the package, please see the manual.
The valleybikeData package includes all currently-available ValleyBike trajectory data in month-by-month chunks, as well as additional aggregated data on individual users and trips. A dataset containing all permanent bikeshare stations is also included.
| Dataset Name | Description |
| --- | --- |
| trips | one-row-per-trip data, including variables like duration, start and end times, start and end stations, etc. |
| stations | data on all permanent ValleyBike stations |
| users | one-row-per-user data, including variables like total number of trips, time and date of first trip, top start and end stations, etc. |
| june2018 up to november2018, april2019 up to november2019, june2020 up to december2020, and january2021 up to may2021 | by-month trajectory data for all active months of ValleyBike, collected at 5-second intervals during each trip |
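
To check which datasets are actually shipped with an installed version of the package, the names can be queried without loading any data. This is a sketch using base R's `utils::data(package = ...)`; the helper name `list_package_datasets` is hypothetical and not part of valleybikeData.

```r
# Hypothetical helper: list the names of the datasets a package ships,
# without loading any of them.
list_package_datasets <- function(package) {
  info <- utils::data(package = package)
  # `results` is a character matrix; the "Item" column holds dataset names
  unname(info$results[, "Item"])
}

# Guarded so the snippet still runs where valleybikeData is not installed
if (requireNamespace("valleybikeData", quietly = TRUE)) {
  print(list_package_datasets("valleybikeData"))
}
```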
The valleybikeData package includes a variety of workflow functions for importing the raw data:

- import_day (import a single day’s worth of data from source)
- import_month (import a month’s worth of data from source)
- get_full_data (get all available data from source)
- aggregate_trips (aggregate a one-row-per-trip dataset)
- aggregate_users (aggregate a one-row-per-user dataset)
- download_files (download raw .csv.gz data files from source)

The get_monthly_dataset function is an important utility provided by the valleybikeData package. It can be used to access a monthly trajectory dataset through numeric representations of the corresponding month and year. This is particularly useful when data access must be automated and writing out dataset names like "july2019" becomes inconvenient:
# returns the july2019 dataset
get_monthly_dataset(month = 7, year = 2019)
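To illustrate the naming convention that makes this kind of lookup possible, the sketch below builds dataset names from numeric months and years in base R. It is only an illustration and not necessarily how get_monthly_dataset is implemented; `monthly_dataset_name` is a hypothetical helper. The guarded loop at the end assumes that get_monthly_dataset is exported and returns the dataset, as the comment in the example above suggests.

```r
# Hypothetical helper: turn a numeric month and year into a dataset name
# that follows the package's "<month><year>" convention.
monthly_dataset_name <- function(month, year) {
  stopifnot(month %in% 1:12)
  paste0(tolower(month.name[month]), year)
}

monthly_dataset_name(7, 2019)  # "july2019"

# Names for the 2019 season (April to November, per the dataset list)
vapply(4:11, monthly_dataset_name, character(1), year = 2019)

# Automated access with the package's own utility, guarded so the
# snippet still runs where valleybikeData is not installed
if (requireNamespace("valleybikeData", quietly = TRUE)) {
  season_2019 <- lapply(4:11, function(m) {
    valleybikeData::get_monthly_dataset(month = m, year = 2019)
  })
}
```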