valleybikeData

ValleyBike.org data package.

For the reproducible data curation process, check out the Data Import Workflow Documentation. For more specific information on the datasets and utility functions included, see the package manual.

Installation

Install the development version from GitHub:

devtools::install_github("Amherst-Statistics/valleybikeData")
library(valleybikeData)

Usage

Due to issues on certain platforms, the valleybikeData package does not use lazy data. You therefore need to load each dataset explicitly with utils::data() before using it, for example:

data("september2019", envir = environment())

Please see ?data for more details and best practices.
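
As a hedged illustration of the envir argument, the sketch below loads a dataset into a dedicated environment instead of the calling one, which keeps the workspace tidy when several months are loaded. The name vb_env is hypothetical, and the example assumes valleybikeData is installed; the guard only keeps the script runnable when it is not.

```r
# Load a monthly dataset into its own environment rather than the caller's.
# `vb_env` is a hypothetical name chosen for this sketch.
vb_env <- new.env()

if (requireNamespace("valleybikeData", quietly = TRUE)) {
  # `package` restricts the search to valleybikeData; `envir` sets the target.
  utils::data("september2019", package = "valleybikeData", envir = vb_env)

  # Retrieve the loaded object from the environment and inspect its structure.
  september2019 <- get("september2019", envir = vb_env)
  str(september2019, max.level = 1)
} else {
  message("valleybikeData is not installed; skipping the load.")
}
```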

For more information on the datasets and utility functions included in the package, please see the manual.

Datasets

The valleybikeData package includes all currently available ValleyBike trajectory data in month-by-month chunks, as well as aggregated data on individual users and trips. A dataset containing all permanent bikeshare stations is also included.

| Dataset Name | Description |
| --- | --- |
| trips | one-row-per-trip data, including variables like duration, start and end times, start and end stations, etc. |
| stations | data on all permanent ValleyBike stations |
| users | one-row-per-user data, including variables like total number of trips, time and date of first trip, top start and end stations, etc. |
| june2018 up to november2018, april2019 up to november2019, june2020 up to december2020, and january2021 up to may2021 | by-month trajectory data for all active months of ValleyBike, collected at 5-second intervals during each trip |
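
The monthly dataset names follow a lowercase month name plus four-digit year pattern, so the full list can be generated rather than typed out. The following is a minimal base R sketch under that assumption; the variable names (active_ranges, month_starts, dataset_names) are illustrative and not part of the package.

```r
# Active month ranges, transcribed from the table above (first day of each month).
active_ranges <- list(
  c("2018-06-01", "2018-11-01"),
  c("2019-04-01", "2019-11-01"),
  c("2020-06-01", "2020-12-01"),
  c("2021-01-01", "2021-05-01")
)

# Expand every range into a sequence of month start dates.
month_starts <- do.call(c, lapply(active_ranges, function(r) {
  seq(as.Date(r[1]), as.Date(r[2]), by = "month")
}))

# Build names such as "june2018"; month.name avoids locale-dependent formatting.
dataset_names <- paste0(
  tolower(month.name[as.integer(format(month_starts, "%m"))]),
  format(month_starts, "%Y")
)

print(dataset_names)
```

A vector like this could then be passed to utils::data() through its list argument to load several months in one call.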

Functions

Data Import Functions

The valleybikeData package includes a variety of workflow functions for importing the raw data; see the Data Import Workflow Documentation for details.

Utility Functions

The get_monthly_dataset function is a key utility provided by the valleybikeData package. It returns a monthly trajectory dataset given the numeric month and year. This is particularly useful when data access must be automated and writing out dataset names like "july2019" becomes inconvenient:

# returns the july2019 dataset
get_monthly_dataset(month = 7, year = 2019)
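
To show the kind of automation this enables, here is a hedged sketch that collects every 2019 month into a named list. It assumes valleybikeData is installed, that get_monthly_dataset is exported, and that it returns a data frame; months_2019, expected_names, and monthly are names invented for this example.

```r
# April through November 2019, the active 2019 months listed under Datasets.
months_2019 <- 4:11

# Names the list elements will carry, e.g. "april2019", ..., "november2019".
expected_names <- paste0(tolower(month.name[months_2019]), 2019)

if (requireNamespace("valleybikeData", quietly = TRUE)) {
  # Fetch each month by number instead of writing out eight dataset names.
  monthly <- lapply(months_2019, function(m) {
    valleybikeData::get_monthly_dataset(month = m, year = 2019)
  })
  names(monthly) <- expected_names

  # Row counts per month, assuming each element is a data frame.
  print(vapply(monthly, nrow, integer(1)))
} else {
  message("valleybikeData is not installed; skipping the download-free demo.")
}
```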


Amherst-Statistics/valleybikeData documentation built on Sept. 8, 2021, 5:48 a.m.