View source: R/duckdb-helpers.R
spod_duckdb_od | R Documentation |
This function creates a duckdb connection to the origin-destination data stored in CSV.gz files.
spod_duckdb_od(
con = DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:", read_only = FALSE),
zones = c("districts", "dist", "distr", "distritos", "municipalities", "muni",
"municip", "municipios", "lua", "large_urban_areas", "gau", "grandes_areas_urbanas"),
ver = NULL,
data_dir = spod_get_data_dir()
)
con
A duckdb connection object. If not specified, a new in-memory connection will be created.
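zones
The zones for which to download the data. Can be "districts", "municipalities", or "large_urban_areas", or any of the aliases listed in the usage above (for example "dist", "muni", "lua", or the Spanish names "distritos", "municipios", "grandes_areas_urbanas").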
ver
Integer. Can be 1 or 2. The version of the data to use. v1 spans 2020-2021, v2 covers 2022 and onwards.
data_dir
The directory where the data is stored. Defaults to the value returned by spod_get_data_dir().
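A minimal sketch of how the arguments above might be passed. It assumes the spanishoddata package is installed and that some v1 district-level CSV.gz files are already cached in the data directory. The wrapper name open_od_views is hypothetical, and utils::getFromNamespace() is used only because this page does not say whether spod_duckdb_od is exported.

```r
# Sketch only: open_od_views() is a hypothetical wrapper, not part of the package.
open_od_views <- function(zones = "districts", ver = 1) {
  # getFromNamespace() works whether or not the function is exported.
  spod_duckdb_od <- utils::getFromNamespace("spod_duckdb_od", "spanishoddata")

  # Same connection as the documented default: a writable in-memory database.
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:", read_only = FALSE)

  # data_dir is left at its default, spod_get_data_dir().
  spod_duckdb_od(con = con, zones = zones, ver = ver)
}

# Only attempt the call in an interactive session with the package installed.
if (interactive() && requireNamespace("spanishoddata", quietly = TRUE)) {
  con <- open_od_views(zones = "districts", ver = 1)
  print(DBI::dbListTables(con)) # list the relations registered on the connection
  DBI::dbDisconnect(con, shutdown = TRUE)
}
```

Passing an existing connection as con, rather than relying on the default, makes it possible to reuse the same database for other views.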
A duckdb connection object with 2 views:
od_csv_raw - a raw table view of all CSV files with the origin-destination data that have been previously cached in $SPANISH_OD_DATA_DIR
od_csv_clean - a cleaned-up table view of od_csv_raw with column names and values translated and mapped to English. This still includes all cached data.
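The raw-view and clean-view pattern described above can be illustrated with a small self-contained sketch. This is not the package's actual SQL: the file, the view names demo_raw and demo_clean, and all column names are assumptions chosen for illustration. Only the DBI and duckdb packages are needed.

```r
library(DBI)

# Hypothetical two-row extract; the column names are invented for this demo.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "actividad_origen,viajes",
  "casa,10.5",
  "trabajo,3"
), csv_path)
csv_path <- normalizePath(csv_path, winslash = "/")

con <- dbConnect(duckdb::duckdb(), dbdir = ":memory:")

# Raw view: the CSV file is read lazily each time the view is queried.
dbExecute(con, sprintf(
  "CREATE VIEW demo_raw AS SELECT * FROM read_csv_auto('%s')", csv_path
))

# Clean view: rename columns and translate values on top of the raw view.
dbExecute(con, "
  CREATE VIEW demo_clean AS
  SELECT
    CASE actividad_origen
      WHEN 'casa' THEN 'home'
      WHEN 'trabajo' THEN 'work'
    END AS activity_origin,
    viajes AS n_trips
  FROM demo_raw
")

demo_clean <- dbGetQuery(con, "SELECT * FROM demo_clean ORDER BY n_trips")
print(demo_clean)

dbDisconnect(con, shutdown = TRUE)
```

Because both objects are views, no data is copied: the clean view is just a stored query over the raw one.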
The structure of the cleaned-up view od_csv_clean is as follows:
Date. The full date of the trip, including year, month, and day.
factor. The identifier for the origin location of the trip, formatted as a code (e.g., '01001_AM').
factor. The identifier for the destination location of the trip, formatted as a code (e.g., '01001_AM').
factor. The type of activity at the origin location (e.g., 'home', 'work'). Note: Only available for district level data.
factor. The type of activity at the destination location (e.g., 'home', 'other'). Note: Only available for district level data.
factor. The province of residence for the group of individuals making the trip, encoded according to the INE classification. Note: Only available for district level data.
factor. The province of residence for the group of individuals making the trip (e.g., 'Cuenca', 'Girona'). Note: Only available for district level data.
integer. The time slot (the hour of the day) during which the trip started, represented as an integer (e.g., 0, 1, 2).
factor. The distance category of the trip, represented as a code (e.g., '002-005' for 2-5 km).
double. The number of trips taken within the specified time slot and distance.
double. The total length of all trips in kilometers for the specified time slot and distance.
double. The year of the trip.
double. The month of the trip.
double. The day of the trip.
The structure of the original data in od_csv_raw is as follows:
Date. The date of the trip, including year, month, and day.
character. The identifier for the origin location of the trip, formatted as a character string (e.g., '01001_AM').
character. The identifier for the destination location of the trip, formatted as a character string (e.g., '01001_AM').
character. The type of activity at the origin location (e.g., 'casa', 'trabajo').
character. The type of activity at the destination location (e.g., 'otros', 'trabajo').
character. The code representing the residence of the individual making the trip (e.g., '01') according to the official INE classification.
character. The age of the individual making the trip. This column is actually filled with 'NA' values, which is why it is removed in the cleaned-up and translated view described above.
integer. The time period during which the trip started, represented as an integer (e.g., 0, 1, 2).
character. The distance category of the trip, represented as a character string (e.g., '002-005' for 2-5 km).
double. The number of trips taken within the specified time period and distance.
double. The total length of all trips in kilometers for the specified time period and distance.
double. The day of the trip.
double. The month of the trip.
double. The year of the trip.
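The column names and types of either view can also be inspected directly on the connection. The sketch below uses DuckDB's DESCRIBE statement through DBI. The helper name describe_view and the demo view are hypothetical, and the demo runs on a throwaway in-memory database so that it does not depend on any cached data.

```r
# Hypothetical helper: return one row per column of a table or view.
describe_view <- function(con, view_name) {
  DBI::dbGetQuery(
    con,
    paste("DESCRIBE", DBI::dbQuoteIdentifier(con, view_name))
  )
}

# Demo on a throwaway view; with a connection returned by spod_duckdb_od,
# the same call could be made with "od_csv_raw" or "od_csv_clean".
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
DBI::dbExecute(con, "CREATE VIEW demo AS SELECT DATE '2020-02-14' AS d, 1.5 AS x")

schema <- describe_view(con, "demo")
print(schema[, c("column_name", "column_type")])

DBI::dbDisconnect(con, shutdown = TRUE)
```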