View source: R/duckdb-helpers.R
| spod_duckdb_od | R Documentation |
This function creates a duckdb connection to the origin-destination data stored in CSV.gz files.
spod_duckdb_od(
con = DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:", read_only = FALSE),
zones = c("districts", "dist", "distr", "distritos", "municipalities", "muni",
"municip", "municipios", "lua", "large_urban_areas", "gau", "grandes_areas_urbanas"),
ver = NULL,
data_dir = spod_get_data_dir()
)
| con | A duckdb connection object. If not specified, a new in-memory connection will be created. |
| zones | The zones for which to download the data. Can be |
| ver | Integer. Can be 1 or 2. The version of the data to use. v1 spans 2020-2021, v2 covers 2022 and onwards. See more details in codebooks with |
| data_dir | The directory where the data is stored. Defaults to the value returned by |
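A minimal sketch of a call with the arguments spelled out. It assumes the spanishoddata package is installed and that origin-destination files for the chosen zones and version are already cached in the data directory. The `:::` operator is used on the assumption that the function may not be exported. The values `"districts"` and `1` are illustrative choices only.

```r
# Assumption: v1 district-level files are already cached in the data directory.
con <- spanishoddata:::spod_duckdb_od(
  con = DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:", read_only = FALSE),
  zones = "districts",                        # one of the accepted zone names
  ver = 1,                                    # v1 spans 2020-2021
  data_dir = spanishoddata::spod_get_data_dir()
)

# When finished, release the connection:
# DBI::dbDisconnect(con, shutdown = TRUE)
```

Passing `ver` explicitly avoids relying on the `NULL` default shown in the usage.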
A duckdb connection object with 2 views:
od_csv_raw - a raw table view of all origin-destination CSV files that have previously been cached in $SPANISH_OD_DATA_DIR
od_csv_clean - a cleaned-up table view of od_csv_raw with column names and values translated and mapped to English. This still includes all cached data.
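As a hedged illustration, both views can be previewed with plain DBI queries. The sketch assumes that the package is installed, that v1 district-level data has already been cached, and that the function may need `:::` to be reached. No column names are assumed, so the queries select everything and limit the row count.

```r
library(DBI)

# Assumption: cached v1 district-level data exists in the data directory.
con <- spanishoddata:::spod_duckdb_od(zones = "districts", ver = 1)

# Preview a few rows from each view without loading the full data into memory.
raw_preview   <- dbGetQuery(con, "SELECT * FROM od_csv_raw LIMIT 5")
clean_preview <- dbGetQuery(con, "SELECT * FROM od_csv_clean LIMIT 5")

str(raw_preview)    # original column names and values
str(clean_preview)  # column names and values translated to English

dbDisconnect(con, shutdown = TRUE)
```

Because both objects are views over the cached CSV files, a `LIMIT` clause keeps the preview cheap even when many files are cached.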
The structure of the cleaned-up view od_csv_clean is as follows:
Date. The full date of the trip, including year, month, and day.
factor. The identifier for the origin location of the trip, formatted as a code (e.g., '01001_AM').
factor. The identifier for the destination location of the trip, formatted as a code (e.g., '01001_AM').
factor. The type of activity at the origin location (e.g., 'home', 'work'). Note: Only available for district level data.
factor. The type of activity at the destination location (e.g., 'home', 'other'). Note: Only available for district level data.
factor. The province of residence for the group of individuals making the trip, encoded according to the INE classification. Note: Only available for district level data.
factor. The province of residence for the group of individuals making the trip (e.g., 'Cuenca', 'Girona'). Note: Only available for district level data.
integer. The time slot (the hour of the day) during which the trip started, represented as an integer (e.g., 0, 1, 2).
factor. The distance category of the trip, represented as a code (e.g., '002-005' for 2-5 km).
double. The number of trips taken within the specified time slot and distance.
double. The total length of all trips in kilometers for the specified time slot and distance.
double. The year of the trip.
double. The month of the trip.
double. The day of the trip.
The structure of the original data in od_csv_raw is as follows:
Date. The date of the trip, including year, month, and day.
character. The identifier for the origin location of the trip, formatted as a character string (e.g., '01001_AM').
character. The identifier for the destination location of the trip, formatted as a character string (e.g., '01001_AM').
character. The type of activity at the origin location (e.g., 'casa', 'trabajo').
character. The type of activity at the destination location (e.g., 'otros', 'trabajo').
character. The code representing the residence of the individual making the trip (e.g., '01') according to the official INE classification.
character. The age of the individual making the trip. This column is actually filled with 'NA' values, which is why it is removed in the cleaned-up and translated view described above.
integer. The time period during which the trip started, represented as an integer (e.g., 0, 1, 2).
character. The distance category of the trip, represented as a character string (e.g., '002-005' for 2-5 km).
double. The number of trips taken within the specified time period and distance.
double. The total length of all trips in kilometers for the specified time period and distance.
double. The day of the trip.
double. The month of the trip.
double. The year of the trip.
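The relationship between a raw view and a cleaned-up view can be sketched with a toy DuckDB example. This is not the package's implementation: the table name `toy_raw`, the view name `toy_clean`, and all column names are hypothetical, and the translated values are kept as plain strings rather than factors. Only the value mapping ('casa' to 'home', 'trabajo' to 'work', 'otros' to 'other') follows the examples given above.

```r
library(DBI)

con <- dbConnect(duckdb::duckdb(), dbdir = ":memory:")

# Hypothetical raw table with Spanish activity labels (column names are made up).
raw <- data.frame(
  actividad_origen = c("casa", "trabajo", "otros"),
  viajes = c(10.5, 3.2, 7.0),
  stringsAsFactors = FALSE
)
dbWriteTable(con, "toy_raw", raw)

# A view that renames the columns and translates the values to English.
dbExecute(con, "
  CREATE VIEW toy_clean AS
  SELECT
    CASE actividad_origen
      WHEN 'casa'    THEN 'home'
      WHEN 'trabajo' THEN 'work'
      WHEN 'otros'   THEN 'other'
    END AS activity_origin,
    viajes AS n_trips
  FROM toy_raw
")

clean <- dbGetQuery(con, "SELECT * FROM toy_clean ORDER BY n_trips")
print(clean)

dbDisconnect(con, shutdown = TRUE)
```

Since the cleaned-up object is a view, it stores no data of its own and is evaluated against the raw table each time it is queried.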