library(tidyverse)
library(ecmwfr)
library(tidync)
This will download a single nc file. At the moment, it's hard-wired to pull down temperature at 2 meters. We'll look into flexing that.
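The download function itself isn't shown here, but a minimal sketch, assuming the CDS "reanalysis-era5-single-levels" dataset and ecmwfr::wf_request(), might look like the following. The request fields follow the CDS API; since wf_request()'s argument order has changed across ecmwfr releases, everything is named.

# A sketch only -- assumes the ERA5 single-levels dataset and that
# wf_request() downloads synchronously when transfer = TRUE.
download_one_file <- function(user, a_date, variables = c('2m_temperature')) {
  file_name <- paste0('era5_', format(a_date, '%Y%m%d'), '.nc')
  request <- list(
      dataset_short_name = 'reanalysis-era5-single-levels'
    , product_type = 'reanalysis'
    , variable = variables
    , year = format(a_date, '%Y')
    , month = format(a_date, '%m')
    , day = format(a_date, '%d')
    , time = sprintf('%02d:00', 0:23)
    , format = 'netcdf'
    , target = file_name
  )
  wf_request(
      user = user
    , request = request
    , transfer = TRUE
    , path = 'data'
  )
  # return the path so it can be piped into the extract step
  file.path('data', file_name)
}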
The file extract function will process a single nc file. Again, the variable of interest is (semi) hard-wired. The function is fragile in that it presumes consistency between the data in the file and the date associated with it.
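A sketch of what that might look like with tidync: hyper_tibble() flattens the netCDF grid into one row per longitude/latitude/time combination. The t2m column name and the unchecked date stamp are assumptions, and the latter is exactly the fragility just mentioned.

# A sketch only -- t2m is the netCDF variable name ERA5 uses for
# 2m_temperature, but nothing here verifies the file contents.
extract_one_file <- function(nc_file, a_date) {
  tidync(nc_file) %>%
    hyper_tibble() %>%
    mutate(date = a_date)  # presumes the file really holds this date
}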
This function will reduce the geographic resolution of the data. Based on conversations with other researchers, this function will be modified to process a day with observations at every hour; we would then pick the maximum and minimum across the 24 hours. We will also consider simply filtering out data points, rather than rounding the longitude and latitude.
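As a sketch, assuming the rounding approach and a t2m column, it might look like this. The trailing comment shows the alternative of dropping points outright.

# A sketch only -- round the coordinates and average the observations
# that collapse onto the same rounded cell.
trim_grid <- function(tbl_in, digits = 0) {
  tbl_in %>%
    mutate(
        longitude = round(longitude, digits)
      , latitude = round(latitude, digits)
    ) %>%
    group_by(date, longitude, latitude, time) %>%
    summarize(t2m = mean(t2m), .groups = 'drop')
  # alternative: filter(longitude %% 1 == 0, latitude %% 1 == 0)
}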
The function below will create a data frame which houses a set of dates for extraction. Storing this in a data frame is overkill. Earlier I'd had some additional columns which I decided I didn't need. This structure will allow us to introduce some later if the mood strikes us.
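A sketch of that function, generating one row per day for the requested months and years; invalid combinations like February 30 fall out via the NA filter.

# A sketch only -- a single-column tibble of dates, leaving room to add
# bookkeeping columns later.
create_tracker_table <- function(months, years) {
  crossing(year = years, month = months, day = 1:31) %>%
    mutate(date = as.Date(sprintf('%d-%02d-%02d', year, month, day))) %>%
    filter(!is.na(date)) %>%  # drops impossible days like Feb 30
    arrange(date) %>%
    select(date)
}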
First, create the tracker table. For now, this is ad hoc and fetches one or two years at a time.
tbl_tracker <- create_tracker_table(
    months = 1
  , years = 1982
)
Now, we open up the output file and filter to every date which hasn't been downloaded and extracted yet. Note that we're checking for the existence of the output file. This is pretty general and could support options like storing all the data from a single year in a particular file; simply change the name.
file_out <- file.path("data", "out.csv")

if (file.exists(file_out)) {
  tbl_written <- read_csv(file_out) %>%
    select(date) %>%
    unique()
  tbl_unwritten <- tbl_tracker %>%
    anti_join(tbl_written, by = "date")
} else {
  tbl_unwritten <- tbl_tracker %>%
    select(date)
}
We'll cheat slightly and use a for loop to churn through each day. This is fine performance-wise, as we're unlikely to be able to parallelize the remote calls. We've set a maximum file size. This isn't strictly necessary; it's just there as a circuit breaker in case something weird happens.
Because I'm putting this on GitHub, I'm storing my CDS user name in an environment variable. That sits in the .Renviron file, which lives in the user's home directory.
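For reference, the entry looks something like this (the value shown is a placeholder, not a real id). The API key itself can be stored separately with ecmwfr::wf_set_key(), which keeps it in the system keyring.

# in ~/.Renviron
CDS_USER_NAME=your-cds-user-id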
max_file_out_size <- 200e6

for (i_date in seq_len(nrow(tbl_unwritten))) {

  # circuit breaker: bail out if the output file has grown suspiciously large
  if (file.exists(file_out) && file.size(file_out) > max_file_out_size) break

  a_file <- download_one_file(
      Sys.getenv('CDS_USER_NAME')
    , tbl_unwritten$date[i_date]
    , c('2m_temperature')
    # , c('2m_temperature', 'surface_pressure', 'total_precipitation')
  )

  tbl_extract <- a_file %>%
    extract_one_file(tbl_unwritten$date[i_date])

  tbl_extract <- tbl_extract %>%
    trim_grid() %>%
    add_date_time() %>%
    summarize_day()

  # write the header only the first time through
  if (file.exists(file_out)) {
    tbl_extract %>%
      write_csv(path = file_out, append = TRUE, col_names = FALSE)
  } else {
    tbl_extract %>%
      write_csv(path = file_out, append = FALSE, col_names = TRUE)
  }
}
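add_date_time() and summarize_day() aren't defined anywhere above. Hypothetical sketches, assuming ERA5's hours-since-1900 time encoding and the daily mean/max/min summaries discussed earlier, might look like:

# Sketches only -- the time origin is the ERA5 netCDF convention, and
# summarize_day() produces the mean column plotted below.
add_date_time <- function(tbl_in) {
  tbl_in %>%
    mutate(date_time = as.POSIXct(time * 3600, origin = '1900-01-01', tz = 'UTC'))
}

summarize_day <- function(tbl_in) {
  tbl_in %>%
    group_by(date, longitude, latitude) %>%
    summarize(
        mean = mean(t2m)
      , max = max(t2m)
      , min = min(t2m)
      , .groups = 'drop'
    )
}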
This is a code snippet which plots the temperature for the earliest date in the results. Hanging on to it so that I don't need to think any more than I have to.
tbl_results <- read_csv(file_out)

tbl_results %>%
  filter(date == min(date)) %>%
  ggplot(aes(longitude, latitude)) +
  geom_point(aes(color = mean)) +
  scale_color_gradient(low = "blue", high = "red") +
  theme_minimal()