handle_gsod | R Documentation |
This function can do four things related to the Global Summary of the Day ("GSOD") database from the National Climatic Data Center (NCDC) of the National Oceanic and Atmospheric Administration (NOAA):
1. It can list stations that are close to a specified position (geographic coordinates).
2. It can retrieve weather data for a named weather station (or a vector of multiple stations).
For the name, the chillRcode from the list returned by the list_stations operation should be used.
3. It can 'clean' downloaded data (for one or multiple stations), so that they can easily be used in chillR.
4. It can delete the downloaded intermediate weather files from the machine.
Which of these functions is carried out depends on the action argument.
This function can run independently, but it is also called by the get_weather and weather2chillR functions, which some users might find a bit easier to handle.
handle_gsod(
action,
location = NULL,
time_interval = c(1950, 2020),
stations_to_choose_from = 25,
end_at_present = FALSE,
add.DATE = FALSE,
update_station_list = FALSE,
path = "climate_data",
update_all = FALSE,
clean_up = NULL,
override_confirm_delete = FALSE,
max_distance = 150,
min_overlap = 0,
verbose = "normal"
)
action |
accepts 4 types of inputs to decide on the mode of action for the function.
|
location |
either a vector of geographic coordinates (for the
|
time_interval |
numeric vector with two elements, specifying the start
and end date of the period of interest. Only required when running in
|
stations_to_choose_from |
if the location is specified by geographic coordinates, this argument determines the number of nearby stations in the list that is returned. |
end_at_present |
boolean variable indicating whether the interval of
interest should end on the present day, rather than extending until the end
of the year specified under |
add.DATE |
is a boolean parameter to be passed to |
update_station_list |
boolean, by default set to FALSE. Decides whether the weather station list is read from disk (if present) or downloaded anew when action = "list_stations". |
path |
character, by default "climate_data". Specifies the folder, relative to the working directory, to which the weather data are downloaded. |
update_all |
boolean, by default set to FALSE. If set to TRUE, the function downloads every station's data, even if they were previously downloaded and are
still present in the temporary folder, specified by the function argument |
clean_up |
character, by default set to NULL. In combination with action = "delete", this can be set to 'all' to delete all weather data, or to 'station' if only data from specific stations ('location') should be deleted. |
override_confirm_delete |
Boolean, request whether the delete function needs user confirmation to run. Defaults to |
max_distance |
numeric, by default 150. Maximum distance, in kilometers, between the specified location and a weather station for that station to be considered when searching for weather stations. |
min_overlap |
numeric, by default set to 0. Minimum share of the specified period, in percent, that a weather station's record must cover for the station to be included in the list when searching for stations. |
verbose |
character, deciding how much information is printed while downloading the weather data. By default set to "normal". If set to "detailed", the function reports how many years of data have been successfully downloaded for each station. If set to "quiet", no information is printed during download. |
The GSOD database is described here: https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C00516
In 'list_stations' mode, several formats are possible for specifying the location vector, which can consist of either two or three coordinates (it can include elevation). Possible formats include c(1, 2, 3), c(1, 2), c(x = 1, y = 2, z = 3), and c(lat = 2, long = 1, elev = 3). If the elements of the vector are not named, they are interpreted as c(Longitude, Latitude, Elevation).
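The interpretation rules for the location vector can be illustrated with a small sketch. The helper parse_location below is hypothetical and not part of chillR; it only assumes the name spellings listed above (x/y/z and long/lat/elev) and may differ from what handle_gsod does internally.

```r
# Hypothetical helper (not part of chillR): normalise a location vector
# to named long / lat / elev values, following the rules described above.
parse_location <- function(location) {
  stopifnot(is.numeric(location), length(location) %in% c(2, 3))
  nms <- names(location)
  if (is.null(nms) || any(nms == "")) {
    # Unnamed input is read as c(Longitude, Latitude, Elevation)
    nms <- c("long", "lat", "elev")[seq_along(location)]
  } else {
    # Map the x / y / z spelling onto long / lat / elev
    aliases <- c(x = "long", y = "lat", z = "elev")
    known <- nms %in% names(aliases)
    nms[known] <- aliases[nms[known]]
  }
  out <- c(long = NA_real_, lat = NA_real_, elev = NA_real_)
  # Only the spellings mentioned in the text are accepted in this sketch
  stopifnot(all(nms %in% names(out)))
  out[nms] <- unname(location)
  out
}

parse_location(c(7.0871843, 50.7341602))             # unnamed: long, lat
parse_location(c(lat = 50.7341602, long = 7.0871843)) # named: order is free
```

Naming the elements is the safer choice, because an unnamed vector with latitude first would be silently read as longitude.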
The 'chillRCode' is generated by this function when it is run with geographic coordinates as the location input. The list of nearby stations that is returned includes the chillRCode, which can then be used as input for running the function in 'download_weather' mode. To download the data, use the same call as before but replace the location argument with the chillRCode.
The output depends on the action argument. If it is 'list_stations', the function returns a list of stations_to_choose_from weather stations that are close to the specified location. This list also contains information about how far away these stations are (in km), how large the elevation difference is (if elevation is specified; in m) and how much overlap there is between the data contained in the database and the time period specified by time_interval. If action is 'download_weather', the output is a list containing the downloaded weather record, extended to the full duration of the specified time interval. If the location input was a vector of stations, the output will be a list of such objects.
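The distance and overlap figures in the station list can be pictured with the following sketch. It is an illustration under stated assumptions, not the package's code: the haversine formula, the helper names haversine_km and overlap_percent, and the three stations are all made up here, and chillR may compute these quantities differently.

```r
# Great-circle distance in km (haversine formula). Assumption for
# illustration only; chillR may calculate distances differently.
haversine_km <- function(long1, lat1, long2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlong <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Percentage of the years in time_interval covered by a station record
# (hypothetical definition, counting whole years)
overlap_percent <- function(record_start, record_end, time_interval) {
  covered <- pmin(record_end, time_interval[2]) -
    pmax(record_start, time_interval[1]) + 1
  100 * pmax(covered, 0) / (time_interval[2] - time_interval[1] + 1)
}

# Invented stations: names, coordinates, and record years are not real
stations <- data.frame(
  name = c("A", "B", "C"),
  long = c(7.15, 6.95, 9.00),
  lat = c(50.70, 50.86, 52.00),
  start = c(1990, 1998, 1950),
  end = c(2020, 2020, 2020)
)
stations$distance_km <- haversine_km(7.0871843, 50.7341602,
                                     stations$long, stations$lat)
stations$overlap <- overlap_percent(stations$start, stations$end,
                                    c(1995, 2000))

# Filter in the spirit of max_distance = 150 and min_overlap = 60
subset(stations, distance_km <= 150 & overlap >= 60)
```

A stricter min_overlap shortens the list but tends to leave stations whose records actually span the period of interest.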
If action is a weather data.frame or a weather record downloaded with this function (in 'download_weather' mode), the data structure remains the same, but the data are processed for easy use with chillR. If drop_most was set to TRUE, most columns are dropped. If the location input was a list of weather datasets, all elements of the list will be processed.
**IMPORTANT NOTE:** as of chillR version 0.73, the output format no longer contains a list element that specifies the database name, because this has been considered confusing (and annoying) by various users. This means, however, that some earlier calls to results from the handle_gsod function may produce errors now.
Also note that a few parameters (station_list, drop_most, quiet, add_station_name) are no longer needed due to some reworking of the function's mechanisms. After careful consideration, we decided to drop these parameters entirely, which may lead to some backward compatibility problems.
Apologies for any inconvenience caused by this transition. If you want to keep using the previous function (which is much slower), feel free to adopt the deprecated handle_gsod_old function, but note that this will no longer be updated and may disappear eventually.
Many databases have data quality flags, which may sometimes indicate that data aren't reliable. These are not considered by this function!
For many places, the GSOD database is quite patchy, and the length of the record indicated in the summary file isn't always very useful (e.g. there could be only two records, one for the first and one for the last date). Files are downloaded by year, so specifying a long interval may take a bit of time.
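Because records can be patchy, it may be worth checking how complete a downloaded record is before using it. The following is a minimal sketch on an invented data.frame; it assumes the record has Year, Tmin and Tmax columns, which should be verified against the actual output.

```r
# Toy daily record with assumed columns Year, Tmin, Tmax (values invented)
dates <- seq(as.Date("1999-01-01"), as.Date("2000-12-31"), by = "day")
weather <- data.frame(
  Year = as.integer(format(dates, "%Y")),
  Tmin = 5,
  Tmax = 15
)

# Simulate a patchy year: keep only the first 100 days of 2000
idx_2000 <- which(weather$Year == 2000)
weather$Tmin[idx_2000[-(1:100)]] <- NA

# Share of days per year (in percent) with both temperatures present
complete <- !is.na(weather$Tmin) & !is.na(weather$Tmax)
coverage <- tapply(complete, weather$Year, mean) * 100
round(coverage, 1)
```

Years with low coverage can then be excluded or gap-filled before further analysis.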
Adrian Fülle, Lars Caspersen, Eike Luedeling
The chillR package:
Luedeling E, Kunz A and Blanke M, 2013. Identification of chilling and heat requirements of cherry trees - a statistical approach. International Journal of Biometeorology 57, 679-689.
#coordinates of Bonn
long <- 7.0871843
lat <- 50.7341602
# get a list of close-by weather stations
# stationlist <-
#   handle_gsod(action = "list_stations",
#               time_interval = c(1995, 2000),
#               location = c(long, lat))
# download data
# test_data <-
#   handle_gsod(action = "download_weather",
#               time_interval = c(1995, 2000),
#               location = stationlist$chillR_code[c(1, 2)])
#
# format downloaded data
# test_data_clean <- handle_gsod(action = test_data)
## data deletion on disk for clean_up
# functions will ask for confirmation in the console - 'y' for yes to
# confirm deletion, anything else cancels the deletion
# handle_gsod(action = "delete",
#             clean_up = "all",
#             override_confirm_delete = TRUE)