Load the package other packages used in this vignette.
suppressPackageStartupMessages(library("dplyr")) library("geysertimes")
The gt_get_data
function downloads the compressed eruptions data
from https://geysertimes.org/archive/
,
reads the data compressed data into R
and saves version of the R object
in the location specified
in the dest_folder
argument to the function.
The default location for dest_folder
is
file.path(tempdir(), "geysertimes"))
.
This default location is used to meet the CRAN requirement of
not writing files by default to any location other than under tempdir()
.
default_path <- gt_get_data() #> Set dest_folder to geysertimes::gt_path() so that data persists between R sessions.
default_path
#> [1] "/tmp/RtmpUtc3wI/geysertimes/2021-05-06"
Users are encouraged to set dest_folder
to the value given by
gt_path()
which is a permanent location appropriate for the
user on the particular platform.
gt_path() #> [1] "~/.local/share/geysertimes"
If a permanent location is used, the user only needs to get the
data once.
Using the suggested value for dest_folder
:
suggested_path <- gt_get_data(dest_folder=gt_path())
suggested_path
#> [1] "~/.local/share/geysertimes/2021-05-06"
The gt_load_eruptions
is used to load the eruptions data in the
current session.
The gt_load_geysers
loads the geyser location data in the current session.
eruptions <- gt_load_eruptions() geysers <- gt_load_geysers()
A quick look at the data:
dim(eruptions) #> [1] 1292133 25 names(eruptions) #> [1] "eruption_id" "geyser" "time" #> [4] "has_seconds" "exact" "near_start" #> [7] "in_eruption" "electronic" "approximate" #> [10] "webcam" "initial" "major" #> [13] "minor" "questionable" "duration" #> [16] "duration_seconds" "duration_resolution" "duration_modifier" #> [19] "entrant" "observer" "comment" #> [22] "time_updated" "time_entered" "primary_id" #> [25] "other_comments"
The data that is downloaded is versioned. The version id is the date when the data was downloaded.
The gt_version()
lists the latest version of the data that
has been downloaded.
Setting all=TRUE
will list all versions of the data that have been
downloaded.
gt_version() #> [1] "2021-05-06"
gt_version(all=TRUE) #> [1] "2021-05-06"
Geysers with the most recorded eruptions:
print(n=20, eruptions %>% group_by(geyser) %>% summarise(N=n()) %>% arrange(desc(N))) #> # A tibble: 450 x 2 #> geyser N #> <chr> <int> #> 1 Old Faithful 184096 #> 2 Plume 143792 #> 3 Daisy 99807 #> 4 Little Cub 96889 #> 5 Lion 63418 #> 6 Grand 41475 #> 7 Aurum 35881 #> 8 Oblong 34080 #> 9 Fountain 27463 #> 10 Spouter 26981 #> 11 Riverside 26678 #> 12 Castle 25787 #> 13 Logbridge 23416 #> 14 Plate 21594 #> 15 Echinus 21330 #> 16 Depression 20668 #> 17 Turban 18885 #> 18 Great Fountain 18880 #> 19 Grotto 17533 #> 20 Jet 16368 #> # ... with 430 more rows
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.