View source: R/get.R

bb_get    R Documentation

Convenience function to define and synchronize a bowerbird data collection

Description

This is a convenience function that provides a shorthand for synchronizing a small number of data sources. The call bb_get(...) is roughly equivalent to bb_sync(bb_add(bb_config(...), ...), ...), where the dots are argument placeholders and should not be taken literally.
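The equivalence above can be sketched as explicit steps. This is a minimal illustration, not the package's internal implementation: it assumes that local_file_root is passed to bb_config and verbose to bb_sync, consistent with the argument descriptions on this page. The sync call is wrapped in interactive() only so that the sketch does not download anything when run in a script.

```r
library(bowerbird)

## one of the example sources shipped with the package
my_source <- bb_example_sources("Australian Election 2016 House of Representatives data")

## step 1: define the configuration (where the data will live)
cf <- bb_config(local_file_root = tempdir())

## step 2: add the data source(s) to the configuration
cf <- bb_add(cf, my_source)

## step 3: synchronize; this is the step that contacts the remote server
if (interactive()) {
  status <- bb_sync(cf, verbose = TRUE)
}
```

The longer form is likely to be preferable when the same configuration is reused across sessions or holds many sources; bb_get suits one-off retrievals.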

Usage

bb_get(
  data_sources,
  local_file_root,
  clobber = 1,
  http_proxy = NULL,
  ftp_proxy = NULL,
  create_root = FALSE,
  verbose = FALSE,
  confirm_downloads_larger_than = 0.1,
  dry_run = FALSE,
  ...
)

Arguments

data_sources

tibble: one or more data sources to download, as returned by e.g. bb_example_sources

local_file_root

string: location of data repository on local file system

clobber

numeric: 0=do not overwrite existing files, 1=overwrite if the remote file is newer than the local copy, 2=always overwrite existing files

http_proxy

string: URL of HTTP proxy to use e.g. 'http://your.proxy:8080' (NULL for no proxy)

ftp_proxy

string: URL of FTP proxy to use e.g. 'http://your.proxy:21' (NULL for no proxy)

create_root

logical: should the data root directory be created if it does not exist? If this is FALSE (default) and the data root directory does not exist, an error will be generated

verbose

logical: if TRUE, provide additional progress output

confirm_downloads_larger_than

numeric or NULL: if non-negative, bb_sync will ask the user for confirmation to download any data source of size greater than this number (in GB). A value of zero will trigger confirmation on every data source. A negative or NULL value will not prompt for confirmation. Note that this only applies when R is being used interactively. The expected download size is taken from the collection_size parameter of the data source, and so its accuracy is dependent on the accuracy of the data source definition

dry_run

logical: if TRUE, bb_sync will do a dry run of the synchronization process without actually downloading files

...

additional parameters passed through to bb_config or bb_sync

Details

Note that the local_file_root directory must exist or create_root=TRUE must be passed.
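As a hedged sketch of the requirement above: the call below points bb_get at a directory that may not exist yet and passes create_root = TRUE so that a missing root does not raise an error. The directory name bb_example_root is hypothetical, and dry_run = TRUE is added only to keep the illustration from downloading files. The call is wrapped in interactive() because it may still contact the remote server.

```r
library(bowerbird)

## hypothetical data root; it need not exist beforehand
root <- file.path(tempdir(), "bb_example_root")

my_source <- bb_example_sources("Australian Election 2016 House of Representatives data")

if (interactive()) {
  ## create_root = TRUE avoids the error raised for a missing root directory;
  ## dry_run = TRUE runs the synchronization without actually downloading files
  status <- bb_get(my_source, local_file_root = root,
                   create_root = TRUE, dry_run = TRUE, verbose = TRUE)
}
```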

Value

a tibble, as for bb_sync

See Also

bb_config, bb_example_sources, bb_source, bb_sync

Examples

## Not run: 
  my_source <- bb_example_sources("Australian Election 2016 House of Representatives data")
  status <- bb_get(local_file_root = tempdir(), data_sources = my_source, verbose = TRUE)

  ## the files that have been downloaded:
  status$files[[1]]

  ## Define a new source: Geelong bicycle paths from data.gov.au
  my_source <- bb_source(
    name = "Bike Paths - Greater Geelong",
    id = "http://data.gov.au/dataset/7af9cf59-a4ea-47b2-8652-5e5eeed19611",
    doc_url = "https://data.gov.au/dataset/geelong-bike-paths",
    citation = "See https://data.gov.au/dataset/geelong-bike-paths",
    source_url = "https://data.gov.au/dataset/7af9cf59-a4ea-47b2-8652-5e5eeed19611",
    license = "CC-BY",
    method = list("bb_handler_rget", accept_download = "\\.zip$", level = 1),
    postprocess = list("bb_unzip"))

  ## get the data
  status <- bb_get(data_sources = my_source, local_file_root = tempdir(), verbose = TRUE)

  ## find the .shp file amongst the files, and plot it
  shpfile <- status$files[[1]]$file[grepl("\\.shp$", status$files[[1]]$file)]
  library(sf)
  bx <- read_sf(shpfile)
  plot(bx)

## End(Not run)

AustralianAntarcticDivision/bowerbird documentation built on March 8, 2024, 8:33 a.m.