tar_download | R Documentation |
Create a target that downloads files from one or more URLs and automatically reruns when the remote data changes (according to the ETags or last-modified time stamps).
tar_download(
  name,
  urls,
  paths,
  method = NULL,
  quiet = TRUE,
  mode = "w",
  cacheOK = TRUE,
  extra = NULL,
  headers = NULL,
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Symbol, name of the target.
A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship.
In most cases, the target name is the name of its local data file in storage. Some file systems are not case sensitive, which means converting a name to a different case may overwrite a different target. Please ensure all target names remain unique when converted to lower case. In addition, a target's
name determines its random number generator seed. In this way,
each target runs with a reproducible seed so someone else
running the same pipeline should get the same results,
and no two targets in the same pipeline share the same seed.
(Even dynamic branches have different names and thus different seeds.)
You can recover the seed of a completed target
with |
urls |
Character vector of URLs to track and download. Must be known and declared before the pipeline runs. |
paths |
Character vector of local file paths where each of the URLs is downloaded. Must be known and declared before the pipeline runs. |
method |
Method to be used for downloading files. Current
download methods are The method can also be set through the option
|
quiet |
If |
mode |
character. The mode with which to write the file. Useful
values are |
cacheOK |
logical. Is a server-side cached value acceptable? |
extra |
character vector of additional command-line arguments for
the |
headers |
named character vector of additional HTTP headers to
use in HTTP[S] requests. It is ignored for non-HTTP[S] URLs. The
|
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_download() creates a pair of targets, one upstream and one downstream. The upstream target uses format = "url" (see targets::tar_target()) to track files at one or more URLs, and it automatically invalidates the target if the ETags or last-modified time stamps change. The downstream target depends on the upstream one, downloads the files, and tracks them using format = "file".
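The pair of targets described above can be approximated by hand. The following is a minimal sketch, not the actual tarchetypes implementation: the upstream target name x_url, the URL, and the local path are assumptions chosen for illustration, and the real function also forwards the remaining download arguments.

```r
library(targets)

# Hypothetical hand-written approximation of
# tar_download(x, urls = ..., paths = ...). Names are illustrative.
pipeline <- list(
  # Upstream: format = "url" tracks the remote file and invalidates
  # the target when its ETag or last-modified time stamp changes.
  tar_target(
    x_url,
    "https://httpbin.org/etag/test",
    format = "url"
  ),
  # Downstream: depends on x_url, downloads the file, and returns the
  # local path so that format = "file" can track the downloaded file.
  tar_target(
    x,
    {
      utils::download.file(x_url, "downloaded_file_1", quiet = TRUE)
      "downloaded_file_1"
    },
    format = "file"
  )
)
```

Defining the targets does not download anything. The download happens only when the pipeline runs, for example through targets::tar_make() with this list at the end of the target script.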
A list of two target objects, one upstream and one downstream. The upstream one watches a URL for changes, and the downstream one downloads it. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other targets with custom invalidation rules: tar_change(), tar_force(), tar_skip()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      list(
        tarchetypes::tar_download(
          x,
          urls = c("https://httpbin.org/etag/test", "https://r-project.org"),
          paths = c("downloaded_file_1", "downloaded_file_2")
        )
      )
    })
    targets::tar_make()
    targets::tar_read(x)
  })
}
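To see the kind of information the upstream target reacts to, it can help to look at the ETag and Last-Modified response headers of a URL. The sketch below is only an illustration in base R: extract_validators() is a hypothetical helper, the header lines are made up, and the mechanism targets uses internally to read these headers may differ.

```r
# Hypothetical helper: pull the ETag and Last-Modified values out of raw
# HTTP header lines such as those returned by base::curlGetHeaders().
extract_validators <- function(header_lines) {
  pick <- function(field) {
    pattern <- paste0("^", field, ":\\s*")
    hit <- grep(pattern, header_lines, ignore.case = TRUE, value = TRUE)
    if (length(hit) == 0L) {
      return(NA_character_) # Header not present in the response.
    }
    # Drop the field name, then strip surrounding whitespace and CR/LF.
    trimws(sub(pattern, "", hit[[1L]], ignore.case = TRUE))
  }
  c(etag = pick("etag"), last_modified = pick("last-modified"))
}

# Offline example with made-up header lines (not real server output).
example_headers <- c(
  "HTTP/1.1 200 OK\r\n",
  "ETag: \"abc123\"\r\n",
  "Last-Modified: Tue, 01 Jan 2030 00:00:00 GMT\r\n"
)
validators <- extract_validators(example_headers)
```

With network access, the same helper could be applied to the output of curlGetHeaders() for a real URL. If neither header is present, both entries are NA.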