You need an API key to access the content endpoint. You can apply for a key at http://developer.zeit.de/quickstart/. The function below appends the key automatically to your R environment file. Hence, every time you start R the key is loaded. get_content
and get_content_all
access your key automatically by executing Sys.getenv("ZEIT_ONLINE_KEY")
. Replace api_key
with the key you receive from ZEIT and path
with the location of your R environment.
# save the api key in the .Renviron file set_api_key(api_key = "xxx", path = "~/.Renviron")
Important: This function requires an API key. See Authentication.
You can query the whole ZEIT and ZEIT ONLINE archive using the content endpoint. Define your query using the query
argument. The API supports query syntax such as +
and AND/OR
. You find more information at the ZEIT API documentation. As stated above, get_content
and get_content_all
access your key automatically by executing Sys.getenv("ZEIT_ONLINE_KEY")
, but you can provide a key by your own using the api_key
argument.
Why are there two functions?
rzeit2 provides two functions to query the content endpoint: get_content
and get_content_all
. get_content
supports only 1000 rows per call because of the API specifications. You can use get_content_all
to derive all articles matching your query. Internally, get_content_all
executes get_content
but fills all the arguments automatically. Please set a timeout to be polite.
# fetch articles up to 1000 rows tatort_articles <- get_content(query = "Tatort", begin_date = "20180101", end_date = "20180131") # fetch ALL articles tatort_articles <- get_content(query = "Tatort", timeout = 2)
get_article_text
allows you to download the article text for a given url. The function is vectorized. Hence, you can define multiple urls. The function automatically scrapes all pages if the article has multiple pages. Please set a timeout to be polite.
tatort_content <- get_article_text(url = tatort_articles$content$href, timeout = 1)
get_article_comments
allows you to download the article comments for a given url. The function is not vectorized. This function may take quite long due to the ZEIT ONLINE website structure. Since ZEIT does not provide an API for comments this function downloads all comments as your browser would do. Please set a timeout to be polite.
tatort_comments <- get_article_comments(url = tatort_articles$content$href[1], timeout = 1)
get_article_images
allows you to download the article images for a given url. The function is vectorized. Hence, you can define multiple urls. The function automatically scrapes all pages if the article has multiple pages. You can directly download the images if you define a path in download
. Please ensure that the folder you define exists. The file name is derived as md5 hash of the image url. Hence, it should be un Please set a timeout to be polite.
Please set a timeout to be polite.
tatort_images <- get_article_images(url = tatort_articles$content$href, timeout = 1, download = "~/Documents/tatort-img/")
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.