datadrop_id: Get Google Drive ID for latest or archive DoH Data Drop...

View source: R/get_id.R

Description

The DoH Data Drop is distributed through Google Drive. Each new release is published in a new Google Drive folder, while older releases are archived in a single persistent Google Drive folder.

Usage

datadrop_id_latest(verbose = TRUE)

datadrop_id_archive(verbose = TRUE, .date = NULL)

datadrop_id(verbose = TRUE, version = c("latest", "archive"), .date = NULL)

datadrop_id_file(tbl, fn)

Arguments

verbose

Logical. Should messages on operation progress be shown? Default is TRUE.

.date

A character value for the date in YYYY-MM-DD format. This is the date of the archived DoH Data Drop for which an ID is to be returned. It should be specified when using datadrop_id_archive(). For datadrop_id(), it is used only when version is set to archive and is ignored otherwise.

version

A character value specifying whether to get the latest available DoH Data Drop (latest) or the DoH Data Drop archive (archive). Defaults to latest.

tbl

A tibble produced by datadrop_ls() that lists the files within a particular DoH Data Drop Google Drive folder.

fn

A character string of one or more words used to match the name of a file listed in tbl, that is, a file within a particular DoH Data Drop Google Drive folder.
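The signature version = c("latest", "archive") follows the usual R idiom in which the first element of the vector acts as the default. The sketch below illustrates that idiom with base R's match.arg(); the function pick_version() is hypothetical and is not part of this package, and whether datadrop_id() uses match.arg() internally is an assumption.

```r
# Hypothetical function illustrating the match.arg() idiom;
# it is not part of covidphdata.
pick_version <- function(version = c("latest", "archive")) {
  # With no argument supplied, match.arg() returns the first choice.
  # With an argument supplied, it checks it against the allowed choices.
  match.arg(version)
}

default_choice <- pick_version()
archive_choice <- pick_version("archive")

print(default_choice)
print(archive_choice)
```

Under this idiom, a value outside the two choices raises an error instead of being silently accepted.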

Details

The Philippines Department of Health (DoH) currently distributes the latest Data Drop via a fixed shortened URL (bit.ly/DataDropPH), which points to a new Google Drive endpoint each day or whenever the updated Data Drop becomes available. This endpoint is a README document in Portable Document Format (PDF) that contains a privacy and confidentiality statement, technical notes on the latest data, technical notes on previous (archive) data, and two shortened URLs: one linking to the Google Drive folder that contains the latest officially released datasets, and the other linking to the previously released datasets (archives). The first of these shortened URLs is different for every release and can only be obtained from the README document for that release.

The function datadrop_id_latest() reads the README PDF file, extracts the shortened URL for the latest officially released datasets from that file, expands the shortened URL, and then extracts the unique Google Drive ID of the folder holding those datasets. Other functions can then use this ID to retrieve information and data from the Google Drive folder it identifies.

The DoH Data Drop archives, on the other hand, are distributed via a fixed shortened URL (bit.ly/DataDropArchives), which points to a Google Drive folder containing the previous DoH Data Drop releases.

The function datadrop_id_archive() expands that shortened URL and then extracts the unique Google Drive ID of the DoH Data Drop archives folder. Other functions can then use this ID to retrieve information and data from the Google Drive folder it identifies.
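The final step described above, extracting the ID from an expanded URL, can be sketched in base R. This is a minimal illustration and not the package's implementation: the helper extract_drive_id() is hypothetical, the URL layout (an ID following /folders/) is an assumption about how an expanded Google Drive folder link looks, and the ID itself is made up. Expanding the shortened URL requires network access and is not shown.

```r
# Hypothetical helper, not part of covidphdata: pull the ID out of a
# single, already-expanded Google Drive folder URL.
extract_drive_id <- function(url) {
  # Assumption: the ID follows "/folders/" and consists of letters,
  # digits, underscores and hyphens.
  pattern <- "(?<=/folders/)[A-Za-z0-9_-]+"
  m <- regmatches(url, regexpr(pattern, url, perl = TRUE))
  if (length(m) == 0) NA_character_ else m
}

# Made-up 33-character ID, used purely for illustration
fake_id <- paste0("1", strrep("a", 32))
example_url <- paste0(
  "https://drive.google.com/drive/folders/", fake_id, "?usp=sharing"
)

id <- extract_drive_id(example_url)
print(nchar(id))

# A URL without a folder segment yields NA
no_id <- extract_drive_id("https://example.com/")
```

Returning NA for a URL that does not match keeps the failure visible to the caller without raising an error.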

Value

For datadrop_id_latest(), datadrop_id_archive() and datadrop_id(): a 33-character string for the Google Drive ID of the latest DoH Data Drop or the archive DoH Data Drop.

For datadrop_id_file(): a 33-character string for the Google Drive ID of the specified DoH Data Drop file. If fn matches more than one file, a vector of 33-character strings for the Google Drive IDs of the matching DoH Data Drop files.
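The difference between a single match and several matches can be illustrated with a stand-in table. This is only a sketch under stated assumptions: the data frame below imitates the output of datadrop_ls() with made-up file names and placeholder IDs, the name and id columns are assumed, and match_file_id() is a hypothetical helper that uses plain substring matching, which may differ from the matching rule datadrop_id_file() actually applies.

```r
# Stand-in for the tibble returned by datadrop_ls();
# file names and IDs are made up for illustration.
tbl <- data.frame(
  name = c(
    "Case Information batch 0.csv",
    "Case Information batch 1.csv",
    "Testing Aggregates.csv"
  ),
  id = c("id-aaa", "id-bbb", "id-ccc"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: IDs of all files whose name contains fn
match_file_id <- function(tbl, fn) {
  tbl$id[grepl(fn, tbl$name, fixed = TRUE)]
}

one  <- match_file_id(tbl, "Testing Aggregates")  # one file matches
many <- match_file_id(tbl, "Case Information")    # two files match

print(one)
print(many)
```

A more specific fn narrows the result to a single ID, which is usually what downstream functions expect.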

Author(s)

Ernest Guevarra

Examples

## Not run: 
  library(googledrive)

  ## Deauthorise
  googledrive::drive_deauth()

  ## Two ways to get the Google Drive ID of the latest DoH Data Drop
  datadrop_id_latest()
  datadrop_id()

  ## Two ways to get the Google Drive ID of the archive DoH Data Drop for
  ## 1 November 2020
  datadrop_id_archive(.date = "2020-11-01")
  datadrop_id(version = "archive", .date = "2020-11-01")

## End(Not run)

## Not run: 
  library(googledrive)

  ## Authentication
  googledrive::drive_auth_configure(api_key = Sys.getenv("GOOGLEDRIVE_TOKEN"))

  ## Deauthorise
  googledrive::drive_deauth()

  ## Typical workflow
  tbl <- datadrop_ls(id = datadrop_id())
  datadrop_id_file(tbl = tbl, fn = "Case Information")

  ## Piped workflow using magrittr %>%
  library(magrittr)

  ## Get the id for the latest Case Information file
  datadrop_id() %>%
    datadrop_ls() %>%
    datadrop_id_file(fn = "Case Information")

## End(Not run)

como-ph/covidphdata documentation built on Dec. 31, 2020, 10:06 p.m.