knitr::opts_chunk$set(
  warning = FALSE,
  message = FALSE,
  comment = FALSE
)
library(ggplot2)
library(tidyverse)
library(fs)
library(ussc)

Confluence REST API

ussc provides five functions that will be useful if/when you need to download data from Confluence:

To use these functions, you need to connect to the Confluence REST API. Grab an API key from https://confluence.atlassian.com/cloud/api-tokens-938839638.html. You can add the API key into your .renviron file by running usethis::edit_r_environ(). The confluence keys look like this in my .renviron file:

CONFLUENCE_URL = https://usscsydney.atlassian.net/wiki

CONFLUENCE_USERNAME = firstname.lastname@sydney.edu.au (add your own email)

CONFLUENCE_PASSWORD = KFkuYkNdkqSKEWfdrjODxxxx (please change your API key -- note that this API key does not work as I changed the last four digits)

Save the .renviron file and restart R so the changes take effect.

Scrape HTML tables from Confluence

ussc_confluence_table() returns HTML tables that have been embedded in Confluence as tidy tibbles. There are three parameters - the ID of the Confluence page, your username and your password (API key). The username and password default to CONFLUENCE_USERNAME and CONFLUENCE_PASSWORD in your .renviron file. You can add the values manually if you wish.

It is a fairly simple function to run. Say I want to grab the Key Performance Indicators from the Comms team. The page ID is 950239240.

comms_kpi <-  ussc_confluence_table("950239240")

This grabs all of the tables on the page. You can extract each table by running:

online_kpi_table1 <- comms_kpi[1][[1]]

head(online_kpi_table1)

Publications KPI table

If you want to grab the publication metrics data, run ussc_confluence_kpi_table(). This will download the data and clean it. Let's take a look at the 2020 Q1 metrics.

pub_kpi <- ussc_confluence_kpi_table("1173357736")

head(pub_kpi)

Version history

To see the version history of a table, run the function ussc_confluence_version_history(). You will be able to see when edits to the page were made and by whom.

pubs_calendar_version_history <- ussc_confluence_version_history("1020526593")

head(pubs_calendar_version_history)

Download Excel files from Confluence

Another function, ussc_confluence_excel(), downloads all Excel files on a given page. Note: this function might take a while to run if you need to download many files. I have only tested this function on Firefox. It also assumes that you have set your settings so that the files download automatically.

one_file <- ussc_confluence_excel(id = "950239621")

The Excel file has 4 sheets. We can extract the sheets below.

publications <- one_file[["Publications"]]
media <- one_file[["Media"]]
events <- one_file[["Events"]]
meetings_etc <- one_file[["Meetings and other"]]


head(publications)


USStudiesCentre/ussc documentation built on Sept. 2, 2020, 2:51 p.m.