options(scipen=999)
library(Rbduk)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/Vignette-",
  out.width = "100%"
)

General functions

readme()

All R projects produced should have their files stored within a Google Cloud Repository to allow for source control, easier collaboration, and to maintain best practices. When comitting these project to a repository, you should include a README file that details key information about the project for anyone who wants to find out more or has to contribute to the project.

This function creates a template README.Rmd file within the current working directory detailing what sohuld be included within this. This function will initially open a popup window, which the doccument can be editted within, but it is recommended you simply click save and then edit the file directly.

Once the README file has been created, it should be 'Knitted' by either clicking the 'Knit' button within Rstudio, or using rmarkdown::knit(). Once this has been done, another file, README.md (a markdown output) will be generated. Ensure to commit both the README.Rmd and the README.md files to your repository.

If you are unfamiliar with GIT, or need help setting up GIT on your machine, view a collection of git resources here.

readme()

is_integer64()

Within GCP, large numbers (eg UPRNs) are stored are 64-bit integers. R, on the whole, does not like 64-bit integer, and whilst it can cope, it's easier to read in these columns are numeric, or convert them to numeric within R. To see if an object is currently a 64-bit integer, this function can be used.

is_integer64(100)

is_integer64("x")

pretty_postcode()

This function takes a postcode in any format and any case, and converts it into pretty format XX(X)(X)(Y)XXX, where X is the postcode, and Y is the specified seperator, which is a space by default. The case can also be specified, and is upper case by default.

pretty_postcode("SW1a2nP")

pretty_postcode("SW1a2nP", sep="")

pretty_postcode("SW1a2nP", sep=".")

pretty_postcode("SW1a2nP", sep=".", uppercase=FALSE)

is_postcode()

This function takes a string and returns TRUE or FALSE depedent on whether the string is in a valid UK postcode format or not. This may contain one space and still be valid. This does not indicate whether a postcode is an existing postcode, but that is has the format of one.

The following demonstrates valid postcodes and invalid postcodes:

is_postcode("SW1a2nP")

is_postcode("SW1a 2nP")
#Too short
is_postcode("S 2NP")

#Contains punctuation
is_postcode("Sw1a.2np")

#Too long
is_postcode("Sw1a2npX")

#Contains too many spaces
is_postcode("Sw1a  2np")

#Not valid number/letter combiantion
is_postcode("000000")

#Not valid number/letter combiantion
is_postcode("XXXXXX")

is_uprn()

This function takes a numeric or character vector and returns TRUE or FALSE dependent on whether it is in a valid UPRN format (all numeric, between 1 and 12 characters). It will also flag as a message any UPRNs that end 0000, as a common conversion error caused by scientific notation and reading/writing from excel can cause genuine UPRNs to end in a number of zeros, meaning they are no longer genuine. One or two of these messages is acceptable, but many indicates that this error has occurred and that the UPRNs should be checked thoroughly. The function can be set to allow this with allow_scientific=TRUE (by default), but this can be disabled, and all UPRNs ending in 0000 will return false.

The following demonstrates valid and invalid UPRNs:

is_uprn(1)

is_uprn(999999999999)

is_uprn("1")

is_uprn("999999999999")
#Too long
is_uprn(9999999999999)

#Too long
is_uprn("9999999999999")

#Empty
is_uprn("")

#Letters
is_uprn("ABC")

#Contains trailing characters
is_uprn("1,")

#Not an integer
is_uprn(111.999)

#Not an integer
is_uprn("111.999")

#Contains punctuation
is_uprn("111-999")

#Contains punctuation
is_uprn("111 999")

#Flag scientific numbering errors, but return them as true.
is_uprn("100000")

#Flag scientific numbering errors, but return them as false.
is_uprn("100000", allow_scientific=FALSE)

%notin% and %!in%

These functions serve the same purpose, and that is to give a more readable version of the opposite of %in%. They output the opposite logical response that %in% would provide.

2 %!in% c(1,3)

2 %notin% c(1,3)

"b" %!in% c("a","c")

"b" %notin% c("a","c")
2 %!in% c(1,2,3)

2 %notin% c(1,2,3)

"b" %!in% c("a","b","c")

"b" %notin% c("a","b","c")
all.equal(2 %!in% c(1,3), 2 %notin% c(1,3), !2 %in% c(1,3))

all.equal("b" %!in% c("a","c"), "b" %notin% c("a","c"), !"b" %in% c("a","c"))

GCP functions

bduk_bq()

THis function provides a shorthand for writing queries directly to bigquery within R. The input to this function must be a valid query, and the name of the project you are quering from within GCP. To run such a query a .json billing key must be stored on your machine. By default, this function will assume the key is the project name with "_bigquery.json" at the end. If this is not the case, the key name will need to be specified. By default, this function will assume the key is stored in the current working directory or project. If this is not the case, the key must be specified. This function also sets 'bigint' to integer64, allowing R to load in integer64 numbers (eg. UPRNs), but it is recommended that you use 'CAST(VARIABLE as NUMERIC) as VARIABLE' when running your queries, or converting these to numeric once they have been read in.

sql="SELECT pcds,Rurality FROM `dcms-datalake-staging.GEO_ONS.ONS_RURALITY` LIMIT 1 "

setwd("/home/dcms/keys")

bduk_bq(
      sql=sql,
      project="dcms-datalake-staging"
    )

bduk_bq(
      sql=sql,
      project="dcms-datalake-staging",
      key="dcms-datalake-staging_bigquery.json"
    )

bduk_bq(
      sql=sql,
      project="dcms-datalake-staging",
      keypath="/home/dcms/keys"
    )
sql="SELECT pcds,Rurality FROM `dcms-datalake-staging.GEO_ONS.ONS_RURALITY` LIMIT 1 "

#No key in this directory
setwd("/home/dcms/")

bduk_bq(
      sql=sql,
      project="dcms-datalake-staging"
    )

Spatial functions

geojson_to_sf()

Data in BigQuery is frequently stored in geojson format. This function provides a cimple shorthand to input a data frame or tibble that contains a geojson column (eg. a table read from BigQuery) as easily convert it to a simple feature. The inputs to this function are the data frame or tibble, and the column name of the geojson column as a string.

``` {r geojson_to_sf} bq_table<-bduk_bq( sql="SELECT * FROM dcms-datalake-staging.GEO_ONS.shp_LA LIMIT 1" , project="dcms-datalake-staging", keypath="/home/dcms/keys")

geojson_to_sf( data=bq_table, geojson_colname = "geom" )

### make_sf()

This function provides an easy shorthand to take any dataframe with coordinate columns, and convert it into a simple feature object. The inputs to this function are the data frame or tibble, the x and y column names, both as strings, and the coordinate system number. Coordinate reference system codes can be found at [https://spatialreference.org/](https://spatialreference.org/). This mostly serves as a wrapped to sf::st_as_sf(), but the defaults for this are set as latitude and longitude. Columns containing the string "lon" or "lat" in any case being used as the x_colname and y_colname respectively, and the coordinate reference system set to 4326. If multiple columns contain these strings, the function will return the data frame unchanged and produce an error message prompting you to specify the column names manually.

``` {r make_sf}
make_sf(
       data=data.frame(
          "Longitude"=c(1,2,3),
          "Latitude"=c(51,52,53)
          ),
       x_colname="Longitude",
       y_colname="Latitude",
       crs=4326
    )

make_sf(
       data=data.frame(
          "Longitude"=c(1,2,3),
          "Latitude"=c(51,52,53)
          )
    )

make_sf(
      data=data.frame(
        "Longitude_1"=c(1,2,3),
        "Longitude_2"=c(11,12,14),
        "Latitude"=c(51,52,53)
      )
    )

Shiny functions

clearGroups()

A function to clear all groups from a leaflet object from within a shiny application. The prevents having to run clearGroup("groupname") for every group. The input to this is the name of the map, and then the groups wanting to be cleared as comma seperated strings.

library(shiny)
library(leaflet)
library(dplyr)

points_1<-data.frame("lng"=c(-2,-1,0),"lat"=c(51,52,53))
points_2<-data.frame("lng"=c(-2.1,-1.1,-0.1),"lat"=c(51,52,53))

ui<-fluidPage(
  fluidRow(
    actionButton("addgroups", "Add Groups"),
    actionButton("cleargroups", "Clear Groups"),
    leafletOutput("mymap")
  )
)

server<-function(input,output,session){

  output$mymap <- renderLeaflet({
    leaflet(options = leafletOptions(preferCanvas = TRUE)) %>%
      addTiles(options = tileOptions(opacity = 0.8), group = "Open Street Map") %>%
      setView(lng = -1, lat = 52, zoom = 7)
  })

  observeEvent(input$addgroups,{
    leafletProxy("mymap")%>%
      addCircles(data=points_1,lng=~lng,lat=~lat,color="Red",group="Group 1") %>%
      addCircles(data=points_2,lng=~lng,lat=~lat,color="Blue",group="Group 2")
  })

  observeEvent(input$cleargroups,{
    clearGroups(map="mymap","Group 1","Group 2")
  })
}

shinyApp(ui, server)

radioTooltip()

By default, shiny and shinyBS together allow us to add popup tooltips to shiny buttons and input fields. However, this tooltip will apply to all parts of the input fiels. For radio buttons and groups of checkboxes, this can be problematic. This function allows us to add a tooltip to specific parts of a radio button. docker run --rm -p 3838:3838 rocker/shiny

library(Rbduk)
library(shiny)
library(leaflet)
library(dplyr)

ui<-fluidPage(
  fluidRow(
    radioButtons(
      "radio",
      "Radio Button",
      choices=c("Tooltip shows on mouseover here",
                "Tooltip shows on mouseover here as well",
                "Tooltip does not show on mouseover here")
                ),
    radioTooltip(
      id="radio",
      title="Tooltip message appears like this",
      choice=c(
        "Tooltip shows on mouseover here",
        "Tooltip shows on mouseover here as well"
      ),
      placement="right",
      trigger="hover"
    )
  )
)

server<-function(input,output,session){}

shinyApp(ui, server)


SamA-DS/Rbduk documentation built on May 1, 2021, 5:29 a.m.