knitr::opts_chunk$set(echo = FALSE)

This document gives an outline of concepts relevant to the development of the community package. Each section presents questions or exercises to consider.

Data Background

Data Alignment

  1. What is the value of value in 2015?
data.frame(value = c(1, 2, 3), year = c(2012, 2013, 2014))

Answer: NA; the year 2015 does not exist here, but we can still virtually access it.

  1. What is the mean absolute error between the values of d1 and d2?
(datasets <- list(
  d1 = data.frame(value = c(1, 2, 3), year = c(2012, 2013, 2014)),
  d2 = data.frame(value = c(1, 2, 3), year = c(2013, 2014, 2015))
))

Answer: MAE = r mean(abs(datasets$d1$value[2:3] - datasets$d2$value[1:2])); to calculate this, we need to temporally align vectors (which could be done virtually):

(data.frame(
  year = c(2012, 2013, 2014, 2015),
  d1 = c(1, 2, 3, NA),
  d2 = c(NA, 1, 2, 3)
))

ID Mapping

  1. If ID 123 is selected, which rows should be included?
data.frame(ID = c(1234, 1243, 1235, 1245))

Answer: rows 1 (1234) and 3 (1235); with no additional information, we can look for initial substring matches to map one set of IDs to another.

  1. Calculate all possible aggregates of the lower dataset values to the higher dataset.
(datasets <- list(
  higher = data.frame(
    ID = c(123, 124, 123, 124),
    time = c(1, 1, 2, 2),
    v1 = NA,
    v2 = NA
  ),
  lower = data.frame(
    ID = c(1234, 1243, 1235, 1245, 1234, 1243, 1235, 1245),
    time = c(1, 1, 1, 1, 2, 2, 2, 2),
    v1 = c(10, 5, 20, 13, 7, 15, 18, 23),
    v2 = c(1, NA, 3, NA, 5, NA, NA, NA)
  )
))

Aggregated higher:

datasets$higher[, 3:4] <- do.call(rbind, lapply(
  split(datasets$lower[, -2], datasets$lower$time), function(d) {
    do.call(rbind, lapply(
      split(d[, -1], substring(d$ID, 1, 3)), colMeans,
      na.rm = TRUE
    ))
  }
))
datasets$higher[is.na(datasets$higher)] <- NA
datasets$higher

We only ever calculate mean aggregates -- this is just a way to fill out numbers, as appropriate aggregates should always be included, rather than calculated here.

R Packaging Background

Documentation

Edit the description of the dir argument of the site_build function.

  1. Where would you make that change?

  2. How would you get that change to appear in the local R help (?site_build)?

  3. How would you get that change to appear on the documentation site (site_build.html)?

Debugging

This is an exercise about stepping into a running function:

Without editing any source code, calculate the row means of cols after it has been updated in

util_make_palette(rep(c("red", "green", "blue"), each = 10), polynomial = TRUE)

Expected result:

c(red = 102, green = 51, blue = 102)

Testing

Write a test to cover the first uncovered line of the datacommons_refresh function.

  1. How would you update the coverage report?

JavaScript Background

Build Tools

  1. The npm start command is a shortcut. What is the underlying command?

  2. What does npm run find-deadcode do, and when might you want to run it?

Editing Source

Testing

Take a look at the coverage of the data handler > checks > value_checks object, and note that some functions are not covered (such as !=).

Write a test to cover one of those functions.

  1. How would you update the coverage report?

  2. How would you trigger that function in a browser console?



uva-bi-sdad/community documentation built on Oct. 12, 2023, 1:18 p.m.