Responses to review 2 of stats19

Thanks for the detailed review. We think all the suggestions make sense, and they link to previous discussions about combining the three-stage (dl, read, format) process into a single function call, which we were thinking of calling get_stats19(). The reason for splitting the process up is to ensure maximum transparency and to give the user control over what the package is doing. However, as long as it is properly documented, we think the benefits of a get_stats19() function will outweigh any possible negatives we can think of, so we plan to go ahead and create this function.

We would very much welcome a pull request that addresses some of the other issues you mention.

Responses to other issues/questions/comments are provided below. We suspect that some of the comments refer to an older version of the package, which is completely understandable (the package has evolved since the initial submission!) and explains why some of the responses are short / questions.

Thanks for the comments, we have indeed tried to keep dependencies to a minimum but consider readr and tibble worthwhile. readxl and curl have been demoted to Suggests, as detailed in another comment.

Thanks. If you think of other ways we can communicate the value of the data, do let us know (I think the second mapping figure could be improved...).

We have long been planning to add a get_stats19() function (see https://github.com/ITSLeeds/stats19/issues/11). The review comment, combined with further discussion, has prompted us to re-prioritise it. It has been beneficial to polish each of the component functions first, however, and it is good to document each stage for maximum transparency, so we plan to keep the dl, read and format functions exported.
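To illustrate the idea, here is a minimal sketch of how a single call could chain the three stages while leaving each stage usable on its own. The function name get_data_sketch(), its argument names and the stub stages are all hypothetical, chosen so the example runs without network access; they are not the package's actual signatures.

```r
# Hypothetical sketch: chain the download, read and format stages.
# The stage functions are passed in, so each stays independently usable.
get_data_sketch <- function(year, type, dl_fn, read_fn, format_fn = NULL) {
  dl_fn(year = year, type = type)            # stage 1: fetch the files
  raw <- read_fn(year = year, type = type)   # stage 2: read into a data frame
  if (is.null(format_fn)) {
    return(raw)                              # formatting is optional
  }
  format_fn(raw)                             # stage 3: label the columns
}

# Stand-in stages (assumptions for illustration only)
dl_stub <- function(year, type) invisible(TRUE)
read_stub <- function(year, type) {
  data.frame(year = year, type = type, severity = 1L,
             stringsAsFactors = FALSE)
}
format_stub <- function(x) {
  x$severity <- factor(x$severity, levels = 1:3,
                       labels = c("Fatal", "Serious", "Slight"))
  x
}

result <- get_data_sketch(2017, "accidents", dl_stub, read_stub, format_stub)
```

Passing format_fn = NULL returns the unformatted data, which mirrors the transparency argument: a user can still stop after any stage.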

Agreed.


The guidance is to 'Only use package startup messages when necessary'. A case can be made that this is necessary. As with osmdata, the package provides access to data that has a license that requires it to be cited. The osmdata load message is as follows:

library(osmdata)

We fully agree with the reasoning behind removing package startup messages, however. As a compromise, we've shortened the startup message from 4 lines to 2:

# before:
# Data provided under the conditions of the Open Government License.
# If you use data from this package, mention the source
# (UK Department for Transport), cite the package, and link to:
# www.nationalarchives.gov.uk/doc/open-government-licence/version/3/.
# after:
library(stats19)
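For context, startup messages of this kind are usually emitted from an .onAttach() hook using packageStartupMessage(). The sketch below shows the general pattern only; the message wording is a placeholder and is not the text the package actually prints.

```r
# Sketch of a two-line startup message. The wording is a placeholder
# (an assumption), not the package's real message.
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(
    "Data provided under OGL v3.0. Cite the source and link to:\n",
    "www.nationalarchives.gov.uk/doc/open-government-licence/version/3/"
  )
}
```

Because the text goes through packageStartupMessage() rather than message(), users who prefer silence can wrap the call as suppressPackageStartupMessages(library(stats19)).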

Running goodpractice::gp() found the following lines with > 80 characters:

    R/format.R:62:1
    R/format.R:67:1
    R/read.R:141:1
    R/utils.R:167:1

All these have been fixed.
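A quick way to re-check line widths without running the full set of checks is a few lines of base R. This is an illustrative helper (long_lines() is a hypothetical name), not the check that goodpractice itself runs.

```r
# Return the line numbers in a file that are wider than `limit` characters
long_lines <- function(path, limit = 80) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines) > limit)
}
```

It can be applied to every source file with, for example, lapply(list.files("R", full.names = TRUE), long_lines).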

curl is used in the tests and readxl is used in the examples. These have been demoted to Suggests. tibble has been removed from the DESCRIPTION file.
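Code that relies on a package listed in Suggests is normally guarded with requireNamespace(), so that it degrades gracefully when the package is not installed. A minimal sketch of that pattern follows; the helper name with_suggested() is hypothetical.

```r
# Run `code` only when the suggested package `pkg` is installed;
# otherwise return the fallback value `otherwise`.
with_suggested <- function(pkg, code, otherwise = NULL) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    code()
  } else {
    otherwise
  }
}
```

For instance, an example that reads a spreadsheet could be wrapped as with_suggested("readxl", function() readxl::read_excel(path)), where path is whatever file the example uses.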

dl_stats19(year = 1979, type = "deaths")
No files of that type found for that year.
This will download 240 MB+ (1.8 GB unzipped).
Files identified: Stats19-Data1979-2004.zip

Download now (y = enter, n = esc)? 

Warning message:
In find_file_name(years = year, type = type) :
  Coordinates unreliable in this data.

Tests

DESCRIPTION File

README File(s)

Vignette

should be

This is a good point. Fixed by adding a much more useful dataset, representing the jurisdictions of police forces across England and Wales.

Meta



ITSLeeds/stats19 documentation built on May 4, 2019, 7:35 a.m.