knitr::opts_chunk$set(
  collapse = TRUE,
  eval = FALSE,
  comment = "#>"
)

This vignette details some key design choices in this package with the hope that stating these explicitly will improve package navigability.

Comparison to Alternatives

This package is designed with a focus on project management and communication. This emphasis distinguishes it from some of the other excellent R packages that wrap the GitHub API. In short, this package focuses on the subset of GitHub API functionality most critical to project management and provides additional tooling to support communication of planning and results.

In particular:

Repo as First Class Citizen

Repositories are "first class" citizens in projmgr. The first step to accessing or sending information is to create a repository reference using the create_repo_ref() function. The resulting object is the first element passed into all get_ and post_ functions.

For users who work with databases in R through the DBI package, this code flow is analogous to querying a database: users first create a database connection object with dbConnect() and then pass it into subsequent functions such as dbGetQuery().
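For readers less familiar with DBI, the pattern looks roughly like the sketch below. It assumes the DBI and RSQLite packages are installed; the in-memory database and the mock issues table are made up for illustration and have nothing to do with projmgr's output.

```r
library(DBI)

# Create the connection object first (here, a throwaway in-memory SQLite database).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Mock table for illustration only; these columns are assumptions.
dbWriteTable(con, "issues", data.frame(
  number = 1:3,
  state = c("open", "closed", "open")
))

# The connection object is the first argument to each subsequent call.
open_issues <- dbGetQuery(
  con,
  "SELECT number FROM issues WHERE state = 'open' ORDER BY number"
)

dbDisconnect(con)
```

In projmgr, the repository reference from create_repo_ref() plays the role of con, and the get_ and post_ functions play the role of dbGetQuery().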

This decision was based on the assumption that the most common use case for projmgr would be interacting with a single repository at a time. Admittedly, some users may prefer a view further up in the hierarchy, e.g. an organization object versus a repository object. However, because the package provides lower-level building blocks, broader functionality can be achieved by mapping over a set of repositories. A code example is provided in the Event & Team Management vignette.

Function Naming

Functions generally conform to the <verb>_<details> convention. For functions interacting with the GitHub API, the <verb> component is the HTTP method invoked (e.g. GET, POST, DELETE).

Verbs like "post" might seem less intuitive than a synonym like "create" or "submit" for users who have not worked with APIs previously. However, this convention describes the function's action most precisely and ideally also serves to raise awareness of HTTP methods.

Function Parameters

Functions that interact with GitHub's API defer to the naming conventions of that API. This ideally empowers users for future, direct work with the API and allows for easier maintenance.

More specifically, parameters required by the GitHub API are required by the corresponding functions in this package. Any additional parameters not required by the GitHub API can be passed in through the ... argument. The help_{function name}() and browse_docs() functions can be used to find out more about the names and descriptions of these optional parameters.

Two noteworthy exceptions are get_issues() and get_milestones(). In the GitHub API, there are separate endpoints for getting a single item (issue/milestone) or multiple items. However, it seemed unnecessary to create separate functions for the single and plural versions. Instead, if either function is provided an argument for number, the single-item endpoint is used. Any other query parameters are then irrelevant and ignored. If no argument is provided for the number parameter, the multiple-item endpoint is used with allowed parameters given by help_{function name}().
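The endpoint choice can be sketched in a few lines. The issues_endpoint() helper below is hypothetical and is not projmgr's actual implementation; it only illustrates how the presence of number could select between the two GitHub REST API paths for issues.

```r
# Hypothetical helper for illustration only; not part of projmgr.
# Returns the GitHub REST API path that a request for issues would target.
issues_endpoint <- function(owner, repo, number = NULL) {
  base_path <- paste0("/repos/", owner, "/", repo, "/issues")
  if (is.null(number)) {
    # No number supplied: multiple-item endpoint, which accepts query parameters.
    base_path
  } else {
    # Number supplied: single-item endpoint; other query parameters do not apply.
    paste0(base_path, "/", number)
  }
}

issues_endpoint("username", "my_repo")
#> [1] "/repos/username/my_repo/issues"
issues_endpoint("username", "my_repo", number = 7)
#> [1] "/repos/username/my_repo/issues/7"
```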

Get-Parse Codeflow

All get_ functions make a call to the GitHub API and return the result as an R list. The corresponding parse_ function converts that list into a dataframe for easier wrangling and analysis. In almost all cases, users will likely call parse_ immediately after get_ and never work with the output of get_ directly. For example:

my_repo <- create_repo_ref('username', 'my_repo')
issues <- get_issues(my_repo, state = 'all') %>% parse_issues()
issue_events <- get_issue_events(my_repo, number = 7) %>% parse_issue_events()
milestones <- get_milestones(my_repo) %>% parse_milestones()

However, the get_ and parse_ functions are provided separately to empower users. Some use cases where users may prefer to not use the parse_ functions include if they:

In rare cases, get_ functions will return additional information not provided by the API to preserve data lineage. For example, get_issue_events() and get_issue_comments() include the issue number (provided as a required function argument) in the output so users know to what issue they refer.
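To make the lineage idea concrete, here is a minimal sketch on mock data. The records, their fields, and the add_issue_number() helper are all hypothetical and do not reflect projmgr's internals; the sketch only shows how a requested issue number could be attached to each returned record.

```r
# Mock of a raw response: a list with one element per event (fields are illustrative).
raw_events <- list(
  list(id = 101L, event = "labeled"),
  list(id = 102L, event = "closed")
)

# Hypothetical helper: tag every record with the issue number that was requested.
add_issue_number <- function(events, number) {
  lapply(events, function(event) c(event, list(number = number)))
}

tagged_events <- add_issue_number(raw_events, number = 7)
tagged_events[[1]]$number
#> [1] 7
```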

Parse Output Variable Names

The dataframes returned by parse_ functions attempt to maintain the same field names as used by the GitHub API, similar to the conventions described in the Function Parameters section. However, there are a few key exceptions:

There are some disadvantages to this approach. For example, if one wishes to join issues and milestone data by milestone number, in the issues data it will be called milestone_number and in the milestone data it will simply be called number. One alternative would be to use {object}_{field} conventions (e.g. issue_name) uniformly across all datasets. However, this was not selected since it makes variable names long and bulky.
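In practice the mismatch is manageable with an explicit key mapping. The sketch below uses base R's merge() on small mock dataframes. Only the milestone_number and number key names come from the text above; the other columns and values are made up for illustration.

```r
# Mock stand-ins for parsed issues and milestones (values are illustrative).
issues <- data.frame(
  number = 1:3,
  title = c("Fix bug", "Add docs", "Refactor"),
  milestone_number = c(1L, 1L, NA)
)
milestones <- data.frame(
  number = 1:2,
  title = c("v1.0", "v2.0"),
  state = c("open", "open")
)

# Map the differing key names explicitly and disambiguate the shared title column.
# all.x = TRUE keeps issues that have no milestone.
issues_with_milestones <- merge(
  issues, milestones,
  by.x = "milestone_number", by.y = "number",
  all.x = TRUE,
  suffixes = c("_issue", "_milestone")
)
```

The result keeps the issue's own number column, names the join key milestone_number, and splits title into title_issue and title_milestone.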



emilyriederer/projmgr documentation built on Jan. 26, 2024, 3:09 a.m.