This vignette details some key design choices in this package with the hope that stating these explicitly will improve package navigability.
This package is designed with a focus on project management and communication. This emphasis distinguishes it from some of the other excellent R packages that wrap the GitHub API. In short, this package focuses on the subset of GitHub API functionality most critical to project management and provides additional tooling to support communication of planning and results.
In particular:
- `gh` describes itself as a 'Minimalistic GitHub API client in R'. It is very robust and flexible (and powers this package!) but demands slightly more from users (e.g. understanding the GitHub API's endpoints). In contrast, `projmgr` supports only a subset of the GitHub API and prioritizes making key project management tasks as friendly as possible.
- `ghapiv3` also wraps `gh` for a user-friendly, higher-level interface to the GitHub API. However, like `gh`, it provides broader support for the API but lacks the workflow-specific functionality.
- `ghclass` shares the goal of streamlining a specific GitHub workflow. However, classroom management is the specific workflow it is built to improve. That said, `ghclass` has some complementary functions, such as programmatically setting up groups of repositories.

Repositories are "first class" citizens in `projmgr`. The first step to accessing or sending information is to create a repository reference using the `create_repo_ref()` function. The resulting object is the first argument passed into all `get_` and `post_` functions.
For users who work with databases in R through the `DBI` package, this code flow is analogous to querying a database: users first create a database connection object with `dbConnect()`, which is then passed into subsequent functions such as `dbGetQuery()`.
This decision was based on the assumption that the most common use case for `projmgr` would be interacting with a single repository at a time. Admittedly, some users may prefer a view further up in the hierarchy, e.g. an organization object rather than a repository object. Because `projmgr` provides lower-level building blocks, such broader functionality can still be achieved by mapping over a set of repositories. A code example is provided in the Event & Team Management vignette.
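As a rough illustration of mapping over a set of repositories, the sketch below builds one reference per repository and stacks the parsed results. The helper name `get_issues_across_repos()`, its `make_ref` and `fetch` arguments, and the stand-in data are assumptions made for this example and are not part of `projmgr`. The defaults call `create_repo_ref()`, `get_issues()`, and `parse_issues()` as described in this vignette, and the sketch assumes every repository yields a data frame with the same columns.

```r
# Hypothetical helper (not part of projmgr): apply one fetch step to several
# repositories and stack the results. The defaults use projmgr, but both steps
# can be swapped for stand-ins, which keeps the sketch runnable offline.
get_issues_across_repos <- function(
    owner,
    repo_names,
    make_ref = function(owner, repo) projmgr::create_repo_ref(owner, repo),
    fetch = function(ref) projmgr::parse_issues(projmgr::get_issues(ref, state = "open"))) {
  per_repo <- lapply(repo_names, function(repo) {
    ref <- make_ref(owner, repo)                 # one reference per repository
    issues <- fetch(ref)                         # data frame of issues
    issues$repo_name <- rep(repo, nrow(issues))  # record where each row came from
    issues
  })
  do.call(rbind, per_repo)
}

# Stand-ins with made-up data (not real API output)
fake_ref <- function(owner, repo) list(owner = owner, repo = repo)
fake_fetch <- function(ref) {
  data.frame(
    number = 1:2,
    title = paste(ref$repo, c("first issue", "second issue")),
    stringsAsFactors = FALSE
  )
}

combined <- get_issues_across_repos(
  "my_org", c("repo_a", "repo_b"),
  make_ref = fake_ref, fetch = fake_fetch
)
print(combined)
```

With `projmgr` installed and credentials configured, dropping the `make_ref` and `fetch` arguments would fall back to the real calls.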
Functions generally conform to the `<verb>_<details>` convention. For functions interacting with the GitHub API, the `<verb>` component is the HTTP method invoked (e.g. GET, POST, DELETE).
Verbs like "post" might seem less intuitive than a synonym like "create" or "submit" for users who have not worked with APIs previously. However, this convention describes the function's action most precisely and ideally also serves to raise awareness of HTTP methods.
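To make the convention concrete, the short sketch below splits a few function names into their verb and details. The `get_` names come from this vignette; `post_issue` is assumed here as a representative `post_` function.

```r
# Names following the <verb>_<details> convention
# ("post_issue" is an assumed example of a post_ function)
fn_names <- c("get_issues", "get_milestones", "get_issue_events", "post_issue")

# Everything before the first underscore is the verb; the rest is the details
naming <- data.frame(
  fn = fn_names,
  http_method = toupper(sub("_.*$", "", fn_names)),
  details = sub("^[^_]*_", "", fn_names),
  stringsAsFactors = FALSE
)
print(naming)
```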
Functions that interact with GitHub's API defer to the naming conventions of that API. This ideally empowers users for future, direct work with the API and allows for easier maintenance.
More specifically, parameters required by the GitHub API are required by the corresponding functions in this package. Any additional parameters not required by the GitHub API can be passed in through `...`. The `help_{function name}()` and `browse_docs()` functions can be used to find out more about the names and descriptions of these optional parameters.
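The split between required and optional parameters can be sketched in plain R. This is only an illustration of the pattern, not `projmgr`'s implementation: the helper `build_issue_body()` and its list of allowed fields are assumptions, loosely based on the GitHub API's body for creating an issue, where `title` is required.

```r
# Hypothetical helper (not part of projmgr): `title` is a named, required
# argument, while optional fields travel through `...`
build_issue_body <- function(title, ...) {
  # Optional fields accepted by this sketch
  allowed <- c("body", "labels", "assignees", "milestone")
  optional <- list(...)

  # Reject anything outside the allowed set
  unknown <- setdiff(names(optional), allowed)
  if (length(unknown) > 0) {
    stop("Unknown parameter(s): ", paste(unknown, collapse = ", "))
  }

  c(list(title = title), optional)
}

request <- build_issue_body("Fix typo in README", labels = "documentation", milestone = 1)
str(request)
```

Keeping optional fields in `...` means the function signature only has to name what the API itself requires.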
Two noteworthy exceptions are `get_issues()` and `get_milestones()`. In the GitHub API, there are separate endpoints for getting a single item (issue/milestone) or multiple items. However, it seemed unnecessary to create separate functions for the singular and plural versions. Instead, if either function is given a `number` argument, the single-item endpoint is used, and any other query parameters are irrelevant and ignored. If no `number` argument is provided, the multiple-item endpoint is used with the allowed parameters given by `help_{function name}()`.
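The switch between the single-item and multiple-item endpoints might look roughly like the sketch below. The helper `choose_issues_endpoint()` is hypothetical and only mirrors the behaviour described above; it is not how `projmgr` is implemented. The two paths follow the GitHub REST API's issue endpoints.

```r
# Hypothetical helper (not projmgr's implementation): pick the endpoint based
# on whether `number` is supplied
choose_issues_endpoint <- function(owner, repo, number = NULL, ...) {
  if (!is.null(number)) {
    # Single-item endpoint: any other query parameters are dropped
    list(
      path = sprintf("/repos/%s/%s/issues/%s", owner, repo, number),
      query = list()
    )
  } else {
    # Multiple-item endpoint: optional parameters become the query
    list(
      path = sprintf("/repos/%s/%s/issues", owner, repo),
      query = list(...)
    )
  }
}

single <- choose_issues_endpoint("username", "my_repo", number = 7, state = "all")
many <- choose_issues_endpoint("username", "my_repo", state = "all")
str(single)
str(many)
```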
All `get_` functions make a call to the GitHub API and return the result as an R list. The corresponding `parse_` function converts each list into a dataframe for easier wrangling and analysis. In almost all cases, users will likely call `parse_` immediately after `get_` and never work with the output of `get_` directly. For example:
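```r
library(projmgr)
library(magrittr)  # provides the %>% pipe

# Create the repository reference that every get_ function takes first
my_repo <- create_repo_ref('username', 'my_repo')

# Fetch from the API, then immediately parse each list into a dataframe
issues <- get_issues(my_repo, state = 'all') %>% parse_issues()
issue_events <- get_issue_events(my_repo, number = 7) %>% parse_issue_events()
milestones <- get_milestones(my_repo) %>% parse_milestones()
```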
However, the `get_` and `parse_` functions are provided separately to empower users. Some use cases where users may prefer not to use the `parse_` functions include if they:

- [...] `parse`d output
- [...] `parse_` functions have not been updated

In rare cases, `get_` functions return additional information not provided by the API to preserve data lineage. For example, `get_issue_events()` and `get_issue_comments()` include the issue number (provided as a required function argument) in the output so users know which issue each record refers to.
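The two-step flow, including the data-lineage tag, can be mimicked offline with base R. Everything in this sketch is an assumption for illustration: the records are made up (loosely shaped like issue events), and `tag_with_issue()` and `to_data_frame()` are hypothetical stand-ins for a `get_` step and a `parse_` step, not `projmgr` functions.

```r
# Made-up raw records: a list with one element per event
raw_events <- list(
  list(id = 101, event = "labeled", created_at = "2019-01-05T10:00:00Z"),
  list(id = 102, event = "closed",  created_at = "2019-01-06T09:30:00Z")
)

# get_-style step (hypothetical): attach the issue number to every record so
# the link to the originating issue is not lost
tag_with_issue <- function(records, number) {
  lapply(records, function(rec) c(rec, list(number = number)))
}

# parse_-style step (hypothetical): one row per record
to_data_frame <- function(records) {
  rows <- lapply(records, function(rec) as.data.frame(rec, stringsAsFactors = FALSE))
  do.call(rbind, rows)
}

events_list <- tag_with_issue(raw_events, number = 7)
events_df <- to_data_frame(events_list)
print(events_df)
```

Keeping the list and the data frame as separate steps leaves the raw list available for anyone who needs fields the flattening step does not carry over.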
The dataframe returned by `parse_` functions attempts to maintain the same field names as used by the GitHub API, similar to the conventions described in [Function Parameters]. However, there are a few key exceptions:

- [...] `parsed_` field is instead called "n_comments"

There are some disadvantages to this approach. For example, if one wishes to join issue and milestone data by milestone number, the key is called `milestone_number` in the issues data but simply `number` in the milestone data. One alternative would be to use `{object}_{field}` conventions (e.g. `issue_name`) uniformly across all datasets. However, this was not selected since it makes variable names long and bulky.
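In practice, the differing key names only require spelling out both sides of the join. The sketch below uses base R's `merge()` on two small, made-up data frames; only the `milestone_number` and `number` column names come from the text above, and the remaining columns and values are assumptions.

```r
# Made-up parsed issues: the milestone key is called `milestone_number`
issues <- data.frame(
  number = c(1, 2, 3),
  title = c("Set up CI", "Write docs", "Fix bug"),
  milestone_number = c(1, 1, NA),
  stringsAsFactors = FALSE
)

# Made-up parsed milestones: the same key is simply called `number`
milestones <- data.frame(
  number = c(1, 2),
  title = c("v0.1", "v0.2"),
  stringsAsFactors = FALSE
)

# Name the key on each side explicitly; keep issues that have no milestone,
# and use suffixes to tell the two `title` columns apart
joined <- merge(
  issues, milestones,
  by.x = "milestone_number", by.y = "number",
  all.x = TRUE,
  suffixes = c("_issue", "_milestone")
)
print(joined)
```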