knitr::opts_chunk$set(collapse = TRUE, eval = FALSE, comment = "#>")
dplyr_issues_list <- readRDS(system.file("extdata", "dplyr-issues-list.rds", package = "projmgr", mustWork = TRUE))
This vignette illustrates basic projmgr
codeflows.
library(projmgr)
The first step to interacting with a GitHub repository is creating a repository reference. Repository references are "first class citizens" in projmgr
and contain all authorization credentials. These references are passed as the first argument in all get_
functions and post_
functions.
Suppose, for example, we are interested in pulling data about issues in the dplyr
repository. We would start off by creating a repository reference with the create_repo_ref()
function.
dplyr <- create_repo_ref('tidyverse', 'dplyr')
Note that this function has many additional parameters that you can specify as needed. For example:
- Set is_enterprise = TRUE and pass your company's internal GitHub URL in through hostname
- is_enterprise = TRUE), the name you used can be passed through hostname
- identifier lets you provide your credentials in different ways if you haven't set up a PAT or it is called something else (see the GitHub PAT vignette or the create_repo_ref() documentation for more details)

The check_ family of functions provides some basic validations that everything is working.
check_internet()
tests for problems connecting to the internet.
check_internet()
#> [1] TRUE
check_credentials() confirms the login associated with the PAT and describes the level of access that PAT has to the specified repo.
check_credentials(dplyr)
#> -- With provided credentials --
#> + Login: emilyriederer
#> + Type: User
#> -- In the dplyr repo --
#> + Admin: FALSE
#> + Push: FALSE
#> + Pull: FALSE
check_rate_limit() checks how many more API requests you can send and when that count will reset. Note that this accepts a repo reference as a parameter only to get authentication information for an account. Request limits are set at the account level, not the repository level.
check_rate_limit(dplyr)
#> 4998 / 5000 (Resets at 22:26:20)
get_
functions retrieve information from the GitHub API. The first argument is the repository reference and additional named query parameters can be passed subsequently.
For example, here, we request issues of the first milestone with either an open or closed state. (Note that, in keeping with the GitHub API, only open issues are returned when state is not specified. If you're trying to build a productivity report, i.e. about everything that's been completed, it's very important to specify state as "closed" or "all".)
dplyr_issues_list <- get_issues(dplyr, milestone = 1, state = 'all')
If you don't know what parameters are available, the help_<function_name>
family provides more information on valid arguments to include in get_
and post_
functions.
help_get_issues()
Or take a guess. The get_
functions will check that all of your named query parameters are accepted by the API and throw an error for any that are unrecognized.
get_issues(dplyr, not_a_real_parameter = 'abc')
#> Error: The following user-inputted variables are not relevant to this API request:
#> + not_a_real_parameter
#> Allowed variables are:
#> + milestone,state,assignee,creator,mentioned,labels,sort,direction,since
#> Please remove unallowed fields and try again.
#> Use the browse_docs() function or visit https://developer.github.com/v3/ for full API documentation.
As the get_issues()
error message says, detailed documentation can also be viewed using the browse_docs()
function. This function launches your browser to the appropriate part of the GitHub API documentation. For example, one might run:
browse_docs(action = 'get', object = 'issue')
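The parameter check described earlier can be pictured with a small, self-contained sketch. This is only an illustration of the idea (compare the supplied names against an allow-list and stop on anything unknown); validate_query_params() is a hypothetical helper, not a projmgr function, and projmgr's actual implementation may differ. The allow-list is copied from the error message shown earlier.

```r
# Hypothetical helper, not part of projmgr: rejects query parameters
# that are not in an allow-list before any request would be sent.
validate_query_params <- function(params, allowed) {
  unknown <- setdiff(names(params), allowed)
  if (length(unknown) > 0) {
    stop(
      "Unrecognized query parameters: ", paste(unknown, collapse = ", "),
      "\nAllowed parameters: ", paste(allowed, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(params)
}

# Allow-list copied from the get_issues() error message
issue_params <- c("milestone", "state", "assignee", "creator", "mentioned",
                  "labels", "sort", "direction", "since")

# Recognized parameters pass through unchanged
ok <- validate_query_params(list(milestone = 1, state = "all"), issue_params)

# An unrecognized name raises an error; tryCatch() captures its message
msg <- tryCatch(
  validate_query_params(list(not_a_real_parameter = "abc"), issue_params),
  error = function(e) conditionMessage(e)
)
```

Failing before the request is sent means a typo in a parameter name surfaces immediately rather than silently returning unfiltered results.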
Results are returned as a list that closely mirrors the JSON output from the actual API.
str(dplyr_issues_list[[1]], max.level = 1)
You will likely prefer to work with them as dataframes instead. The parse_ family of functions converts "raw" list output from get_ functions into a dataframe.
dplyr_issues <- parse_issues(dplyr_issues_list)
head(dplyr_issues)
Columns that can contain multiple values (e.g. multiple label names or assignees) appear as lists.
head(dplyr_issues[, c("labels_name", "assignees_login")])
dplyr_issues$assignees_login
Dataframes can be "expanded" to the issue-assignee or issue-label granularity using the tidyr::unnest() function. For example, the following code expands the dataframe to one row per issue-assignee. After running that command, you can see the first two rows refer to the same issue but different assignees.
dplyr_issues %>%
  tidyr::unnest(assignees_login) %>%
  dplyr::select(number, title, assignees_login) %>%
  head()
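To see what the expansion does without calling the API, here is a minimal sketch on a toy dataframe. The issue numbers and logins are made up, and the column names simply imitate the ones used above. Whether parse_issues() produces empty list elements for unassigned issues is an assumption here, so check your own data before relying on the keep_empty behaviour.

```r
library(tidyr)

# Toy stand-in for parsed issue data; all values are made up for illustration
toy_issues <- data.frame(number = c(101, 102, 103))
toy_issues$assignees_login <- list(c("alice", "bob"), "carol", character(0))

# One row per issue-assignee pair; issue 103 has no assignee and is dropped
by_assignee <- unnest(toy_issues, cols = assignees_login)

# keep_empty = TRUE retains issue 103 with a missing assignee
by_assignee_all <- unnest(toy_issues, cols = assignees_login, keep_empty = TRUE)
```

Issue 101 appears twice in by_assignee, once per assignee, which is the same pattern as the first two rows of the dplyr output described above.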
In summary, the general process is calling a get_
function followed by a parse_
function. We will get milestones for a second example.
dplyr_milestones <- get_milestones(dplyr, state = 'all') %>% parse_milestones()
The report_
function family offers an alternative to visualizations. These functions generate HTML for aesthetic output in RMarkdown reports. Output is automatically tagged so knitr
knows to interpret it as HTML, so it is not necessary to manually add the results = 'asis'
chunk option. (Don't worry if you don't know what this means. You don't need to do anything!)
report_progress(dplyr_issues)
The post_
function family helps add new objects to a GitHub repo. For example, the following command adds a new issue to a repository. After posting new content, post_
functions return the identification number for the new object.
experigit <- create_repo_ref('emilyriederer', 'experigit')
post_issue(experigit,
  title = "Add unit tests for post_issues when title duplicated",
  body = "Check that code appropriately warns users when attempting to post a duplicate issue",
  labels = c("enhancement", "test"),
  assignees = "emilyriederer"
)
#> [1] 150
The GitHub API allows multiple issues to have the same title. However, you may want to disable this functionality (for example, if a post_ function is in a script that may be re-run). In this case, the distinct parameter allows you to choose whether or not to allow the posting of new issues with the same title as existing open issues. When distinct = TRUE (as it is by default), the function throws an error and does not post the issue if an open issue already has the same title.
post_issue(experigit, title = "Add unit tests for post_issues when title duplicated")
#> Error: New issue title is not distinct with current open issues. Please change title or set distinct = FALSE.
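Because a duplicate title raises an error, a script that may be re-run will stop at that line unless the error is handled. The sketch below shows one possible way to keep going using base R's tryCatch(). Both post_if_new() and fake_post_issue() are hypothetical names introduced for illustration; the stand-in function only imitates the error so the example runs offline, and the issue number it returns is made up.

```r
# Hypothetical wrapper, not part of projmgr: runs a posting function and,
# if it errors, reports the message and returns NA instead of stopping.
post_if_new <- function(post_fun, ...) {
  tryCatch(
    post_fun(...),
    error = function(e) {
      message("Skipped: ", conditionMessage(e))
      NA_integer_
    }
  )
}

# Stand-in for post_issue() so the sketch runs offline; it fails for a
# title that is already "open" and otherwise returns a made-up issue number.
fake_post_issue <- function(ref, title, open_titles = "Add unit tests") {
  if (title %in% open_titles) {
    stop("New issue title is not distinct with current open issues.")
  }
  151L
}

first  <- post_if_new(fake_post_issue, NULL, title = "Write vignette")
second <- post_if_new(fake_post_issue, NULL, title = "Add unit tests")
```

With a real repository reference, the equivalent call would be post_if_new(post_issue, experigit, title = "..."). Note that this wrapper swallows every error, including authentication or connectivity failures, so a stricter version might inspect the message before deciding to continue.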