git: Git querying

git    R Documentation

Git querying

Description

Various tools for querying Git history as a Directed Acyclic Graph (DAG).

  • git_log() - a low-level interface

  • git_commit_edgelist() - assemble an edgelist of the commit graph in which a directed edge connects a commit to its parent.

  • git_commits() - assemble a database of Git commits with commit hash, author timestamp, and author date-time

  • git_refs() - fetch information about Git refs

  • git_commit_graph() - create an igraph object with the complete history of the repository.

  • Vref() - custom creation of vertex sequences

Usage

git_log(dir = ".", format_log, delim = " ", ...)

git_commit_edgelist(dir = ".")

git_commits(dir = ".", col_types)

git_refs(dir = ".")

git_commit_graph(dir = ".")

Vref(g, refs = NULL, ...)

Arguments

dir

directory containing a Git repository; defaults to the current directory

format_log

character vector of git log format options

delim, ...

passed to readr::read_delim(); usually col_names or col_types need to be specified too

col_types

passed to readr::read_delim(), defaults to "ciT"

g

a graph built with git_commit_graph()

refs

commit refs

Details

Function git_log() runs git log in dir, passing format_log collapsed with spaces to the --format option. The command is run with the --all option to include all branches.
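The command line described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper build_git_log_args() is hypothetical, selecting the repository with git -C is an assumption rather than a documented detail of git_log(), and the placeholders %H, %at and %aI are picked only because they match the columns documented for git_commits().

```r
# Hypothetical helper (not part of the package): builds the arguments for
# `git log` along the lines described in Details.
build_git_log_args <- function(dir = ".", format_log) {
  # Collapse the requested fields with spaces, e.g. "%H %at %aI"
  fmt <- paste(format_log, collapse = " ")
  # Assumption: `-C dir` is used to point git at the repository
  c("-C", dir, "log", "--all", paste0("--format=", fmt))
}

args <- build_git_log_args(".", c("%H", "%at", "%aI"))
print(args)
```

The resulting vector could then be handed to git with, for example, system2("git", args, stdout = TRUE), and the captured lines parsed with a delimiter-based reader.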

Note that, in the edgelist returned by git_commit_edgelist(), a commit can be a parent of more than one commit (i.e., when the history "forks") and a commit can have multiple parents (i.e., in the case of merge commits).
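A minimal base R sketch of how forks and merges show up in a commit-to-parent edgelist. It assumes log output of the form produced by git log --format="%H %P" (commit hash followed by parent hashes); whether git_commit_edgelist() uses this format is an assumption, and the one-letter hashes are made up for readability.

```r
# Toy `git log --all --format="%H %P"` output: commit hash, then parent hashes.
log_lines <- c(
  "e c d",  # merge commit: two parents
  "d b",
  "c b",    # b is a parent of both c and d (the history forks at b)
  "b a",
  "a"       # root commit: no parents, so it contributes no edges
)

fields <- strsplit(log_lines, " ", fixed = TRUE)

# One row per (commit, parent) pair
edgelist <- do.call(rbind, lapply(fields, function(x) {
  parents <- x[-1]
  data.frame(
    .commit = rep(x[1], length(parents)),
    .parent = parents,
    stringsAsFactors = FALSE
  )
}))

print(edgelist)
```

In the result the merge commit e appears twice in .commit, while b appears twice in .parent.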

For git_commits(), if col_types is missing (the default), it is assumed to be "ciT".
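In readr's compact column specification, "ciT" stands for a character, an integer, and a date-time column. The sketch below mimics that layout in base R on made-up log lines, so it runs without readr; the input format (hash, Unix timestamp, strict ISO 8601 date separated by spaces) is an assumption based on the columns documented for git_commits().

```r
# Made-up sample lines: short hash, author timestamp, author date (ISO 8601)
log_lines <- c(
  "a1b2c3d 1697487480 2023-10-16T20:18:00+00:00",
  "e4f5a6b 1697480280 2023-10-16T20:18:00+02:00"
)

commits <- read.table(
  text = log_lines, sep = " ",
  col.names = c(".commit", "author_timestamp", "author_datetime"),
  colClasses = c("character", "integer", "character")
)

# Base R's %z expects offsets such as +0200, so drop the colon from +02:00
iso <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", commits$author_datetime)
commits$author_datetime <- as.POSIXct(
  iso, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC"
)

print(commits)
```

Both columns describe the same instant, so converting author_datetime to seconds since the epoch should reproduce author_timestamp.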

Value

Function git_log() returns a tibble, parsed by readr::read_delim() using delim, with one column for each field requested with format_log.

Function git_commit_edgelist() returns a two-column tibble with columns:

  • .commit - hash of a commit

  • .parent - hash of a parent of the .commit

Function git_commits() returns a tibble with columns:

  • .commit - commit hash

  • author_timestamp - Unix timestamp of author date

  • author_datetime - author date in strict ISO 8601 format

Function git_refs() returns a tibble with a row for each ref and the following columns:

  • .commit - commit hash

  • ref - full name of the ref, e.g. refs/heads/master or refs/remotes/origin/HEAD

In the igraph object returned by git_commit_graph(), vertices correspond to commits and edges point from commits to their parents. Additionally, it has the following attributes defined:

  • name - vertex attribute with commit hash

  • author_timestamp - vertex attribute with author date timestamp

  • author_datetime - vertex attribute with author date in ISO 8601 format

  • refs - vertex attribute with a list of either NULL or character vector of refs pointing to the particular commit

Function Vref() works similarly to igraph::V(), returning a vertex sequence for the vertices in g that correspond to the refs specified by refs.
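To illustrate the shape of the refs vertex attribute and the kind of lookup described above, here is a base R sketch on toy data. It does not use igraph and is not the implementation of Vref(); the helper find_ref_vertices() and all values are hypothetical.

```r
# Toy stand-ins for the name and refs vertex attributes (made-up values)
vertex_name <- c("a", "b", "c")
vertex_refs <- list(
  NULL,                                             # no ref points at commit a
  "refs/heads/feature",
  c("refs/heads/master", "refs/remotes/origin/HEAD")
)

# Hypothetical helper: indices of vertices carrying any of the given refs
find_ref_vertices <- function(refs_attr, refs) {
  which(vapply(refs_attr, function(r) any(r %in% refs), logical(1)))
}

idx <- find_ref_vertices(vertex_refs, "refs/heads/master")
print(vertex_name[idx])
```

A NULL entry never matches, so commits without refs are skipped.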


mbojan/mbtools documentation built on Oct. 16, 2023, 8:18 p.m.