knitr::opts_chunk$set( cache = FALSE, collapse = FALSE, warning = FALSE, message = FALSE, tidy = FALSE, fig.align='center', comment = "#>", fig.path = "man/figures/README-", R.options = list(width = 200) )
Understanding the gaps and connections across existing theories and findings is a perennial challenge in scientific research. Systematically reviewing scholarship is especially challenging for researchers who may lack domain expertise, including junior scholars or those exploring new substantive territory. Conversely, senior scholars may rely on longstanding assumptions and social networks that exclude new research. In both cases, ad hoc literature reviews hinder accumulation of knowledge. Scholars are rarely systematic in selecting relevant prior work or then identifying patterns across their sample. To encourage systematic, replicable, and transparent methods for assessing literature, we propose an accessible network-based framework for reviewing scholarship. In our method, we consider a literature as a network of recurring concepts (nodes) and theorized relationships among them (edges). Network statistics and visualization allow researchers to see patterns and offer reproducible characterizations of assertions about the major themes in existing literature.
netlit provides functions to generate network statistics from a literature review. Specifically, it processes a dataset where each row is a proposed relationship ("edge") between two concepts or variables ("nodes").
The aim is to offer easy tools to begin using the power of network analysis in R for literature reviews. Using netlit only requires researchers to enter the relationships they observe in prior studies into a spreadsheet.
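As a rough illustration of the spreadsheet layout described above, the sketch below reads a small edge list with one row per proposed relationship. The concept names, the citation keys, and the cites column contents are hypothetical placeholders, not data from the package.

```r
# Hypothetical spreadsheet contents: one row per proposed relationship.
# Concept names and citation keys are made up for illustration only.
csv_text <- "from,to,cites
concept_a,concept_b,Author2001
concept_b,concept_c,Author2005; Author2010
concept_a,concept_c,Author2012"

edges <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Each row is one edge; the unique values across both columns are the nodes.
nodes <- sort(unique(c(edges$from, edges$to)))

print(edges)
print(nodes)
```

In practice the same table would usually be read from a file exported from the spreadsheet rather than from an inline string.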
To install netlit from CRAN, run the following:
install.packages("netlit")
The review() function takes in a dataframe, data, that includes from and to columns (a directed graph structure).
In the example below, we use example data from this project on redistricting. These data are a set of related concepts (from and to) in the redistricting literature and citations for these relationships (cites and cites_empirical). See the main netlit vignette for more details on this example.
library(netlit)
data("literature")
head(literature)
netlit offers four functions: make_edgelist(), make_nodelist(), augment_nodelist(), and review().
review() is the primary function (and probably the only one you need). The others are helper functions that perform the individual steps that review() does all at once. review() takes in a dataframe with at least two columns representing linked concepts (e.g., a cause and an effect) and returns data augmented with network statistics. Users must either specify "from" nodes and "to" nodes with the from and to arguments or include columns named from and to in the supplied data object.
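If the linked concepts sit in columns with other names, one way to satisfy the second option is to rename them before calling review(). The sketch below uses only base R; the column names cause and effect and the concept names are hypothetical. The text above indicates that the from and to arguments offer an alternative to renaming.

```r
# Hypothetical data whose columns are named "cause" and "effect"
# rather than "from" and "to".
relationships <- data.frame(
  cause  = c("concept_a", "concept_b"),
  effect = c("concept_b", "concept_c"),
  stringsAsFactors = FALSE
)

# Rename the two columns so they match the expected names.
renamed <- relationships
names(renamed)[names(renamed) == "cause"]  <- "from"
names(renamed)[names(renamed) == "effect"] <- "to"

print(names(renamed))
```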
review() returns a list of three objects: edgelist (a list of relationships with edge_betweenness calculated), nodelist (a list of concepts with degree and betweenness calculated), and graph (an object suitable for use in other igraph functions or other network visualization packages). Users may wish to include edge attributes (e.g., information about the relationship between the two concepts) or node attributes (information about each concept). We show how to do so below. But first, consider the basic use of review():
lit <- review(literature, from = "from", to = "to")
lit
head(lit$edgelist)
head(lit$nodelist)
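To make the statistics named above more concrete, the following sketch computes degree, betweenness, and edge betweenness directly with igraph on a three-node toy graph. It is an illustration under assumptions, not netlit's internal implementation: the toy edge list is hypothetical, and the exact options review() passes to igraph (for example, how degree is counted) are not confirmed here.

```r
library(igraph)

# Hypothetical directed chain: concept_a -> concept_b -> concept_c
toy_edges <- data.frame(
  from = c("concept_a", "concept_b"),
  to   = c("concept_b", "concept_c"),
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(toy_edges, directed = TRUE)

# Node-level statistics: total degree (in + out) and betweenness.
node_stats <- data.frame(
  node        = V(g)$name,
  degree      = as.numeric(degree(g, mode = "all")),
  betweenness = as.numeric(betweenness(g, directed = TRUE)),
  stringsAsFactors = FALSE
)

# Edge-level statistic: number of shortest paths passing through each edge.
edge_stats <- cbind(
  toy_edges,
  edge_betweenness = as.numeric(edge_betweenness(g, directed = TRUE))
)

print(node_stats)
print(edge_stats)
```

In this chain, the middle concept has the highest degree and is the only node lying on a shortest path between two other nodes, which is the kind of bridging role betweenness is meant to surface.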
Edge and node attributes can be added using the edge_attributes and node_attributes arguments.
edge_attributes is a vector that identifies columns in the supplied data frame that the user would like to retain. (To retain all variables from literature, use edge_attributes = names(literature).)
node_attributes is a separate dataframe that contains attributes for each node in the primary data set. node_attributes must be a dataframe with a column node with values matching the to or from columns of the data argument.
The example node_attributes data include one column, type, indicating a type for each node/variable/concept.
data("node_attributes")
head(node_attributes)
lit <- review(literature, edge_attributes = c("cites", "cites_empirical"), node_attributes = node_attributes)
lit
head(lit$edgelist)
head(lit$nodelist)
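The matching rule for node attributes can be checked before calling review(). The base R sketch below uses hypothetical toy tables (the concept names and type values are made up) to list nodes that have no attribute row and to show what a join on the node column produces. How netlit itself treats unmatched nodes is not confirmed here; the merge only illustrates the matching idea.

```r
# Hypothetical edge list and node attribute table.
toy_edges <- data.frame(
  from = c("concept_a", "concept_b"),
  to   = c("concept_b", "concept_c"),
  stringsAsFactors = FALSE
)
toy_node_attributes <- data.frame(
  node = c("concept_a", "concept_b"),
  type = c("condition", "outcome"),
  stringsAsFactors = FALSE
)

# All nodes that appear in either column of the edge list.
toy_nodelist <- data.frame(
  node = sort(unique(c(toy_edges$from, toy_edges$to))),
  stringsAsFactors = FALSE
)

# Nodes with no matching row in the attribute table.
unmatched <- setdiff(toy_nodelist$node, toy_node_attributes$node)

# Left join on the node column: unmatched nodes get NA for type.
augmented <- merge(toy_nodelist, toy_node_attributes, by = "node", all.x = TRUE)

print(unmatched)
print(augmented)
```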
Below is a plot of the redistricting literature network from the main netlit vignette. It uses the graph object returned by the netlit::review() function as the input to network graphing functions from packages like ggnetwork. The nodelist and edgelist objects also provide required inputs for other network visualization packages, e.g. ggraph or visNetwork (vignettes on how to make similar plots in ggraph and visNetwork will be posted shortly).
Nodes represent theoretical concepts, shaded by total degree centrality. Arrows represent directional relationships theorized in prior works, colored by the number of works. Solid edges indicate empirically studied connections; dashed edges indicate relationships that have been theorized but not studied empirically.
knitr::include_graphics("man/figures/ggraph-1.png")