build_graph: Builds a citation graph.


View source: R/Diderot.R

Description

Builds a citation graph from a database of bibliographic records generated with create_bibliography. The process is automatically parallelized on multicore hardware. By default, titles are matched against references using the full title, the publication year, and the first three authors. Publication attributes present in the data frame can be copied to graph nodes using the attrs argument.

Usage

build_graph(db, title = "Cite Me As", year = "Year", authors = "Authors", 
            ref = "Cited References", set.title.as.name = F, attrs = NULL, 
            verbose = F, makeCluster.type = "PSOCK", nb.cores=NA, 
            fine.check.threshold = 1000, fine.check.nb.authors = 3, 
            small.year.mismatch = T, debug = F)

Arguments

db

Bibliographic database created with create_bibliography.

title

Name of the data frame column in which publication titles are listed.

year

Name of the data frame column in which publication years are listed.

authors

Name of the data frame column in which publication authors are listed.

ref

Name of the data frame column in which publication references are listed.

set.title.as.name

Flag indicating whether the graph vertex ID should be set to the publication title.

attrs

Attributes of the bibliographic database (i.e. data frame column names, such as "Authors", "Year") to be set as vertex attributes.

verbose

Verbosity flag triggering a more detailed output during graph building.

makeCluster.type

Type of cluster to be used to parallelize the graph building process. For more options, see makeCluster in the parallel package (on which doParallel builds).

nb.cores

Number of cores to be used for parallel computation.
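
As a rough illustration of what makeCluster.type and nb.cores control, the sketch below creates a small PSOCK cluster with the parallel package and runs a toy task on it. This is not Diderot's internal code: the worker count of 2, the sample titles, and the toy task are assumptions chosen for illustration.

```r
library(parallel)

# Assumption: two workers, for illustration only.
n_workers <- 2

# "PSOCK" starts fresh R worker processes that communicate over sockets.
cl <- makeCluster(n_workers, type = "PSOCK")

# Toy task: compute the length of each (made-up) title on the workers.
titles <- c("A study of graphs", "Citation networks", "Bibliometrics")
title_lengths <- parSapply(cl, titles, nchar, USE.NAMES = FALSE)

# Always release the workers when done.
stopCluster(cl)

print(title_lengths)
```

Setting nb.cores=1, as in the example at the end of this page, keeps the computation on a single core.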

fine.check.threshold

Title length below which citation matching is further confirmed based on publication year. This value can be reduced to increase performance on large bibliographic databases. With the default value of 1000, the publication year check is in practice always performed.

fine.check.nb.authors

Maximum number of authors to check against for citation matching. This value can be reduced to increase performance on large bibliographic databases. Default value is three authors.
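
To make the effect of limiting the number of authors concrete, here is a minimal sketch that keeps only the first n authors of a record. The helper first_authors, the semicolon separator, and the sample author string are all hypothetical; Diderot's actual parsing of the authors column may differ.

```r
# Hypothetical helper, not part of Diderot: return the first n authors
# from a single delimited author string.
first_authors <- function(authors, n = 3, sep = ";") {
  parts <- trimws(strsplit(authors, sep, fixed = TRUE)[[1]])
  head(parts, n)
}

# Made-up record with four authors; only the first three are kept.
checked <- first_authors("Smith J.; Doe A.; Lee K.; Chan M.")
print(checked)
```

Checking fewer authors means fewer comparisons per candidate citation, which is presumably where the performance gain on large databases comes from.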

small.year.mismatch

Flag indicating whether small year mismatches (±1 year) should be tolerated. It is recommended to keep this flag set to TRUE to accommodate the usual inconsistencies in bibliographic databases.
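
The tolerance described above can be sketched as a simple comparison. The helper years_match is hypothetical and only mirrors the documented behaviour (a difference of at most one year is accepted when the flag is TRUE); it is not taken from Diderot's source.

```r
# Hypothetical helper, not part of Diderot: decide whether two
# publication years should be treated as matching.
years_match <- function(year_a, year_b, small.year.mismatch = TRUE) {
  tolerance <- if (small.year.mismatch) 1 else 0
  abs(as.integer(year_a) - as.integer(year_b)) <= tolerance
}

years_match(2015, 2016)                               # TRUE: off by one, tolerated
years_match(2015, 2016, small.year.mismatch = FALSE)  # FALSE: exact match required
years_match(2015, 2017)                               # FALSE: off by two
```

A one-year gap is a common discrepancy, for instance when a reference cites an online-first year and the record carries the print year.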

debug

Debug flag allowing the user to browse function calls upon execution error. For more details, see recover in the utils library.

Value

Returns a graph object.

Author(s)

Christian Vincenot (christian@vincenot.biz)

See Also

create_bibliography

Examples

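library(Diderot)

# tempfi1 and tempfi2 must hold the paths to two Scopus export files;
# they are not defined in this excerpt.
labels <- c("Corpus1", "Corpus2")

# Build a bibliographical dataset from Scopus exports
db <- create_bibliography(corpora_files = c(tempfi1, tempfi2),
                          labels = labels, keywords = NA)

# Build the graph on a single core, copying three columns to vertex attributes
gr <- build_graph(db = db, small.year.mismatch = TRUE,
                  attrs = c("Corpus", "Year", "Authors"), nb.cores = 1)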

Diderot documentation built on April 19, 2020, 4:16 p.m.