read_bibliography: Import bibliographic data

Description

Import bibliographic data in the standard formats exported by academic search engines and reference management software.

Usage

read_bibliography(filename, return_df = TRUE)

Arguments

filename

A vector or list containing paths to one or more bibliographic files.

return_df

logical; should the object returned be a data.frame? Defaults to TRUE. If FALSE, returns an object of class bibliography.

Details

This function aims to import bibliographic data from a range of formats in a consistent manner.

If the user provides a number of paths in a vector or list, then files are imported in sequence and joined into a single data.frame (or a list if return_df = FALSE) using merge_columns. If return_df = TRUE (the default) then an extra column called 'file_name' is appended to the resulting data.frame to show the file in which each entry originated.
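
The joining step described above can be pictured with a small base R sketch. This is an illustration only and not the code behind merge_columns: the helper name bind_with_source_sketch and the toy data frames are hypothetical, and the sketch assumes that columns missing from one file are simply filled with NA.

```r
# Hypothetical stand-in for the joining step; NOT the revtools implementation.
bind_with_source_sketch <- function(dfs) {
  # Union of all column names, in order of first appearance.
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- Map(function(df, nm) {
    # Add any column this data.frame lacks, filled with NA (an assumption).
    for (col in setdiff(all_cols, names(df))) df[[col]] <- NA
    df <- df[all_cols]
    # Record the file from which each row originated.
    df$file_name <- nm
    df
  }, dfs, names(dfs))
  out <- do.call(rbind, unname(padded))
  rownames(out) <- NULL
  out
}

# Toy inputs standing in for two already-imported files.
a <- data.frame(title = c("A1", "A2"), year = c(2001, 2002),
                stringsAsFactors = FALSE)
b <- data.frame(title = "B1", journal = "Landscape Ecology",
                stringsAsFactors = FALSE)
combined <- bind_with_source_sketch(list("first.ris" = a, "second.bib" = b))
print(combined)
```

With the real function, the equivalent call would pass a vector of paths to read_bibliography and inspect the file_name column of the result.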

If the file is in .csv format, then read_bibliography will import the file using read.csv, with three changes. First, it ensures that the first column contains an index (i.e. a unique value for each row), and creates one if it is absent. Second, it converts column names to lower case and switches all delimiters to underscores. Third, it ensures that author names are delimited by 'and' for consistency with format_citation.
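
As a rough illustration of the three changes listed above, the following base R sketch applies them to a small data.frame. The helper normalise_csv_sketch is hypothetical and is not the package's own code; in particular, the index column name 'label', the author column name 'author' and the ';' author separator are assumptions made for the example.

```r
# Hypothetical helper; NOT part of revtools.
normalise_csv_sketch <- function(df, author_sep = ";") {
  # 1. Make sure the first column is a unique index; create one if it is not.
  first <- df[[1]]
  if (anyDuplicated(first) > 0 || anyNA(first)) {
    df <- data.frame(label = paste0("v", seq_len(nrow(df))), df,
                     check.names = FALSE, stringsAsFactors = FALSE)
  }
  # 2. Lower-case the column names and turn other delimiters into underscores.
  nm <- gsub("[^a-z0-9]+", "_", tolower(names(df)))
  names(df) <- gsub("^_+|_+$", "", nm)
  # 3. Delimit author names with ' and ' (assumes a column called 'author').
  if ("author" %in% names(df)) {
    pattern <- paste0("\\s*", author_sep, "\\s*")
    df$author <- gsub(pattern, " and ", df$author)
  }
  df
}

raw <- data.frame(
  "Source Title" = c("Landscape Ecology", "Landscape Ecology"),
  "Author" = c("Smith, J.; Jones, K.", "Lee, M."),
  "Publication.Year" = c(2001, 2005),
  check.names = FALSE, stringsAsFactors = FALSE)
out <- normalise_csv_sketch(raw)
print(names(out))
print(out$author)
```

Here the first column of raw is not unique, so the sketch prepends an index column before cleaning the names.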

If the file is of any type other than .csv, read_bibliography auto-detects document formatting, first by detecting whether the document is ris-like or bib-like, and then by running the appropriate import function. For ris-like files (including files from 'EndNote' and 'Web of Science'), this involves attempting to detect both the delimiter between successive entries and the means of separating tag labels (e.g. 'AU', 'TI') from their information content. Attempts have been made to ensure consistency with .ris, .bib, MEDLINE (.nbib) and Web of Science (.ciw) formats. Except for .csv, file extensions are not used to determine file type; they matter only for locating the file.
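
To make the idea of content-based detection concrete, here is a minimal heuristic in base R. It is a sketch under stated assumptions rather than the detection logic revtools actually uses: the function name detect_format_sketch and both regular expressions are hypothetical, and real files may need more careful rules.

```r
# Hypothetical heuristic; NOT the detection code used by revtools.
detect_format_sketch <- function(lines) {
  lines <- lines[nzchar(trimws(lines))]
  # bib-like entries start with '@type{', e.g. '@article{key,'.
  bib_hits <- sum(grepl("^\\s*@[A-Za-z]+\\s*\\{", lines))
  # ris-like lines start with a short upper-case tag, optionally followed
  # by '-' or ':', e.g. 'AU  - Smith', 'PMID- 123' or 'AU Smith'.
  ris_hits <- sum(grepl("^[A-Z][A-Z0-9]{1,3}\\s*[-:]?(\\s|$)", lines))
  if (bib_hits > ris_hits) {
    "bib"
  } else if (ris_hits > 0) {
    "ris"
  } else {
    "unknown"
  }
}

ris_lines <- c("TY  - JOUR", "AU  - Smith, J.", "TI  - A title", "ER  - ")
nbib_lines <- c("PMID- 12345", "TI  - A title", "AU  - Smith J")
bib_lines <- c("@article{smith2001,", "  author = {Smith, J.},",
               "  title = {A title}", "}")

detected <- vapply(list(ris_lines, nbib_lines, bib_lines),
                   detect_format_sketch, character(1))
print(detected)
```

Because the decision rests on the text itself, the same function gives the same answer whatever extension the file carries.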

If the imported file is in a ris-like format, then the object returned by read_bibliography will have different headings from the source document. This feature attempts to ensure consistency across file types. Tag substitutions are made using a lookup table, which can be viewed by calling tag_lookup. Unrecognized tags are grouped in the resulting bibliography object under the heading 'further_info'.
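
The substitution step can be sketched with a named vector acting as the lookup table. The table below is a made-up, four-entry example and not the table returned by tag_lookup; the helper rename_tags_sketch is likewise hypothetical.

```r
# Hypothetical lookup table; the real table is the one shown by tag_lookup.
lookup_sketch <- c(AU = "author", TI = "title", PY = "year", JO = "journal")

# Map source tags to consistent headings; anything unrecognized is
# grouped under 'further_info'.
rename_tags_sketch <- function(tags, lookup) {
  out <- unname(lookup[tags])
  out[is.na(out)] <- "further_info"
  out
}

headings <- rename_tags_sketch(c("TI", "AU", "ZZ", "PY"), lookup_sketch)
print(headings)
```

In this toy case the tag 'ZZ' has no entry in the table, so it falls under 'further_info'.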

Value

Returns an object of class data.frame if return_df is TRUE; otherwise an object of class bibliography.

See Also

bibliography-class, tag_lookup

Examples

file_location <- system.file(
  "extdata",
  "avian_ecology_bibliography.ris",
  package = "revtools")
x <- read_bibliography(file_location)
class(x) # = data.frame
x <- read_bibliography(file_location, return_df = FALSE)
class(x) # = bibliography
summary(x)

Example output

[1] "data.frame"
[1] "bibliography"
Object of class 'bibliography' containing 20 entries.
  Number containing abstracts: 20 (100%)
Number of sources: 16
Most common sources:
  Journal of Applied Ecology (n = 3)
  Forest Ecology and Management (n = 2)
  Landscape Ecology (n = 2)
  Acta Oecologica (n = 1)
  African Journal of Ecology (n = 1)

revtools documentation built on Jan. 8, 2020, 5:10 p.m.