read_dfr_citations: Read a single 'citations.CSV' or 'citations.tsv' file.

read_dfr_citations    R Documentation

Read a single citations.CSV or citations.tsv file.

Description

This function reads a single citations.CSV (2013 and earlier) or citations.tsv (2014 and later) file from JSTOR DfR, and it accounts for the quirks of both formats. Use read_dfr_metadata to load and aggregate multiple files.

Usage

read_dfr_citations(filename, strip.white = TRUE, ...)

Arguments

filename

the file to read. If NA, opens the file dialog.

strip.white

passed to read.table: by default, white space is stripped.

...

Passed on to read.csv or read.table.
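
A minimal usage sketch, assuming the dfrtopics package is installed and that a DfR export has been unpacked to the hypothetical path shown (both are assumptions, so the call is guarded and does nothing if either is missing):

```r
# Hypothetical location of a DfR export; adjust to your own download.
citations_path <- file.path("dfr-data", "citations.tsv")

meta <- NULL
if (requireNamespace("dfrtopics", quietly = TRUE) && file.exists(citations_path)) {
  # strip.white = TRUE is the documented default, written out for clarity.
  meta <- dfrtopics::read_dfr_citations(citations_path, strip.white = TRUE)
  str(meta)
}
```

If the file is found, meta should hold one row per document; otherwise it stays NULL.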

Details

This function assumes that each file has a trailing delimiter at the end of every line. DfR has changed their output data format before, so check results carefully.
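
To show why the trailing delimiter matters, here is a base-R sketch that strips it before parsing. It is not the package's own implementation; the sample lines are invented for illustration and contain only a few of the columns a real file would have.

```r
# Invented sample in a tab-separated layout; every line ends with a tab.
raw_lines <- c(
  "id\tdoi\ttitle\t",
  "10.2307/0001\t10.2307/0001\tA Sample Article\t",
  "10.2307/0002\t10.2307/0002\tA Sample Review\t"
)

# Remove exactly one trailing tab per line so no phantom column is created.
clean_lines <- sub("\t$", "", raw_lines)

meta <- read.table(
  text = clean_lines,
  sep = "\t",
  header = TRUE,
  quote = "",          # titles may contain quote characters
  comment.char = "",   # do not treat "#" as the start of a comment
  strip.white = TRUE,
  stringsAsFactors = FALSE
)

# A sanity check of the kind worth running on real output:
# the column count should match the header.
stopifnot(ncol(meta) == 3)
```

A similar column-count check on real data is a cheap way to notice if DfR changes the format again.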

We do some minimal post-processing of the data. White space is trimmed by default. Publication dates in the pubdate column are converted to Date objects (but beware the false precision of these dates; see pubdate_Date). The type column is converted to a factor.
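
The post-processing described above can be sketched in base R as follows. The timestamp format and the type codes in the sample data are assumptions made for illustration, and the date conversion stands in for whatever pubdate_Date actually does.

```r
# Invented sample rows; the ISO-like timestamp format is an assumption.
meta <- data.frame(
  id = c("10.2307/0001", "10.2307/0002"),
  pubdate = c("1990-01-01T00:00:00Z", "1991-04-01T00:00:00Z"),
  type = c("fla", "brv"),
  stringsAsFactors = FALSE
)

# Keep only the YYYY-MM-DD prefix and convert it to a Date.
meta$pubdate <- as.Date(substr(meta$pubdate, 1, 10))

# Treat the document type as a categorical variable.
meta$type <- factor(meta$type)

# Because day and month may be placeholders, aggregating by year is
# one way to avoid relying on false precision.
pub_year <- as.integer(format(meta$pubdate, "%Y"))
```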

Notes about other fields: the doi column is, in my experience, always identical to the id field, but it is kept here just in case. The title and abstract fields may contain markup (HTML or even LaTeX). Most DfR documents lack abstracts in the metadata.
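
Both observations can be checked on a loaded data frame. The sketch below uses invented rows and a deliberately naive regular expression for HTML tags; it makes no attempt to handle LaTeX markup.

```r
# Invented sample rows for illustration.
meta <- data.frame(
  id = c("10.2307/0001", "10.2307/0002"),
  doi = c("10.2307/0001", "10.2307/0002"),
  title = c("On <i>Hamlet</i>", "Plain Title"),
  stringsAsFactors = FALSE
)

# Verify the claim that doi duplicates id before dropping either column.
doi_matches_id <- meta$doi == meta$id

# Flag titles that appear to contain HTML tags, then strip the tags.
has_markup <- grepl("<[^>]+>", meta$title)
title_plain <- gsub("<[^>]+>", "", meta$title)
```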

The author column may contain multiple names, but must be inspected carefully before processing. The separator among names may be either a tab or ", ". A single name may contain the separator character without disambiguation ("Rudolf Tombo, Jr.").
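
One possible heuristic for splitting the author field is sketched below. split_authors is a hypothetical helper, not part of the package, and its list of name suffixes is an assumption that will not cover every case, so the results still need inspection.

```r
# Hypothetical helper: split one author string into individual names.
split_authors <- function(x) {
  # Prefer the tab separator when present; otherwise fall back to ", ".
  parts <- if (grepl("\t", x, fixed = TRUE)) {
    strsplit(x, "\t", fixed = TRUE)[[1]]
  } else {
    strsplit(x, ", ", fixed = TRUE)[[1]]
  }

  # Re-attach suffixes such as "Jr." that were split off as if they
  # were separate names (assumed, incomplete list).
  is_suffix <- parts %in% c("Jr.", "Sr.", "II", "III")
  out <- character(0)
  for (i in seq_along(parts)) {
    if (is_suffix[i] && length(out) > 0) {
      out[length(out)] <- paste(out[length(out)], parts[i], sep = ", ")
    } else {
      out <- c(out, parts[i])
    }
  }
  out
}

single_name <- split_authors("Rudolf Tombo, Jr.")
tab_separated <- split_authors("Jane Doe\tJohn Roe")
comma_separated <- split_authors("Jane Doe, John Roe")
```

Names written in "Last, First" order would defeat this heuristic, which is one reason to inspect the column before relying on any automatic split.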

Extra parameters to this function are passed on to read.csv or read.table.

Value

A data frame of metadata, with one row per document.

See Also

read_dfr_metadata, pubdate_Date


agoldst/dfrtopics documentation built on July 15, 2022, 4:13 p.m.