Fulltext search and retrieval of scholarly texts.

Description

fulltext is a single interface to many sources of scholarly texts. In practice, this means only sources that can legally be used. We will support sources that require authentication on a case-by-case basis: if more than just a few people will use a source, and it's not too burdensome to include, then we can include it.

What's included

We currently include support for search and full text retrieval for a variety of publishers. See ft_search for what we include for search, and ft_get for what we include for full text retrieval.

Use cases

The following are tasks/use cases supported:

  • search - ft_search

  • get texts - ft_get

  • get full text links - ft_links

  • extract text from pdfs - ft_extract

  • serialize to different data formats - ft_serialize

  • extract certain article sections (e.g., authors) - chunks

  • grab supplementary materials for (re-)analysis of data - ft_get_si, which accepts article identifiers as well as output from ft_search and ft_get
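
As a rough illustration of how the first few tasks fit together, here is a minimal sketch. It assumes the fulltext package is installed and that network access is available. The wrapper search_and_fetch is a hypothetical helper written for this example and is not part of the package; the query, DOI, and limit in the example call are illustrative values only. See the help pages for ft_search, ft_links, and ft_get for the authoritative arguments.

```r
# Sketch only: requires the fulltext package and network access.
# search_and_fetch() is a hypothetical helper, not part of fulltext.
search_and_fetch <- function(query, doi, limit = 5) {
  if (!requireNamespace("fulltext", quietly = TRUE)) {
    stop("the fulltext package is not installed")
  }
  # search a single source (PLOS here) for the query
  hits <- fulltext::ft_search(query = query, from = "plos", limit = limit)
  # look up full text links for the search results
  links <- fulltext::ft_links(hits)
  # retrieve the full text of one article by DOI
  texts <- fulltext::ft_get(doi)
  list(hits = hits, links = links, texts = texts)
}

# Example call (commented out because it needs network access):
# res <- search_and_fetch("ecology", "10.7717/peerj.228")
```

Wrapping the calls in a function keeps the example from touching the network until it is called explicitly.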

DOI delays

Beware that DOIs are not searchable via Crossref/Entrez immediately. The delay may be as much as a few days, though it should usually be less than a day, and it should become shorter as services improve. The upshot is that you may not find a match for a relatively new DOI (e.g., for an article published the same day). We've tried to account for this for some publishers. For example, with Crossref we first search for a match for the DOI, and if none is found we attempt to retrieve the full text from the publisher directly.
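
The fallback idea described above can be sketched generically. This is not the package's internal code: first_match, index_lookup, and publisher_lookup are hypothetical names, the two lookups are stubs standing in for real services, and the DOI is a made-up placeholder. The sketch only shows the pattern of trying an index first and falling back to a second source when no match is found.

```r
# Hypothetical helper: try each lookup in order and return the first
# non-NULL result. Errors in a lookup are treated as "no match".
first_match <- function(doi, lookups) {
  for (name in names(lookups)) {
    result <- tryCatch(lookups[[name]](doi), error = function(e) NULL)
    if (!is.null(result)) {
      return(list(source = name, result = result))
    }
  }
  NULL  # nothing matched in any source
}

# Stub lookups standing in for real services (assumptions for this sketch)
index_lookup <- function(doi) NULL  # the index has not caught up yet
publisher_lookup <- function(doi) paste0("full text for ", doi)

out <- first_match(
  "10.1234/example",  # placeholder DOI, not a real article
  list(index = index_lookup, publisher = publisher_lookup)
)
out$source  # "publisher", because the index stub found nothing
```

If every source comes back empty, first_match returns NULL, which for a very new DOI may simply mean that it is worth trying again later.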

Feedback

Let us know what you think at https://github.com/ropensci/fulltext/issues

Author(s)

Scott Chamberlain <myrmecocystus@gmail.com>