inst/onboarding-submission.md

Submitting Author: Carl Boettiger (@cboettig)
Repository: https://github.com/cboettig/virtuoso
Version submitted: 0.1.1 (tagged)
Editor: TBD
Reviewer 1: TBD
Reviewer 2: TBD
Archive: TBD
Version accepted: TBD

Package: virtuoso
Type: Package
Title: R interface to Virtuoso using ODBC
Version: 0.1.1
Authors@R: c(person("Carl", "Boettiger", 
                  email = "cboettig@gmail.com", 
                  role = c("aut", "cre", "cph"),
                  comment = c(ORCID = "0000-0002-1642-628X")),
             person("Bryce", "Mecum", 
                    role = "ctb", 
                    email = "brycemecum@gmail.com",
                    comment = c(ORCID = "0000-0002-0381-3766")))
Description: Virtuoso is a high-performance "universal server," which can act
             as both a relational database (supporting standard SQL queries)
             and a Resource Description Framework (RDF) triplestore, supporting
             SPARQL queries and semantic reasoning. The virtuoso R package provides
             R users with a DBI-compatible connection to the Virtuoso database.
             The package also provides helper routines to install, launch, and manage
             a Virtuoso server locally on Mac, Windows and Linux platforms using
             the standard interactive installers from the R command-line.  By
             automatically handling these setup steps, the package can make Virtuoso
             considerably faster and easier for most users to deploy in a local
             environment. While this can be used as a normal dplyr backend, Virtuoso
             excels when used as an RDF triplestore.  Managing the bulk import of triples
             from common serializations with a single intuitive command is another key
             feature of the virtuoso R package.  Bulk import performance can be tens to
             hundreds of times faster than comparable imports using existing R tools,
             including the rdflib and redland packages.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Imports: 
    odbc,
    processx,
    DBI,
    utils,
    ini,
    rappdirs,
    curl,
    fs,
    digest
RoxygenNote: 6.1.1
Suggests: 
    knitr,
    rmarkdown,
    nycflights13,
    testthat,
    covr,
    jsonld,
    rdftools,
    dplyr,
    spelling
VignetteBuilder: knitr
Remotes: cboettig/rdftools
Language: en-US

Scope

R users confronted with a large dump of triples (e.g. an nquad, owl, or other file) currently have few ways of reading in this data, and no performant option that can handle the huge files frequently involved, which often do not fit into memory. This package provides a relatively convenient way to import this data into an RDF-capable database and query it directly from R.
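
As a rough illustration of that import-then-query workflow, the sketch below writes a tiny N-Quads file and, if a local Virtuoso installation is available, loads and queries it. The example IRIs, the file name, and the `run_server_demo` switch are made up for illustration. The `vos_*` calls are used with their default arguments as I understand the package's exported interface, so treat the exact signatures as an assumption and check the package help pages.

```r
# Two example statements in N-Quads syntax (graph label omitted),
# written to a temporary file. The IRIs are made up for illustration.
nq_lines <- c(
  '<http://example.org/alice> <http://example.org/name> "Alice" .',
  '<http://example.org/bob> <http://example.org/name> "Bob" .'
)
nq_file <- file.path(tempdir(), "people.nq")
writeLines(nq_lines, nq_file)

# SPARQL query to run once the data has been loaded.
query <- "SELECT ?s ?o WHERE { ?s <http://example.org/name> ?o }"

# Hypothetical switch: set to TRUE only on a machine where Virtuoso
# itself is installed (for example via virtuoso::vos_install()).
run_server_demo <- FALSE

if (run_server_demo && requireNamespace("virtuoso", quietly = TRUE)) {
  virtuoso::vos_start()                        # launch a local server
  con <- virtuoso::vos_connect()               # DBI-compatible connection
  virtuoso::vos_import(con, files = nq_file)   # bulk-load the file
  result <- virtuoso::vos_query(con, query)    # run the SPARQL query
  print(result)
  virtuoso::vos_kill()                         # stop the server
}
```

The server calls are guarded so the snippet runs without error on a machine that lacks Virtuoso; only the file and the query string are created in that case.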

Researchers working with RDF / semantic data.

This package overlaps with the rOpenSci package rdflib (and thus redland, which rdflib uses under the hood). rdflib primarily provides an in-memory model for working with RDF data, which fails with large triplestores. rdflib and redland do have a pluggable backend that can connect to Virtuoso and other databases, but it is very complicated to set up (the redland R package must be built from source, and in some cases so must the redland C library) and is also much slower. This package handles the installation in a user-friendly way and performs better, and the resulting Virtuoso server can then be used as a backend to rdflib (though there is usually little reason to do so, since Virtuoso can already be called directly through this package).

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options

MEE Options

Code of conduct



cboettig/virtuoso documentation built on April 23, 2024, 10:49 a.m.