knitr::opts_chunk$set(comment = "")
options(width = 120, max.print = 100)
build_with_security = (Sys.getenv("SCIDB_TEST_WITH_SECURITY", "") == "true")
db <- if (!build_with_security) {
  scidb::scidbconnect(host = Sys.getenv("SCIDB_TEST_HOST", "localhost"),
                      port = Sys.getenv("SCIDB_TEST_PORT", 8080))
} else {
  scidb::scidbconnect(host = Sys.getenv("SCIDB_TEST_HOST", "localhost"),
                      port = Sys.getenv("SCIDB_TEST_PORT", 8083),
                      username = Sys.getenv("SCIDB_USER"),
                      password = Sys.getenv("SCIDB_TEST_PASSWORD"),
                      protocol = "https")
}
This vignette illustrates using SciDB from R by example. For more detailed information on the functions described in this vignette, see the manual pages in the package.
From CRAN (stable, but may lag features on GitHub by several months):
install.packages("scidb")
From the development repository on GitHub (stable branch):
devtools::install_github("Paradigm4/SciDBR")
"Stable" means that all CRAN checks and package unit tests pass when tested using the current SciDB release. We try to make sure that the scidb package works with all previous versions of SciDB but we only actively test against the current release version of the database. Other experimental and development branches exist; see the GitHub repository for a list.
The scidbconnect() function establishes a connection to SciDB, either to a simple
HTTP network service called shim (https://github.com/Paradigm4/shim) running on a SciDB
coordinator instance or directly to SciDB's HTTP API (for SciDB version 23.2 and higher).
The function may be safely called multiple times. Its return value contains the
SciDB database connection state in an object of class afl.
The network interface optionally supports SSL encryption and SciDB authentication or HTTP digest user authentication.
Connect to SciDB on the default shim port and localhost
library("scidb")
db <- scidbconnect()
Connect to shim on an encrypted port 8083 with example SciDB authentication
db <- scidbconnect(port=8083, username="root", password="password", protocol = 'https')
Use encrypted sessions when communicating with SciDB over public networks. SciDB user authentication is only supported by SciDB versions 15.7 and greater and only works over encrypted connections.
The scidb::scidbconnect() function returns a SciDB connection object of class afl.
In addition to storing the connection state, the returned object has a few
special methods. Printing the object shows a summary of the connection state.
Applying the ls() function to the object returns a list of SciDB arrays (subject
to any potential namespace-setting prefix expression). The object itself is
a list that contains the available SciDB AFL operator and macro functions
established upon connection. Apply ls.str() to the object to list all AFL
operators and macros.
print(db)       # summarize connection
scidb::ls(db)   # quick list of arrays
ls.str(db)      # quick list of AFL operators
The function ls.str(db) shows the formal AFL operator
arguments for each function. These functions can be used to compose AFL
expressions from R, discussed in more detail below.
Additionally, each listed AFL operator is present as an R function for the object
db. This experimental method for generating AFL operations functionally is
documented at vignette("afl_generation").
NOTE The list of operators and macros is established at connection time. If the database operators change after establishing the connection, for instance by loading a new SciDB plugin, then those changes will not be shown in the database connection object. New connection objects will show the current list of operators and macros.
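As a rough illustration of the note above, the following sketch checks whether a connection object lists a given operator. The helper has_operator() is hypothetical and not part of the scidb package. It assumes that names() on the connection object returns the operator and macro names, which seems to follow from the object being a list but is not confirmed here. A plain list stands in for the connection so the sketch runs without a database.

```r
# Hypothetical helper (not part of the scidb package).
# Assumption: names() on the connection object yields operator/macro names.
has_operator <- function(db, op) {
  op %in% names(db)
}

# Stand-in for a connection object created before a plugin was loaded.
stale_db <- list(build = function(...) NULL, apply = function(...) NULL)

has_operator(stale_db, "build")              # operator known at connection time
has_operator(stale_db, "grouped_aggregate")  # operator added later is not listed

# With a real database, connecting again picks up the current operator list:
# db <- scidb::scidbconnect()
```

If the check fails for an operator that was loaded after connecting, creating a new connection object is the remedy described in the note.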
iquery()
The simplest way to compose and execute SciDB queries is to use the iquery()
function. This directly runs arbitrary SciDB AFL queries supplied as
character strings or scidb objects, optionally returning results to R as a
data frame:
scidb::iquery(db, "build(<v:double>[i=1:2,2,0, j=1:3,1,0], i*j)", return=TRUE)
In real usage, queries are usually not entirely literal AFL expressions but depend
on parameters and variables in the R environment, such as the name of an array
or one of its attributes. This simple example shows a query that applies the
grouped_aggregate operator to an array created by uploading an R
data frame (see vignette("advanced")), interpolating the array's name into the query:
x <- scidb::as.scidb(db, iris) # upload the iris data frame to SciDB
scidb::iquery(db, paste("grouped_aggregate(", x@name, ", Species, avg(Petal_Length) as avg)"), return=TRUE)
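When several R values feed into a query, building the string in a small function can be easier to read than a long paste() call. The sketch below uses base R's sprintf() and runs without a database. The function name grouped_avg_query() and the array name "R_array_example" are hypothetical placeholders. In practice the array name would come from x@name, as in the example above.

```r
# Hypothetical helper (not part of the scidb package): build the same kind of
# grouped_aggregate query string from R values.
grouped_avg_query <- function(array_name, group_by, attribute, alias = "avg") {
  sprintf("grouped_aggregate(%s, %s, avg(%s) as %s)",
          array_name, group_by, attribute, alias)
}

# "R_array_example" is a placeholder; pass x@name for an uploaded array.
q <- grouped_avg_query("R_array_example", "Species", "Petal_Length")
print(q)

# The resulting string can then be run with:
# scidb::iquery(db, q, return = TRUE)
```

Keeping string construction separate from execution also makes it possible to inspect a query before sending it to the database.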
Experimental methods for building AFL more programmatically are shown in vignette("afl_generation").
When returning data from a call to iquery(), the options binary=FALSE, arrow=TRUE
direct the transfer and parsing of the result to use the Arrow IPC format, which
can be much faster when the returned array is large.
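A minimal sketch of passing these options follows. The wrapper name iquery_arrow() is hypothetical; the binary and arrow arguments are the ones named in the text above. Calling the wrapper needs a live connection object, and probably an installed arrow package on the R side, which is an assumption here.

```r
# Hypothetical wrapper (not part of the scidb package): run a query and
# request the result over the Arrow IPC path.
iquery_arrow <- function(db, query) {
  scidb::iquery(db, query, return = TRUE, binary = FALSE, arrow = TRUE)
}

# Usage, given a live connection `db`:
# result <- iquery_arrow(db, "build(<v:double>[i=1:2,2,0, j=1:3,1,0], i*j)")
```

Defining the wrapper does not contact the database; the query runs only when the function is called.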
Support for uploading R data frames to SciDB arrays, and conversely for directly downloading
SciDB arrays as data frames outside of the iquery() convenience function, is
documented at vignette("advanced").
In addition to directly running AFL expressions via iquery(), the scidb package
supports composing AFL expressions programmatically. These
composition methods include using the database connection AFL functions and
mapping R expressions to AFL expressions, and are documented at vignette("afl_generation").
Details on package options, especially regarding authentication, namespaces, and roles,
are available in vignette("options").