# Vignette setup: no comment prefix on printed output, compact printing
knitr::opts_chunk$set(comment = "")
options(width = 120, max.print = 100)
# Connect over HTTPS with credentials only when SCIDB_TEST_WITH_SECURITY is "true"
build_with_security <- (Sys.getenv("SCIDB_TEST_WITH_SECURITY", "") == "true")
db <- if (!build_with_security) {
  scidb::scidbconnect(host = Sys.getenv("SCIDB_TEST_HOST", "localhost"),
                      port = Sys.getenv("SCIDB_TEST_PORT", 8080))
} else {
  scidb::scidbconnect(host = Sys.getenv("SCIDB_TEST_HOST", "localhost"),
                      port = Sys.getenv("SCIDB_TEST_PORT", 8083),
                      username = Sys.getenv("SCIDB_USER"),
                      password = Sys.getenv("SCIDB_TEST_PASSWORD"),
                      protocol = "https")
}

Binding SciDB arrays and query expressions to R variables

SciDB array view objects

The scidb() function returns an object of class scidb that contains a reference to a SciDB array or query expression. The returned object is presented with a data frame-like view that lists the SciDB schema components.

x <- scidb::scidb(db, "build(<v:double>[i=1:2,2,0, j=1:3,1,0], i*j)")
x

The R variable x is a kind of SciDB array view: an unevaluated SciDB query expression. SciDB evaluates x either lazily, when its value is needed, or when explicitly requested. x is an S4 R object; its "name" slot contains the AFL expression corresponding to the object.
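
Because the object carries only an AFL expression, larger queries can be composed by nesting expression strings before anything is evaluated. The following is a minimal sketch using plain string manipulation; the filter predicate v > 2 and the variable names are illustrative assumptions, and passing the result to scidb::scidb(db, ...) would require a live connection.

```r
# AFL expression from the example above (the text held in the "name" slot of x)
build_expr <- "build(<v:double>[i=1:2,2,0, j=1:3,1,0], i*j)"

# Wrap it in the AFL filter operator; nothing is evaluated here, this is only a string
filter_expr <- sprintf("filter(%s, v > 2)", build_expr)

print(filter_expr)
```

Composing strings this way keeps the work on the R side trivial; the database sees the full nested expression only when the result is requested.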

Storing query results in SciDB

Use the store() function to evaluate and materialize views in SciDB. store() saves the result of the evaluation in a new named SciDB array and returns a new scidb R object that points to that array. The array name can be specified or left to be generated automatically, and the array can optionally be created as a SciDB temporary array.

y <- scidb::store(db, x, temp=TRUE)
y

Note that the SciDB expression associated with the y R variable is now a named SciDB array (automatically named in this case); compare with the SciDB expression for x above.

Lifetime and garbage collection

SciDB array values associated with R variables are tied to R's garbage collector by default (unless the argument gc=FALSE is specified). When the R variable's contents are garbage-collected by R, the associated SciDB array is removed.

# Observe that y's corresponding array is in the list
yname <- y@name
yname %in% scidb::ls(db)$name
# Remove and garbage collect
rm(y)
gc()
# Observe that y's corresponding array is no longer in the list
yname %in% scidb::ls(db)$name  
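
The behaviour above resembles R's general finalizer mechanism, in which a cleanup function runs when an object is collected. The sketch below shows that mechanism in base R without a SciDB connection; fake_db, drop_array and make_handle are hypothetical names invented for illustration, and this is not a description of the package's internals.

```r
# Hypothetical stand-in for the database catalog: array names stored in an environment
fake_db <- new.env()
assign("tmp_array_1", TRUE, envir = fake_db)

# Cleanup routine, defined at top level so it does not capture the handle itself
drop_array <- function(handle) {
  if (exists(handle$name, envir = fake_db, inherits = FALSE)) {
    rm(list = handle$name, envir = fake_db)
  }
}

# Create a handle (an environment) whose collection triggers drop_array()
make_handle <- function(name) {
  handle <- new.env()
  handle$name <- name
  reg.finalizer(handle, drop_array)
  handle
}

h <- make_handle("tmp_array_1")
before <- exists("tmp_array_1", envir = fake_db, inherits = FALSE)

# Drop the only reference and collect; the finalizer removes the catalog entry
rm(h)
invisible(gc())
after <- exists("tmp_array_1", envir = fake_db, inherits = FALSE)
```

Here before records that the entry exists while the handle is alive, and after is expected to be FALSE once the finalizer has run.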

The schema() function

Use the schema() function to display the SciDB schema of scidb objects verbatim or in parsed detail for attributes and dimensions.

scidb::schema(x)
scidb::schema(x, "attributes")
scidb::schema(x, "dimensions")
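
To relate the parsed view to the raw schema string: each dimension in the build expression used earlier is written as name=start:end,chunk,overlap. The sketch below splits such specifications with base R. parse_dim is a hypothetical helper, not part of the package, and it handles only this simple integer-bounded form.

```r
# Hypothetical helper: parse one dimension spec such as "i=1:2,2,0"
parse_dim <- function(spec) {
  pattern <- "^([A-Za-z_][A-Za-z0-9_]*)=(-?[0-9]+):(-?[0-9]+),([0-9]+),([0-9]+)$"
  m <- regmatches(spec, regexec(pattern, spec))[[1]]
  if (length(m) == 0) stop("unrecognized dimension specification: ", spec)
  data.frame(name = m[2],
             start = as.integer(m[3]),
             end = as.integer(m[4]),
             chunk = as.integer(m[5]),
             overlap = as.integer(m[6]),
             stringsAsFactors = FALSE)
}

# The two dimensions from build(<v:double>[i=1:2,2,0, j=1:3,1,0], i*j)
dims <- do.call(rbind, lapply(c("i=1:2,2,0", "j=1:3,1,0"), parse_dim))
print(dims)
```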

Uploading R data frames to SciDB

The package provides limited convenience functions for converting and uploading R values to SciDB. The upload mechanism is much less efficient than available SciDB bulk load methods and should only be used for small to moderate-sized data. The following R objects are supported:

Factor values are uploaded as character values, replacing factor levels with their corresponding character strings.

The as.scidb() function returns a reference to a SciDB array containing the uploaded data. The following example uploads a data frame to SciDB, warning us that variable names were changed because SciDB does not support dots in names, and then downloads the resulting SciDB object data back into R.

x <- scidb::as.scidb(db, head(iris))
scidb::as.R(x)

Note that SciDB dimension indices (i above) are appended to the data when downloaded.
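
Converting factors and renaming columns before calling as.scidb() keeps the uploaded names predictable. The following is a minimal sketch in base R that needs no SciDB connection; replacing dots with underscores is an assumption chosen here for illustration, not necessarily the substitution the package applies.

```r
df <- head(iris)

# Convert factor columns to character, mirroring how factors are uploaded
is_factor <- vapply(df, is.factor, logical(1))
df[is_factor] <- lapply(df[is_factor], as.character)

# Replace dots in column names; the underscore is an assumed choice
names(df) <- gsub(".", "_", names(df), fixed = TRUE)

str(df)
```

The prepared data frame could then be passed to scidb::as.scidb(db, df), given a live connection.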



Paradigm4/SciDBR documentation built on Nov. 9, 2023, 4:58 a.m.