exa.createScript: Deploys an R function as an UDF in the EXASolution database.
In EXASOL/r-exasol: R Interface & SDK for the EXASOL Database

exa.createScript

R Documentation

Deploys an R function as an UDF in the EXASolution database.

Description

This function takes an R function and creates a R UDF script on the EXASolution database. A CREATE SCRIPT call will be used behind the scenes. The return value is a function that, when executed, will execute the script on the database server.

Usage

exa.createScript(
  channel,
  name,
  func = NA,
  env = list(),
  initCode = NA,
  cleanCode = NA,
  inType = SET,
  inArgs = list(),
  outType = EMITS,
  outArgs = list(),
  outputAddress = NA,
  replaceIfExists = TRUE,
  mockOnly = FALSE
)

Arguments

`channel`	The RODBC connection channel, typically created via odbcConnect.
`name`	The script will be created in the database with this name.
`func`	The R function to be created as a UDF R script in the database.
`env`	A list of values which will be available in the UDF function under the same name. For example if you pass `list(a=2)` for this argument, you can access the value a in the function via `env$a`.
`initCode`	This code block will be executed once on each parallel instance of the database running the script, before the first call of the function.
`cleanCode`	This code block will be executed once on each parallel instance of the database running the script, after the function was called the last time.
`inType`	The input type of the UDF script, either `SET` or `SCALAR`. `SET` will call the function once for each group, `SCALAR` will call the function once for each record. Please read the EXASolution manual about UDF scripts for details.
`inArgs`	Vector of strings specifying the names and types of the input arguments for the UDF script. Example: `inArgs = c("k INT", "v VARCHAR(10)")`
`outType`	The output type of the UDF script, either `EMITS` or `RETURNS`. For `EMITS`, the function emits any number of values. For `RETURNS`, the function emits just a single value.
`outArgs`	Vector of strings specifying the names and types of the output arguments of the UDF script. Example: `inArgs = c("outputstring VARCHAR(10)")`
`outputAddress`	This parameters specifies the address and port of the optional python output service is listening on. For example: `c("192.168.1.10", 3000)`.Please read the README.txt of this R package for details.
`replaceIfExists`	Boolean whether the script shall be replaced if it already exists. Either `TRUE` or `FALSE`.
`mockOnly`	Boolean, default FALSE. This parameter is useful for unit-testing if the ODBC connection is not available. Setting mockOnly=TRUE will not install the UDF function to the EXASOL database.

Details

We recommend to read the EXASolution manual about UDF scripts for a better understanding.

Value

This function returns a function that, when called, will execute the script on the server. With the call you have to specify to which data it shall be applied. The returned function generates and executes a SELECT SQL statement behind the scenes. It has the following signature:

function(..., table = NA, where = NA, groupBy = NA, restQuery = "", returnSQL = FALSE, reader = NA, server = NA)

`...`	The first string parameters define the SQL expressions that will be used as the input for the UDF script. Typically this is one or more column names as you see in the example below.
`table`	A string with the table name to which the function shall be applied to. You can specify quoted names the following: table='myschema."MyQuotedTable"'
`where`	A string with the where clause (SQL) to filter the records.
`groupBy`	A string with the group-by clause (SQL) that will be used to group the. This is especially important for SET UDF scripts that will be called once for each group.
`returnSQL`	Boolean value. For TRUE, the autogenerated SQL statement will be returned, but NOT executed.
`restQuery`	A string with additional SQL code that will be appended at the end of the autogenerated query, e.g. ORDER BY or HAVING.
`reader`	For internal usage only.
`server`	For internal usage only.
`RETURN VALUE`	The return value of the function is the result of the SELECT query. The query will be executed internally with the exa.readData function.

Author(s)

EXASOL AG <opensource@exasol.com>

Examples

## Not run: 

# This example creates a simple SET-EMITS script and executes
# it the table footable.
require(RODBC)
require(exasol)

# Connect via RODBC with configured DSN
C <- odbcConnect("exasolution")

# Generate example data frame with two groups
# of random values with different means.
valsMean0  <- rnorm(10, 0)
valsMean50 <- rnorm(10, 50)
twogroups <- data.frame(group = rep(1:2, each = 10),
                        value = c(valsMean0, valsMean50))

# Write example data to a table
odbcQuery(C, "CREATE SCHEMA test")
odbcQuery(C, "CREATE TABLE test.twogroups (groupid INT, val DOUBLE)")
exa.writeData(C, twogroups, tableName = "test.twogroups")

# Create the R function as an UDF R script in the database
# In our case it computes the mean for each group.
testscript <- exa.createScript(
  C,
  "test.mymean",
  function(data) {
    data$next_row(NA); # read all values from this group into a single vector
    data$emit(data$groupid[[1]], mean(data$val))
  },
  inArgs = c( "groupid INT", "val DOUBLE" ),
  outArgs = c( "groupid INT", "mean DOUBLE" ) )

# Run the function, grouping by the groupid column
# and aggregating on the "val" column. This returns
# two values which are close to the means of the two groups.
testscript("groupid", "val", table = "test.twogroups" , groupBy = "groupid")

## End(Not run)

EXASOL/r-exasol documentation built on June 2, 2025, 1:44 p.m.