Loads a model definition into Causata for scoring.

Description

Three different sets of configuration information are combined to upload a model to Causata for scoring.

Usage

UploadModel(causata.config, model.definition, variable.definition, verbose=FALSE)

UploadModelWithValidation(causata.config, model.definition, variable.definition,
  connection, query.function, record.error.max, verbose=FALSE, ...)

Arguments

causata.config

An object from CausataConfig.

model.definition

An object from ModelDefinition.

variable.definition

An object from VariableDefinition.

verbose

If TRUE then information is printed to the console.

connection

An object from Connect.

query.function

A function that returns a query string or Query object. The first argument to this function must accept a character string representing a variable name that will be added to the query. See the Details section below for more information.

record.error.max

The largest acceptable absolute difference between the score calculated by R and the score calculated by Causata for a record.

...

Extra arguments are passed to the query.function.

Details

UploadModel translates a model into PMML and uploads it to Causata, where it will become available as a new variable.

UploadModelWithValidation adds validation to the upload process. The process works as follows:

  1. The model is uploaded under a randomly generated variable name.

  2. A new query is executed using the provided query.function. The new query will include the variables originally used to train the model, and the new model variable from Causata. The R scoring process is re-applied to the new data, and the results from R and Causata are compared. The validation is deemed successful if the difference in results is below the value provided in record.error.max.

    If the validation was successful then the model is re-uploaded using the variable name provided in model.definition. If the validation failed then

There are two important requirements for the query function:

  1. The query function must accept a variable name as its first argument – this argument is used to add the score variable to the query.

  2. The query function must return a query that includes all of the variables originally used to train the model. The recommended practice is to write one function to extract the training data, then re-use that same function for the validation process.

Value

For UploadModel, if the upload was successful then a boolean TRUE is returned. If the upload failed then an error message is returned.

UploadModelWithValidation returns a list with the following elements:

result

A boolean that is TRUE if the validation was successful and FALSE otherwise.

validation.data

A dataframe containing data used in the validation process.

errors

An array of error values, which are the absolute value of the difference between prediction and actuals.

prediction

The model scores as calculated by R.

model.matrix

The model matrix used by R to generate scores.

actuals

The model scores as calculated by Causata.

problematic.indices

A logical array that is TRUE where the error value exceeds record.error.max and FALSE otherwise.
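
The relationship between errors, problematic.indices and result can be sketched in plain R. This is a minimal illustration with made-up scores, not the package's internal code: the values of prediction, actuals and record.error.max are hypothetical, and treating the validation as successful only when no record exceeds the threshold is an assumption based on the descriptions above.

```r
# Hypothetical scores for five records (illustrative values only).
prediction <- c(0.10, 0.25, 0.40, 0.55, 0.70)  # scores as calculated by R
actuals    <- c(0.10, 0.25, 0.45, 0.55, 0.70)  # scores as calculated by Causata
record.error.max <- 0.01                       # hypothetical threshold

# Absolute difference between prediction and actuals, per record.
errors <- abs(prediction - actuals)

# TRUE where a record's error exceeds the threshold.
problematic.indices <- errors > record.error.max

# Assumption: validation succeeds only if no record is problematic.
result <- !any(problematic.indices)

print(which(problematic.indices))  # positions of the records that disagree
print(result)
```

Inspecting validation.data at the positions flagged in problematic.indices is a natural next step when the validation fails.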

Author(s)

David Barker, Justin Hemann <support@causata.com>

See Also

CausataConfig, ModelDefinition, VariableDefinition, Connect, Query, CausataData.

Examples

# An example query function for UploadModelWithValidation
# The focal point query below returns profiles from the most recent
# ad impression where the product name is "Test Product".
query.function <- function(variables, more.variables=c(), limit=100){
  query <- paste(
    "select", BacktickCollapse(c(variables, more.variables)),
    "from Scenarios S,",
    "     `ad-impression` E",
    "where S.profile_id = E.profile_id",
    "  and S.focal_point = E.timestamp",
    "  and is_last(E.timestamp)",
    "and exists",
    "( select *",
    "  from `ad-impression` A",
    "  where A.`product-name` = 'Test Product'",
    ")",
    "Limit", limit)
  return(query)
}
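
The sketch below shows how a single query function might serve both training and validation, as recommended in the Details section. It runs without a Causata connection, so it relies on assumptions throughout: BacktickCollapseSketch is a hypothetical stand-in for BacktickCollapse (assumed to wrap each name in backticks and join with commas), the query is shortened, the variable names are invented, and passing the training variables through more.variables during validation is one possible arrangement rather than confirmed package behavior.

```r
# Hypothetical stand-in for BacktickCollapse; the real helper may differ.
BacktickCollapseSketch <- function(x) {
  paste0("`", x, "`", collapse = ", ")
}

# Shortened query function with the same argument shape as the example above.
# The first argument receives the variable name(s) to add to the query.
query.function.sketch <- function(variables, more.variables = c(), limit = 100) {
  paste("select", BacktickCollapseSketch(c(variables, more.variables)),
        "from Scenarios S",
        "Limit", limit)
}

# Invented training variable names, for illustration only.
training.variables <- c("total-spend", "page-views")

# Training: the function is called with the training variables.
training.query <- query.function.sketch(training.variables)

# Validation: the score variable comes first, and the training variables
# are supplied as an extra argument so the query still includes them.
validation.query <- query.function.sketch("score-tmp",
                                          more.variables = training.variables)

print(training.query)
print(validation.query)
```

Re-using one function this way keeps the training and validation queries in step, so a change to the selection criteria applies to both.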
