Export the calculations in an R script to an .rda file that can be imported into the Clyde Analytical Platform. This allows models fitted in R to be used as derived columns or packaged into RESTful web services for real-time and batch scoring.
clydeExport(exportFileName, predFuncName, predColumnList, libraryList = NULL)
exportFileName | The name of the export file, with the .rda extension.
predFuncName | The name of the user-defined prediction function, as a string (see 'Details').
predColumnList | The names of the columns returned by the prediction function, as a character vector or a named list (see 'Details').
libraryList | The packages needed by the prediction function, as a character vector. If NULL, only the packages attached by default can be used inside the prediction function.
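As a rough local check of which packages would need to go into libraryList, one option is to compare the packages a prediction function uses against those a standard R session attaches at startup. This is only a sketch: the helper name `nonDefaultPackages` is hypothetical, and it assumes the platform's default set matches the local session's `defaultPackages` option, which is not confirmed here.

```r
# Hypothetical helper (not part of the clydeExport API): returns the packages
# in 'needed' that a standard local R session does not attach at startup.
nonDefaultPackages <- function(needed) {
  # 'base' is always attached; the others are listed in the defaultPackages option
  defaults <- c("base", getOption("defaultPackages"))
  setdiff(needed, defaults)
}

# Candidate value for libraryList, under the assumption stated above
nonDefaultPackages(c("stats", "randomForest"))
```

Anything the helper returns is a candidate for libraryList; packages it filters out are attached by default locally and, under the stated assumption, need no entry.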
The script must contain a user-defined function that takes an explicit list of formal arguments and returns a data frame. The predFuncName argument holds the name of this function as a string (see 'Examples'). The predColumnList argument lists the columns computed by that function and can be either a character vector or a named list. If it is a character vector, every computed column is assumed to be of type 'numeric'. Otherwise it should be a named list giving the return type of each computed column ('integer', 'numeric' or 'factor'; see the Random Forest example). The libraryList argument should name the packages the prediction function needs, if they are not attached by default.
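The two accepted forms of predColumnList can be illustrated with a small pre-export check. This is a minimal sketch in base R, not part of clydeExport: the names `checkPredColumns` and `toyPredict` are hypothetical, and the check only mirrors the rules described above (a character vector implies 'numeric' for every column; a named list states each type).

```r
# Hypothetical pre-export check (not part of the clydeExport API): verifies
# that a prediction function returns a data frame whose columns match the
# names and types declared in predColumnList.
checkPredColumns <- function(predFunc, args, predColumnList) {
  if (is.character(predColumnList)) {
    # character vector form: every column is assumed to be 'numeric'
    types <- rep("numeric", length(predColumnList))
    names(types) <- predColumnList
  } else {
    # named list form: each element gives the type of one column
    types <- unlist(predColumnList)
  }
  stopifnot(all(types %in% c("integer", "numeric", "factor")))
  res <- do.call(predFunc, args)
  stopifnot(is.data.frame(res), all(names(types) %in% names(res)))
  # compare each declared type with the actual column
  ok <- vapply(names(types), function(col) {
    switch(types[[col]],
           integer = is.integer(res[[col]]),
           numeric = is.numeric(res[[col]]),
           factor  = is.factor(res[[col]]))
  }, logical(1))
  all(ok)
}

# Toy prediction function standing in for a fitted model
toyPredict <- function(x1, x2) {
  data.frame(score = x1 + 2 * x2,
             label = factor(ifelse(x1 > x2, "a", "b")))
}

toyArgs <- list(x1 = c(1, 5), x2 = c(2, 3))
# character vector form
checkPredColumns(toyPredict, toyArgs, c("score"))
# named list form
checkPredColumns(toyPredict, toyArgs, list(score = "numeric", label = "factor"))
```

Running a check like this before exporting can catch a mismatch between the declared columns and what the function actually returns.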
# GLM
data <- data.frame(
  x1 = c(5,10,15,20,30,40,60,80,100),
  x2 = c(118,58,42,35,27,25,21,19,18),
  y = c(69,35,26,21,18,16,13,12,12))
g1 <- glm(y ~ x1 + x2, data=data)
# function storing the calculations, takes as arguments the predictors used in the glm model
# returns a dataframe with one column named 'y_pred'
glmPredict <- function(x1, x2) {
  df <- as.data.frame(cbind(x1, x2))
  res <- as.data.frame(predict(g1, newdata=df, type="response"))
  # set the column name(s) for the returned data frame
  names(res) <- "y_pred"
  return(res)
}
# the argument predFuncName points to the user-defined function storing the calculations
# the argument predColumnList is set to the columns names of the function result
# no need to specify the libraryList argument, since the glm prediction uses only the base and stats
# packages
clydeExport("glm.rda", "glmPredict", c("y_pred"))
# Random Forest
library(randomForest)
data(iris)
# replace dot(.) in names(data) with underscore(_)
names(iris) <- gsub("\\.", "_", names(iris))
set.seed(71)
rf <- randomForest(Species ~ ., data=iris)
# this function takes as arguments the predictors used in the RF model
# returns a data frame with one column named 'Species_pred'
rfPredict <- function(Sepal_Length, Sepal_Width, Petal_Length, Petal_Width) {
  df <- as.data.frame(cbind(Sepal_Length, Sepal_Width, Petal_Length, Petal_Width))
  res <- as.data.frame(predict(rf, newdata=df))
  names(res) <- "Species_pred"
  return(res)
}
# the predColumnList is a named list, specifying that the returned value is a factor
# package 'randomForest' is needed in the 'rfPredict' function, so set the libraryList argument
clydeExport("rf.rda", "rfPredict", list(Species_pred = "factor"), libraryList = c("randomForest"))
# Multiple models: the prediction function returns several columns
library(rpart)
data(iris)
names(iris) <- gsub("\\.", "_", names(iris))
# predictors are listed explicitly: with a transformed response, '.' would also
# include Species itself as a predictor
g2 <- glm(I(Species == "virginica") ~ Sepal_Length + Sepal_Width + Petal_Length + Petal_Width,
          data=iris, family=binomial(logit))
t1 <- rpart(Species ~ ., data=iris)
allPredict <- function(Sepal_Length, Sepal_Width, Petal_Length, Petal_Width) {
  df <- as.data.frame(cbind(Sepal_Length, Sepal_Width, Petal_Length, Petal_Width))
  # pass newdata so the glm scores the supplied rows rather than the training data
  res <- as.data.frame(predict(g2, newdata=df, type="response"))
  names(res) <- "virginica_pred"
  res$Species_pred <- predict(t1, newdata=df, type="class")
  res$agree <- as.integer((res$virginica_pred > 0.5) == (res$Species_pred == "virginica"))
  return(res)
}
clydeExport("all.rda", "allPredict",
            list(virginica_pred = "numeric", Species_pred = "factor", agree = "integer"),
            c("rpart"))