This tutorial demonstrates how to deploy a model as a web service on Azure Kubernetes Service (AKS). AKS is good for high-scale production deployments; use it if you need one or more of the following capabilities:
You will learn to:
If you don’t have access to an Azure ML workspace, follow the setup tutorial to configure and create a workspace.
Start by setting up your environment. This includes importing the azuremlsdk package and connecting to your workspace.
library(azuremlsdk)
Instantiate a workspace object from your existing workspace. The following code will load the workspace details from a config.json file if you previously wrote one out with write_workspace_config()
.
ws <- load_workspace_from_config()
Or, you can retrieve a workspace by directly specifying your workspace details:
ws <- get_workspace("<your workspace name>", "<your subscription ID>", "<your resource group>")
In this tutorial we will deploy a model that was trained in one of the samples. The model was trained with the Iris dataset and can be used to determine if a flower is one of three Iris flower species (setosa, versicolor, virginica). We have provided the model file (model.rds
) for the tutorial; it is located in the deploy-to-aks
subfolder of this vignette.
First, register the model to your workspace with register_model()
. A registered model can be any collection of files, but in this case the R model file is sufficient. Azure ML will use the registered model for deployment.
model <- register_model(ws, model_path = "deploy-to-aks/model.rds", model_name = "iris_model", description = "Predict an Iris flower type")
When deploying a web service to AKS, you deploy to an AKS cluster that is connected to your workspace. There are two ways to connect an AKS cluster to your workspace:
attach_aks_compute()
method.Creating or attaching an AKS cluster is a one-time process for your workspace. You can reuse this cluster for multiple deployments. If you delete the cluster or the resource group that contains it, you must create a new cluster the next time you need to deploy.
In this tutorial, we will go with the first method of provisioning a new cluster. See the create_aks_compute()
reference for the full set of configurable parameters. If you pick custom values for the agent_count
and vm_size
parameters, you need to make sure agent_count
multiplied by vm_size
is greater than or equal to 12
virtual CPUs.
``` {r provision_cluster, eval=FALSE} aks_target <- create_aks_compute(ws, cluster_name = 'myakscluster')
wait_for_provisioning_completion(aks_target, show_output = TRUE)
The Azure ML SDK does not provide support for scaling an AKS cluster. To scale the nodes in the cluster, use the UI for your AKS cluster in the Azure portal. You can only change the node count, not the VM size of the cluster. ## Deploy as a web service ### Define the inference dependencies To deploy a model, you need an **inference configuration**, which describes the environment needed to host the model and web service. To create an inference config, you will first need a scoring script and an Azure ML environment. The scoring script (`entry_script`) is an R script that will take as input variable values (in JSON format) and output a prediction from your model. For this tutorial, use the provided scoring file `score.R`. The scoring script must contain an `init()` method that loads your model and returns a function that uses the model to make a prediction based on the input data. See the [documentation](https://azure.github.io/azureml-sdk-for-r/reference/inference_config.html#details) for more details. Next, define an Azure ML **environment** for your script’s package dependencies. With an environment, you specify R packages (from CRAN or elsewhere) that are needed for your script to run. You can also provide the values of environment variables that your script can reference to modify its behavior. By default Azure ML will build a default Docker image that includes R, the Azure ML SDK, and additional required dependencies for deployment. See the documentation here for the full list of dependencies that will be installed in the default container. You can also specify additional packages to be installed at runtime, or even a custom Docker image to be used instead of the base image that will be built, using the other available parameters to [`r_environment()`](https://azure.github.io/azureml-sdk-for-r/reference/r_environment.html). ```r r_env <- r_environment(name = "deploy_env")
Now you have everything you need to create an inference config for encapsulating your scoring script and environment dependencies.
``` {r create_inference_config, eval=FALSE} inference_config <- inference_config( entry_script = "score.R", source_directory = "deploy-to-aks", environment = r_env)
### Deploy to AKS Now, define the deployment configuration that describes the compute resources needed, for example, the number of cores and memory. See the [`aks_webservice_deployment_config()`](https://azure.github.io/azureml-sdk-for-r/reference/aks_webservice_deployment_config.html) for the full set of configurable parameters. ``` {r deploy_config, eval=FALSE} aks_config <- aks_webservice_deployment_config(cpu_cores = 1, memory_gb = 1)
Now, deploy your model as a web service to the AKS cluster you created earlier.
aks_service <- deploy_model(ws, 'my-new-aksservice', models = list(model), inference_config = inference_config, deployment_config = aks_config, deployment_target = aks_target) wait_for_deployment(aks_service, show_output = TRUE)
To inspect the logs from the deployment:
get_webservice_logs(aks_service)
If you encounter any issue in deploying the web service, please visit the troubleshooting guide.
Now that your model is deployed as a service, you can test the service from R using invoke_webservice()
. Provide a new set of data to predict from, convert it to JSON, and send it to the service.
``` {r test_service, eval=FALSE} library(jsonlite)
plant <- data.frame(Sepal.Length = 6.4, Sepal.Width = 2.8, Petal.Length = 4.6, Petal.Width = 1.8)
predicted_val <- invoke_webservice(aks_service, toJSON(plant)) message(predicted_val)
You can also get the web service’s HTTP endpoint, which accepts REST client calls. You can share this endpoint with anyone who wants to test the web service or integrate it into an application. ``` {r eval=FALSE} aks_service$scoring_uri
When deploying to AKS, key-based authentication is enabled by default. You can also enable token-based authentication. Token-based authentication requires clients to use an Azure Active Directory account to request an authentication token, which is used to make requests to the deployed service.
To disable key-based auth, set the auth_enabled = FALSE
parameter when creating the deployment configuration with aks_webservice_deployment_config()
.
To enable token-based auth, set token_auth_enabled = TRUE
when creating the deployment config.
If key authentication is enabled, you can use the get_webservice_keys()
method to retrieve a primary and secondary authentication key. To generate a new key, use generate_new_webservice_key()
.
If token authentication is enabled, you can use the get_webservice_token()
method to retrieve a JWT token and that token's expiration time. Make sure to request a new token after the token's expiration time.
Delete the resources once you no longer need them. Do not delete any resource you plan on still using.
Delete the web service:
delete_webservice(aks_service)
Delete the registered model:
delete_model(model)
Delete the AKS cluster:
delete_compute(aks_target)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.