SageMakerClarifyProcessor: SageMakerClarifyProcessor Class

SageMakerClarifyProcessor R Documentation

SageMakerClarifyProcessor Class

Description

Handles SageMaker Processing task to compute bias metrics and explain a model.

Super class

sagemaker.common::Processor -> SageMakerClarifyProcessor

Public fields

job_name_prefix

Processing job name prefix

Methods

Method new()

Initializes a 'Processor' instance that computes bias metrics and model explanations.

Usage
SageMakerClarifyProcessor$new(
  role,
  instance_count,
  instance_type,
  volume_size_in_gb = 30,
  volume_kms_key = NULL,
  output_kms_key = NULL,
  max_runtime_in_seconds = NULL,
  sagemaker_session = NULL,
  env = NULL,
  tags = NULL,
  network_config = NULL,
  job_name_prefix = NULL,
  version = NULL
)
Arguments
role

(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.

instance_count

(int): The number of instances to run a processing job with.

instance_type

(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.

volume_size_in_gb

(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).

volume_kms_key

(str): A KMS key for the processing volume (default: NULL).

output_kms_key

(str): The KMS key ID for processing job outputs (default: NULL).

max_runtime_in_seconds

(int): Timeout in seconds (default: NULL). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.

sagemaker_session

(:class:'~sagemaker.session.Session'): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain.

env

(dict[str, str]): Environment variables to be passed to the processing jobs (default: NULL).

tags

(list[dict]): List of tags to be passed to the processing job (default: NULL). For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.

network_config

(:class:'~sagemaker.network.NetworkConfig'): A :class:'~sagemaker.network.NetworkConfig' object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

job_name_prefix

(str): Processing job name prefix.

version

(str): The Clarify version to use.
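
As a rough illustration of the argument shapes described above, the sketch below builds 'env' and 'tags' values in plain R. It assumes that the Python-style types map to R as a named list (dict[str, str]) and a list of named lists (list[dict]), and that each tag uses the 'Key' and 'Value' fields of the AWS Tag API. All names and values are hypothetical.

```r
# Assumption: dict[str, str] corresponds to a named list of strings in R.
env <- list(LOG_LEVEL = "INFO", STAGE = "dev")

# Assumption: list[dict] corresponds to a list of named lists, each with
# the 'Key' and 'Value' fields used by the AWS Tag API.
tags <- list(
  list(Key = "project", Value = "fairness-audit"),
  list(Key = "owner", Value = "data-science")
)

# Basic shape checks before passing the values on to the constructor.
env_ok <- !is.null(names(env)) && all(vapply(env, is.character, logical(1)))
tags_ok <- all(vapply(
  tags,
  function(tag) setequal(names(tag), c("Key", "Value")),
  logical(1)
))
```

Values shaped like this would then be supplied as 'env = env, tags = tags' in the call to SageMakerClarifyProcessor$new().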


Method run()

Overrides the base class method and defers to the specific run_* methods.

Usage
SageMakerClarifyProcessor$run()

Method run_pre_training_bias()

Runs a ProcessingJob to compute the requested bias 'methods' on the input data. The requested methods compare metrics (e.g. fraction of examples) for the sensitive group against the other examples.
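
To make the comparison concrete, here is a minimal sketch in plain R of two pre-training metrics on a toy data set: class imbalance (CI) and difference in proportions of labels (DPL). It does not call the package. The data, the group codes "a" and "d", and the helper names are made up for illustration, and the formulas follow the commonly documented definitions rather than the code that Clarify actually runs.

```r
# Toy data: 'group' marks the sensitive group ("d") vs the others ("a"),
# 'label' is the observed binary outcome. All values are hypothetical.
toy <- data.frame(
  group = c("a", "a", "a", "a", "a", "a", "d", "d", "d", "d"),
  label = c(1, 1, 1, 0, 1, 0, 1, 0, 0, 0)
)

# Class imbalance: (n_a - n_d) / (n_a + n_d), in [-1, 1].
class_imbalance <- function(group, sensitive) {
  n_d <- sum(group == sensitive)
  n_a <- sum(group != sensitive)
  (n_a - n_d) / (n_a + n_d)
}

# Difference in proportions of labels: q_a - q_d, where q is the share of
# positive labels in each group.
label_proportion_diff <- function(group, label, sensitive) {
  q_d <- mean(label[group == sensitive] == 1)
  q_a <- mean(label[group != sensitive] == 1)
  q_a - q_d
}

ci <- class_imbalance(toy$group, "d")
dpl <- label_proportion_diff(toy$group, toy$label, "d")
```

In this toy data the sensitive group is both smaller and less often labelled positive, so both values come out positive.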

Usage
SageMakerClarifyProcessor$run_pre_training_bias(
  data_config,
  data_bias_config,
  methods = "all",
  wait = TRUE,
  logs = TRUE,
  job_name = NULL,
  kms_key = NULL,
  experiment_config = NULL
)
Arguments
data_config

(:class:'~sagemaker.clarify.DataConfig'): Config of the input/output data.

data_bias_config

(:class:'~sagemaker.clarify.BiasConfig'): Config of sensitive groups.

methods

(str or list[str]): Selector of a subset of potential metrics. Defaults to computing all.

wait

(bool): Whether the call should wait until the job completes (default: TRUE).

logs

(bool): Whether to show the logs produced by the job. Only meaningful when 'wait' is TRUE (default: TRUE).

job_name

(str): Processing job name. If not specified, a name is composed of "Clarify-Pretraining-Bias" and the current timestamp.

kms_key

(str): The ARN of the KMS key that is used to encrypt the user code file (default: NULL).

experiment_config

(dict[str, str]): Experiment management configuration. Dictionary contains three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
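
The 'experiment_config' shape can be sketched in plain R as a named list restricted to the three documented keys. The helper 'check_experiment_config' and the example values are hypothetical and are not part of the package; this only illustrates one way to catch misspelled keys before submitting a job.

```r
# Keys named in the argument description above.
allowed_keys <- c("ExperimentName", "TrialName", "TrialComponentDisplayName")

# Hypothetical helper: checks that a config only uses the documented keys
# and that every value is a single string.
check_experiment_config <- function(config) {
  unknown <- setdiff(names(config), allowed_keys)
  if (length(unknown) > 0) {
    stop("Unknown experiment_config keys: ", paste(unknown, collapse = ", "))
  }
  is_string <- vapply(
    config,
    function(value) is.character(value) && length(value) == 1,
    logical(1)
  )
  if (!all(is_string)) {
    stop("All experiment_config values must be single strings")
  }
  invisible(config)
}

# All three keys are optional, so a partial config is accepted.
experiment_config <- check_experiment_config(list(
  ExperimentName = "clarify-demo",
  TrialName = "pretraining-bias-run"
))
```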


Method run_post_training_bias()

Runs a ProcessingJob to compute the requested bias 'methods' of the model predictions. Spins up a model endpoint and runs inference over the input examples in the 's3_data_input_path' to obtain predicted labels. The requested methods compare metrics (e.g. accuracy, precision, recall) for the sensitive group against the other examples.
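
The kind of comparison described above can be sketched in plain R on toy data: compute accuracy and recall separately for the sensitive group and for the other examples, then take the difference. The vectors, the group codes, and the helper 'group_metrics' are hypothetical; the sketch does not call the package or reproduce Clarify's own metric definitions.

```r
# Hypothetical observed labels, predicted labels, and group membership.
group     <- c("a", "a", "a", "a", "d", "d", "d", "d")
observed  <- c(1, 0, 1, 1, 1, 1, 0, 0)
predicted <- c(1, 0, 1, 0, 0, 1, 0, 1)

# Accuracy and recall for the rows selected by 'idx'.
group_metrics <- function(observed, predicted, idx) {
  obs <- observed[idx]
  pred <- predicted[idx]
  c(
    accuracy = mean(obs == pred),
    recall = sum(obs == 1 & pred == 1) / sum(obs == 1)
  )
}

others <- group_metrics(observed, predicted, group == "a")
sensitive <- group_metrics(observed, predicted, group == "d")

# Positive values mean the model performs better on the other examples.
metric_gap <- others - sensitive
```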

Usage
SageMakerClarifyProcessor$run_post_training_bias(
  data_config,
  data_bias_config,
  model_config,
  model_predicted_label_config,
  methods = "all",
  wait = TRUE,
  logs = TRUE,
  job_name = NULL,
  kms_key = NULL,
  experiment_config = NULL
)
Arguments
data_config

(:class:'~sagemaker.clarify.DataConfig'): Config of the input/output data.

data_bias_config

(:class:'~sagemaker.clarify.BiasConfig'): Config of sensitive groups.

model_config

(:class:'~sagemaker.clarify.ModelConfig'): Config of the model and its endpoint to be created.

model_predicted_label_config

(:class:'~sagemaker.clarify.ModelPredictedLabelConfig'): Config of how to extract the predicted label from the model output.

methods

(str or list[str]): Selector of a subset of potential metrics. Defaults to computing all.

wait

(bool): Whether the call should wait until the job completes (default: TRUE).

logs

(bool): Whether to show the logs produced by the job. Only meaningful when 'wait' is TRUE (default: TRUE).

job_name

(str): Processing job name. If not specified, a name is composed of "Clarify-Posttraining-Bias" and the current timestamp.

kms_key

(str): The ARN of the KMS key that is used to encrypt the user code file (default: NULL).

experiment_config

(dict[str, str]): Experiment management configuration. Dictionary contains three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.


Method run_bias()

Runs a ProcessingJob to compute the requested bias 'methods' of the model predictions. Spins up a model endpoint and runs inference over the input examples in the 's3_data_input_path' to obtain predicted labels. The requested methods compare metrics (e.g. accuracy, precision, recall) for the sensitive group against the other examples.

Usage
SageMakerClarifyProcessor$run_bias(
  data_config,
  bias_config,
  model_config,
  model_predicted_label_config = NULL,
  pre_training_methods = "all",
  post_training_methods = "all",
  wait = TRUE,
  logs = TRUE,
  job_name = NULL,
  kms_key = NULL,
  experiment_config = NULL
)
Arguments
data_config

(:class:'~sagemaker.clarify.DataConfig'): Config of the input/output data.

bias_config

(:class:'~sagemaker.clarify.BiasConfig'): Config of sensitive groups.

model_config

(:class:'~sagemaker.clarify.ModelConfig'): Config of the model and its endpoint to be created.

model_predicted_label_config

(:class:'~sagemaker.clarify.ModelPredictedLabelConfig'): Config of how to extract the predicted label from the model output.

pre_training_methods

(str or list[str]): Selector of a subset of potential metrics. Defaults to computing all.

post_training_methods

(str or list[str]): Selector of a subset of potential metrics. Defaults to computing all.

wait

(bool): Whether the call should wait until the job completes (default: TRUE).

logs

(bool): Whether to show the logs produced by the job. Only meaningful when 'wait' is TRUE (default: TRUE).

job_name

(str): Processing job name. If not specified, a name is composed of "Clarify-Bias" and the current timestamp.

kms_key

(str): The ARN of the KMS key that is used to encrypt the user code file (default: NULL).

experiment_config

(dict[str, str]): Experiment management configuration. Dictionary contains three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.


Method run_explainability()

Runs a ProcessingJob that computes the feature importance for each example in the input. Currently, only SHAP is supported as an explainability method. Spins up a model endpoint. For each input example in the 's3_data_input_path', the SHAP algorithm determines feature importance by creating 'num_samples' copies of the example with a subset of features replaced with values from the 'baseline'. Model inference is run to see how the prediction changes with the replaced features. If the model output returns multiple scores, importance is computed for each of them. Across examples, feature importance is aggregated using 'agg_method'.
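
The perturb-and-score idea described above can be sketched in plain R. This is a simplified illustration, not the SHAP algorithm itself: it samples feature subsets uniformly and fits an unweighted linear regression, whereas Kernel SHAP applies a specific weighting to the subsets. 'toy_model' is a hypothetical stand-in for the model endpoint, and the mean of absolute values is assumed here as one possible choice of 'agg_method'.

```r
set.seed(1)

# Hypothetical stand-in for the model endpoint: a linear scoring function.
toy_model <- function(x) as.vector(x %*% c(2, -1, 0.5))

# Per-example attribution: build 'num_samples' copies of 'example' in which
# a random subset of features is replaced by 'baseline', score them, and
# regress the scores on the keep/replace mask.
explain_example <- function(example, baseline, num_samples = 50) {
  n_features <- length(example)
  mask <- matrix(rbinom(num_samples * n_features, 1, 0.5), ncol = n_features)
  # mask == 1 keeps the original value, mask == 0 uses the baseline value.
  perturbed <- mask * matrix(example, num_samples, n_features, byrow = TRUE) +
    (1 - mask) * matrix(baseline, num_samples, n_features, byrow = TRUE)
  scores <- toy_model(perturbed)
  fit <- lm.fit(cbind(1, mask), scores)
  unname(fit$coefficients[-1])
}

examples <- rbind(c(1, 2, 3), c(0, 1, 4))
baseline <- c(0, 0, 0)

# One row of feature importances per example.
importance <- t(apply(examples, 1, explain_example, baseline = baseline))

# Aggregate across examples with the mean of absolute values.
global_importance <- colMeans(abs(importance))
```

Because the toy model is linear, each per-example importance recovers the model weight times the difference between the example value and the baseline value, which makes the sketch easy to check by hand.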

Usage
SageMakerClarifyProcessor$run_explainability(
  data_config,
  model_config,
  explainability_config,
  model_scores = NULL,
  wait = TRUE,
  logs = TRUE,
  job_name = NULL,
  kms_key = NULL,
  experiment_config = NULL
)
Arguments
data_config

(:class:'~sagemaker.clarify.DataConfig'): Config of the input/output data.

model_config

(:class:'~sagemaker.clarify.ModelConfig'): Config of the model and its endpoint to be created.

explainability_config

(:class:'~sagemaker.clarify.ExplainabilityConfig'): Config of the specific explainability method. Currently, only SHAP is supported.

model_scores

Index or JSONPath location in the model output for the predicted scores to be explained. This is not required if the model output is a single score.

wait

(bool): Whether the call should wait until the job completes (default: TRUE).

logs

(bool): Whether to show the logs produced by the job. Only meaningful when 'wait' is TRUE (default: TRUE).

job_name

(str): Processing job name. If not specified, a name is composed of "Clarify-Explainability" and the current timestamp.

kms_key

(str): The ARN of the KMS key that is used to encrypt the user code file (default: NULL).

experiment_config

(dict[str, str]): Experiment management configuration. Dictionary contains three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.


Method clone()

The objects of this class are cloneable with this method.

Usage
SageMakerClarifyProcessor$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


DyfanJones/sagemaker-r-common documentation built on June 14, 2022, 10:31 p.m.