ScriptProcessor: Script Processor class

ScriptProcessor    R Documentation

Script Processor class

Description

Handles Amazon SageMaker processing tasks for jobs using a machine learning framework.

Super class

sagemaker.common::Processor -> ScriptProcessor

Methods

Public methods

ScriptProcessor$new()
ScriptProcessor$get_run_args()
ScriptProcessor$run()
ScriptProcessor$clone()

Inherited methods

Method new()

Initializes a 'ScriptProcessor' instance. The 'ScriptProcessor' handles Amazon SageMaker Processing tasks for jobs using a machine learning framework, and lets you provide a script to be run as part of the Processing Job.

Usage
ScriptProcessor$new(
  role,
  image_uri,
  command,
  instance_count,
  instance_type,
  volume_size_in_gb = 30,
  volume_kms_key = NULL,
  output_kms_key = NULL,
  max_runtime_in_seconds = NULL,
  base_job_name = NULL,
  sagemaker_session = NULL,
  env = NULL,
  tags = NULL,
  network_config = NULL
)
Arguments
role

(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.

image_uri

(str): The URI of the Docker image to use for the processing jobs.

command

([str]): The command to run, along with any command-line flags. Example: ["python3", "-v"].

instance_count

(int): The number of instances to run a processing job with.

instance_type

(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.

volume_size_in_gb

(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).

volume_kms_key

(str): A KMS key for the processing volume (default: NULL).

output_kms_key

(str): The KMS key ID for processing job outputs (default: NULL).

max_runtime_in_seconds

(int): Timeout in seconds (default: NULL). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.

base_job_name

(str): Prefix for the processing job name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp.

sagemaker_session

(:class:'~sagemaker.session.Session'): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain.

env

(dict[str, str]): Environment variables to be passed to the processing jobs (default: NULL).

tags

(list[dict]): List of tags to be passed to the processing job (default: NULL). For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.

network_config

(:class:'~sagemaker.network.NetworkConfig'): A :class:'~sagemaker.network.NetworkConfig' object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
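As a minimal sketch of how the constructor arguments above fit together, the following wraps the call in a helper function. It assumes that ScriptProcessor is exported from the sagemaker.common package, that R named lists stand in for the dict-style arguments ('env', 'tags'), and that 'command' accepts a list of strings; none of this is confirmed by this page. The helper name make_processor, the job name prefix, and all values are hypothetical placeholders.

```r
# Hypothetical placeholder values; replace with resources from your own account.
# A named list stands in for dict[str, str] (assumption).
example_env <- list(STAGE = "dev")
# Each tag is a Key/Value pair, following the AWS Tag structure.
example_tags <- list(list(Key = "project", Value = "demo"))

# Hypothetical helper. The body is only evaluated when the function is called,
# so defining it needs neither the package nor AWS credentials.
make_processor <- function(role, image_uri) {
  sagemaker.common::ScriptProcessor$new(
    role = role,
    image_uri = image_uri,
    command = list("python3"),        # assumption: a list of strings is accepted
    instance_count = 1,
    instance_type = "ml.c4.xlarge",
    volume_size_in_gb = 30,
    max_runtime_in_seconds = 3600,    # stop the job after one hour
    base_job_name = "script-demo",
    env = example_env,
    tags = example_tags
  )
}
```

Calling make_processor() with an IAM role and an image URI from your account would return the processor; leaving 'sagemaker_session' unset means the processor creates one from the default AWS configuration chain, as described above.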


Method get_run_args()

Returns a RunArgs object. For processors (:class:'~sagemaker.spark.processing.PySparkProcessor', :class:'~sagemaker.spark.processing.SparkJar') that have special run() arguments, this object contains the normalized arguments for passing to :class:'~sagemaker.workflow.steps.ProcessingStep'.

Usage
ScriptProcessor$get_run_args(
  code,
  inputs = NULL,
  outputs = NULL,
  arguments = NULL
)
Arguments
code

(str): This can be an S3 URI or a local path to a file with the framework script to run.

inputs

(list[:class:'~sagemaker.processing.ProcessingInput']): Input files for the processing job. These must be provided as :class:'~sagemaker.processing.ProcessingInput' objects (default: NULL).

outputs

(list[:class:'~sagemaker.processing.ProcessingOutput']): Outputs for the processing job. These can be specified as either path strings or :class:'~sagemaker.processing.ProcessingOutput' objects (default: NULL).

arguments

(list[str]): A list of string arguments to be passed to a processing job (default: NULL).
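The call pattern for get_run_args() can be sketched as below. This is an illustration only: the helper name preview_run_args is hypothetical, 'processor' is assumed to be an existing ScriptProcessor instance, and the structure of the returned RunArgs object is not described on this page, so the sketch simply returns it.

```r
# Hypothetical helper: normalizes run arguments without starting a job.
# `processor` is assumed to be a ScriptProcessor created elsewhere.
preview_run_args <- function(processor, code, arguments = NULL) {
  processor$get_run_args(
    code = code,            # S3 URI or local path to the script
    inputs = NULL,
    outputs = NULL,
    arguments = arguments
  )
}

# Example call (not run here; "preprocess.py" is a hypothetical script name):
# run_args <- preview_run_args(processor, "preprocess.py", list("--verbose"))
```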


Method run()

Runs a processing job.

Usage
ScriptProcessor$run(
  code,
  inputs = NULL,
  outputs = NULL,
  arguments = NULL,
  wait = TRUE,
  logs = TRUE,
  job_name = NULL,
  experiment_config = NULL,
  kms_key = NULL
)
Arguments
code

(str): This can be an S3 URI or a local path to a file with the framework script to run.

inputs

(list[:class:'~sagemaker.processing.ProcessingInput']): Input files for the processing job. These must be provided as :class:'~sagemaker.processing.ProcessingInput' objects (default: NULL).

outputs

(list[:class:'~sagemaker.processing.ProcessingOutput']): Outputs for the processing job. These can be specified as either path strings or :class:'~sagemaker.processing.ProcessingOutput' objects (default: NULL).

arguments

(list[str]): A list of string arguments to be passed to a processing job (default: NULL).

wait

(bool): Whether the call should wait until the job completes (default: TRUE).

logs

(bool): Whether to show the logs produced by the job. Only meaningful when 'wait' is TRUE (default: TRUE).

job_name

(str): Processing job name. If not specified, the processor generates a default job name, based on the base job name and current timestamp.

experiment_config

(dict[str, str]): Experiment management configuration. The dictionary accepts three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.

kms_key

(str): The ARN of the KMS key that is used to encrypt the user code file (default: NULL).
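A minimal sketch of submitting a job with the arguments above, without blocking the R session. It assumes 'processor' is an existing ScriptProcessor instance and that an R named list is accepted for 'experiment_config'; the helper name submit_job, the experiment names, and the script arguments are hypothetical.

```r
# The three optional keys described above, as a named list (assumption:
# a named list is accepted in place of dict[str, str]).
example_experiment_config <- list(
  ExperimentName = "demo-experiment",
  TrialName = "demo-trial",
  TrialComponentDisplayName = "preprocessing"
)

# Hypothetical helper: starts the job and returns immediately.
submit_job <- function(processor, code, inputs = NULL, outputs = NULL) {
  processor$run(
    code = code,
    inputs = inputs,
    outputs = outputs,
    arguments = list("--input-format", "csv"),  # hypothetical script flags
    wait = FALSE,   # do not block until the job completes
    logs = FALSE,   # logs are only meaningful when wait = TRUE
    experiment_config = example_experiment_config
  )
}
```

Setting wait = FALSE suits long-running jobs; with the defaults (wait = TRUE, logs = TRUE) the call blocks and shows the job logs until the job completes.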


Method clone()

The objects of this class are cloneable with this method.

Usage
ScriptProcessor$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.
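The difference between a shallow and a deep clone can be illustrated with a toy example. This sketch assumes ScriptProcessor follows standard R6 cloning semantics and uses the R6 package directly; the Session and Holder classes are hypothetical stand-ins, not part of sagemaker.common.

```r
library(R6)

# Hypothetical stand-ins: an object that holds another R6 object in a field.
Session <- R6Class("Session", public = list(region = "us-east-1"))
Holder <- R6Class("Holder", public = list(
  session = NULL,
  initialize = function() {
    self$session <- Session$new()
  }
))

original <- Holder$new()
shallow <- original$clone()            # R6 fields are shared by reference
deep <- original$clone(deep = TRUE)    # R6 fields are cloned as well

# Modify the nested object through the original.
original$session$region <- "eu-west-1"

shallow_region <- shallow$session$region  # sees the change: same Session object
deep_region <- deep$session$region        # unchanged: independent Session copy
```

Under these assumptions, a shallow clone of a processor would share any R6 objects it holds (for example, a session) with the original, while a deep clone would get its own copies.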

See Also

Other Processor: ProcessingInput, ProcessingJob, ProcessingOutput, Processor


DyfanJones/sagemaker-r-common documentation built on June 14, 2022, 10:31 p.m.