TrainingPipeline: TrainingPipeline for SageMaker


Description

Creates a standard training pipeline with the following steps in order:

  1. Train estimator

  2. Create estimator model

  3. Endpoint configuration

  4. Deploy model

Super class

stepfunctions::WorkflowTemplate -> TrainingPipeline

Methods

Public methods

  • TrainingPipeline$new()

  • TrainingPipeline$build_workflow_definition()

  • TrainingPipeline$execute()

  • TrainingPipeline$clone()

Method new()

Initialize TrainingPipeline class

Usage
TrainingPipeline$new(
  estimator,
  role,
  inputs,
  s3_bucket,
  client = NULL,
  pipeline_name = NULL
)
Arguments
estimator

(sagemaker.estimator.EstimatorBase): The estimator to use for training. Can be a BYO estimator, Framework estimator or Amazon algorithm estimator.

role

(str): An AWS IAM role (either name or full Amazon Resource Name (ARN)). This role is used to create, manage, and execute the Step Functions workflows.

inputs

: Information about the training data. Please refer to the 'fit()' method of the associated estimator, as this argument can take any of the following forms:

  • (str) - The S3 location where training data is saved.

  • (list[str, str] or list[str, 'sagemaker.inputs.TrainingInput']) - If using multiple channels for training data, you can specify a list mapping channel names to strings or 'sagemaker.inputs.TrainingInput' objects.

  • ('sagemaker.inputs.TrainingInput') - Channel configuration for S3 data sources that can provide additional information about the training dataset. See 'sagemaker.inputs.TrainingInput' for full details.

  • ('sagemaker.amazon.amazon_estimator.RecordSet') - A collection of Amazon 'Record' objects serialized and stored in S3. For use with an estimator for an Amazon algorithm.

  • (list['sagemaker.amazon.amazon_estimator.RecordSet']) - A list of 'sagemaker.amazon.amazon_estimator.RecordSet' objects, where each instance is a different channel of training data.

s3_bucket

(str): S3 bucket under which the output artifacts from the training job will be stored. The parent path used is built using the format: 's3://s3_bucket/pipeline_name/models/job_name/'. In this format, 'pipeline_name' refers to the keyword argument provided for TrainingPipeline. If a 'pipeline_name' argument was not provided, one is auto-generated by the pipeline as 'training-pipeline-<timestamp>'. Also, in the format, 'job_name' refers to the job name provided when calling the 'TrainingPipeline$execute()' method.

client

(SFN.Client, optional): Step Functions client to use for creating and interacting with the training pipeline in Step Functions. (default: NULL)

pipeline_name

(str, optional): Name of the pipeline. This name will be used to name jobs (if not provided when calling 'execute()'), models, endpoints, and S3 objects created by the pipeline. If a 'pipeline_name' argument was not provided, one is auto-generated by the pipeline as 'training-pipeline-<timestamp>'. (default: NULL)
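
A minimal construction sketch follows. The role ARN, bucket name, input path, and pipeline name are hypothetical placeholders, and 'estimator' is assumed to have been built beforehand with whichever SageMaker estimator interface the project uses; this is an illustrative sketch, not a runnable end-to-end example.

```r
library(stepfunctions)

# 'estimator' is assumed to be a previously constructed SageMaker estimator
# (BYO, Framework, or Amazon algorithm estimator); its construction is omitted.
pipeline <- TrainingPipeline$new(
  estimator     = estimator,
  role          = "arn:aws:iam::123456789012:role/StepFunctionsWorkflowRole",  # placeholder ARN
  inputs        = "s3://my-bucket/train/",   # single-channel S3 training data (placeholder)
  s3_bucket     = "my-bucket",               # output artifacts land under this bucket (placeholder)
  pipeline_name = "my-training-pipeline"     # optional; auto-generated if NULL
)
```

With these placeholder values, training artifacts would be written under 's3://my-bucket/my-training-pipeline/models/job_name/'.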


Method build_workflow_definition()

Build the workflow definition for the training pipeline with all the states involved.

Usage
TrainingPipeline$build_workflow_definition()
Returns

'stepfunctions.steps.states.Chain': Workflow definition as a chain of the states involved in the training pipeline.


Method execute()

Run the training pipeline.

Usage
TrainingPipeline$execute(job_name = NULL, hyperparameters = NULL)
Arguments
job_name

(str, optional): Name for the training job. If one is not provided, a job name will be auto-generated. (default: NULL)

hyperparameters

(list, optional): Hyperparameters for the estimator training. (default: NULL)

Returns

'stepfunctions.workflow.Execution': Running instance of the training pipeline.
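
A usage sketch for starting a run, assuming 'pipeline' was constructed as above and that the workflow is first deployed with a 'create()' method inherited from 'WorkflowTemplate' (as in the upstream Python SDK); the job name and hyperparameters are illustrative placeholders, and both calls interact with live AWS services.

```r
# Deploy the workflow definition to Step Functions, then start a run.
pipeline$create()

execution <- pipeline$execute(
  job_name = "my-training-job-001",              # optional; auto-generated if NULL
  hyperparameters = list(epochs = 10, lr = 0.1)  # placeholder hyperparameters
)
```

The returned 'execution' object represents the running pipeline instance and can be used to monitor its progress.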


Method clone()

The objects of this class are cloneable with this method.

Usage
TrainingPipeline$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


DyfanJones/aws-step-functions-data-science-sdk-r documentation built on Dec. 17, 2021, 5:31 p.m.