Framework: FrameWork Class

FrameworkR Documentation

FrameWork Class

Description

Base class that cannot be instantiated directly. Subclasses define functionality pertaining to specific ML frameworks, such as training/deployment images and predictor instances.

Super class

sagemaker.mlcore::EstimatorBase -> Framework

Public fields

LAUNCH_PS_ENV_NAME

class metadata

LAUNCH_MPI_ENV_NAME

class metadata

LAUNCH_SM_DDP_ENV_NAME

class metadata

INSTANCE_TYPE

class metadata

MPI_NUM_PROCESSES_PER_HOST

class metadata

MPI_CUSTOM_MPI_OPTIONS

class metadata

SM_DDP_CUSTOM_MPI_OPTIONS

class metadata

CONTAINER_CODE_CHANNEL_SOURCEDIR_PATH

class metadata

.module

mimic python module

Methods

Public methods

Inherited methods

Method new()

Base class initializer. Subclasses which override “__init__“ should invoke “super()“

Usage
Framework$new(
  entry_point,
  source_dir = NULL,
  hyperparameters = NULL,
  container_log_level = "INFO",
  code_location = NULL,
  image_uri = NULL,
  dependencies = NULL,
  enable_network_isolation = FALSE,
  git_config = NULL,
  checkpoint_s3_uri = NULL,
  checkpoint_local_path = NULL,
  enable_sagemaker_metrics = NULL,
  ...
)
Arguments
entry_point

(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If 'git_config' is provided, 'entry_point' should be a relative location to the Python source file in the Git repo. Example: With the following GitHub repo directory structure: >>> |—– README.md >>> |—– src >>> |—– train.py >>> |—– test.py You can assign entry_point='src/train.py'.

source_dir

(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: None). If “source_dir“ is an S3 URI, it must point to a tar.gz file. Structure within this directory are preserved when training on Amazon SageMaker. If 'git_config' is provided, 'source_dir' should be a relative location to a directory in the Git repo. .. admonition:: Example With the following GitHub repo directory structure: >>> |—– README.md >>> |—– src >>> |—– train.py >>> |—– test.py and you need 'train.py' as entry point and 'test.py' as training source code as well, you can assign entry_point='train.py', source_dir='src'.

hyperparameters

(dict): Hyperparameters that will be used for training (default: None). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but “str()“ will be called to convert them before training.

container_log_level

(str): Log level to use within the container (default: "INFO")

code_location

(str): The S3 prefix URI where custom code will be uploaded (default: None) - don't include a trailing slash since a string prepended with a "/" is appended to “code_location“. The code file uploaded to S3 is 'code_location/job-name/source/sourcedir.tar.gz'. If not specified, the default “code location“ is s3://output_bucket/job-name/.

image_uri

(str): An alternate image name to use instead of the official Sagemaker image for the framework. This is useful to run one of the Sagemaker supported frameworks with an image containing custom dependencies.

dependencies

(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container (default: []). The library folders will be copied to SageMaker in the same folder where the entrypoint is copied. If 'git_config' is provided, 'dependencies' should be a list of relative locations to directories with any additional libraries needed in the Git repo. .. admonition:: Example The following call >>> Estimator(entry_point='train.py', ... dependencies=['my/libs/common', 'virtual-env']) results in the following inside the container: >>> $ ls >>> opt/ml/code >>> |—— train.py >>> |—— common >>> |—— virtual-env This is not supported with "local code" in Local Mode.

enable_network_isolation

(bool): Specifies whether container will run in network isolation mode. Network isolation mode restricts the container access to outside networks (such as the internet). The container does not make any inbound or outbound network calls. If True, a channel named "code" will be created for any user entry script for training. The user entry script, files in source_dir (if specified), and dependencies will be uploaded in a tar to S3. Also known as internet-free mode (default: 'False').

git_config

(dict[str, str]): Git configurations used for cloning files, including “repo“, “branch“, “commit“, “2FA_enabled“, “username“, “password“ and “token“. The “repo“ field is required. All other fields are optional. “repo“ specifies the Git repository where your training script is stored. If you don't provide “branch“, the default value 'master' is used. If you don't provide “commit“, the latest commit in the specified branch is used. .. admonition:: Example The following config: >>> git_config = 'repo': 'https://github.com/aws/sagemaker-python-sdk.git', >>> 'branch': 'test-branch-git-config', >>> 'commit': '329bfcf884482002c05ff7f44f62599ebc9f445a' results in cloning the repo specified in 'repo', then checkout the 'master' branch, and checkout the specified commit. “2FA_enabled“, “username“, “password“ and “token“ are used for authentication. For GitHub (or other Git) accounts, set “2FA_enabled“ to 'True' if two-factor authentication is enabled for the account, otherwise set it to 'False'. If you do not provide a value for “2FA_enabled“, a default value of 'False' is used. CodeCommit does not support two-factor authentication, so do not provide "2FA_enabled" with CodeCommit repositories. For GitHub and other Git repos, when SSH URLs are provided, it doesn't matter whether 2FA is enabled or disabled; you should either have no passphrase for the SSH key pairs, or have the ssh-agent configured so that you will not be prompted for SSH passphrase when you do 'git clone' command with SSH URLs. When HTTPS URLs are provided: if 2FA is disabled, then either token or username+password will be used for authentication if provided (token prioritized); if 2FA is enabled, only token will be used for authentication if provided. If required authentication info is not provided, python SDK will try to use local credentials storage to authenticate. If that fails either, an error message will be thrown. For CodeCommit repos, 2FA is not supported, so '2FA_enabled' should not be provided. There is no token in CodeCommit, so 'token' should not be provided too. When 'repo' is an SSH URL, the requirements are the same as GitHub-like repos. When 'repo' is an HTTPS URL, username+password will be used for authentication if they are provided; otherwise, python SDK will try to use either CodeCommit credential helper or local credential storage for authentication.

checkpoint_s3_uri

(str): The S3 URI in which to persist checkpoints that the algorithm persists (if any) during training. (default: “None“).

checkpoint_local_path

(str): The local path that the algorithm writes its checkpoints to. SageMaker will persist all files under this path to 'checkpoint_s3_uri' continually during training. On job startup the reverse happens - data from the s3 location is downloaded to this path before the algorithm is started. If the path is unset then SageMaker assumes the checkpoints will be provided under '/opt/ml/checkpoints/'. (default: “None“).

enable_sagemaker_metrics

(bool): enable SageMaker Metrics Time Series. For more information see: https://docs.aws.amazon.com/sagemaker/latest/dg/API_AlgorithmSpecification.html#SageMaker-Type-AlgorithmSpecification-EnableSageMakerMetricsTimeSeries (default: “None“).

...

: Additional kwargs passed to the “EstimatorBase“ constructor. .. tip:: You can find additional parameters for initializing this class at :class:'~sagemaker.estimator.EstimatorBase'.


Method .prepare_for_training()

Set hyperparameters needed for training. This method will also validate “source_dir“.

Usage
Framework$.prepare_for_training(job_name = NULL)
Arguments
job_name

(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.


Method hyperparameters()

Return the hyperparameters as a dictionary to use for training. The :meth:'~sagemaker.estimator.EstimatorBase.fit' method, which trains the model, calls this method to find the hyperparameters.

Usage
Framework$hyperparameters()
Returns

dict[str, str]: The hyperparameters.


Method training_image_uri()

Return the Docker image to use for training. The :meth:'~sagemaker.estimator.EstimatorBase.fit' method, which does the model training, calls this method to find the image to use for model training.

Usage
Framework$training_image_uri()
Returns

str: The URI of the Docker image.


Method attach()

Attach to an existing training job. Create an Estimator bound to an existing training job, each subclass is responsible to implement “_prepare_init_params_from_job_description()“ as this method delegates the actual conversion of a training job description to the arguments that the class constructor expects. After attaching, if the training job has a Complete status, it can be “deploy()“ ed to create a SageMaker Endpoint and return a “Predictor“. If the training job is in progress, attach will block and display log messages from the training job, until the training job completes. Examples: >>> my_estimator.fit(wait=False) >>> training_job_name = my_estimator.latest_training_job.name Later on: >>> attached_estimator = Estimator.attach(training_job_name) >>> attached_estimator.deploy()

Usage
Framework$attach(
  training_job_name,
  sagemaker_session = NULL,
  model_channel_name = "model"
)
Arguments
training_job_name

(str): The name of the training job to attach to.

sagemaker_session

(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.

model_channel_name

(str): Name of the channel where pre-trained model data will be downloaded (default: 'model'). If no channel with the same name exists in the training job, this option will be ignored.

Returns

Instance of the calling “Estimator“ Class with the attached training job.


Method transformer()

Return a “Transformer“ that uses a SageMaker Model based on the training job. It reuses the SageMaker Session and base job name used by the Estimator.

Usage
Framework$transformer(
  instance_count,
  instance_type,
  strategy = NULL,
  assemble_with = NULL,
  output_path = NULL,
  output_kms_key = NULL,
  accept = NULL,
  env = NULL,
  max_concurrent_transforms = NULL,
  max_payload = NULL,
  tags = NULL,
  role = NULL,
  model_server_workers = NULL,
  volume_kms_key = NULL,
  entry_point = NULL,
  vpc_config_override = "VPC_CONFIG_DEFAULT",
  enable_network_isolation = NULL,
  model_name = NULL
)
Arguments
instance_count

(int): Number of EC2 instances to use.

instance_type

(str): Type of EC2 instance to use, for example, ml.c4.xlarge'.

strategy

(str): The strategy used to decide how to batch records in a single request (default: None). Valid values: 'MultiRecord' and 'SingleRecord'.

assemble_with

(str): How the output is assembled (default: None). Valid values: 'Line' or 'None'.

output_path

(str): S3 location for saving the transform result. If not specified, results are stored to a default bucket.

output_kms_key

(str): Optional. KMS key ID for encrypting the transform output (default: None).

accept

(str): The accept header passed by the client to the inference endpoint. If it is supported by the endpoint, it will be the format of the batch transform output.

env

(dict): Environment variables to be set for use during the transform job (default: None).

max_concurrent_transforms

(int): The maximum number of HTTP requests to be made to each individual transform container at one time.

max_payload

(int): Maximum size of the payload in a single HTTP request to the container in MB.

tags

(list[dict]): List of tags for labeling a transform job. If none specified, then the tags used for the training job are used for the transform job.

role

(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, which is also used during transform jobs. If not specified, the role from the Estimator will be used.

model_server_workers

(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.

volume_kms_key

(str): Optional. KMS key ID for encrypting the volume attached to the ML compute instance (default: None).

entry_point

(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If not specified, the training entry point is used.

vpc_config_override

(dict[str, list[str]]): Optional override for the VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.

enable_network_isolation

(bool): Specifies whether container will run in network isolation mode. Network isolation mode restricts the container access to outside networks (such as the internet). The container does not make any inbound or outbound network calls. If True, a channel named "code" will be created for any user entry script for inference. Also known as Internet-free mode. If not specified, this setting is taken from the estimator's current configuration.

model_name

(str): Name to use for creating an Amazon SageMaker model. If not specified, the name of the training job is used.

Returns

sagemaker.transformer.Transformer: a “Transformer“ object that can be used to start a SageMaker Batch Transform job.


Method clone()

The objects of this class are cloneable with this method.

Usage
Framework$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


DyfanJones/sagemaker-r-mlcore documentation built on May 3, 2022, 10:08 a.m.