sagemaker_create_processing_job: Creates a processing job


View source: R/sagemaker_operations.R

Description

Creates a processing job.

Usage

sagemaker_create_processing_job(ProcessingInputs,
  ProcessingOutputConfig, ProcessingJobName, ProcessingResources,
  StoppingCondition, AppSpecification, Environment, NetworkConfig,
  RoleArn, Tags, ExperimentConfig)
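
As a minimal sketch, the call below passes only the four arguments that the Arguments section marks as required. The job name, image URI, role ARN, and instance settings are placeholder values chosen for illustration, not defaults. The `submit` flag is a hypothetical guard so that the snippet runs without AWS credentials or network access.

```r
# Placeholder values for illustration only; replace with your own resources.
required_args <- list(
  ProcessingJobName = "example-processing-job",
  ProcessingResources = list(
    ClusterConfig = list(
      InstanceCount = 1,
      InstanceType = "ml.m5.xlarge",
      VolumeSizeInGB = 30
    )
  ),
  AppSpecification = list(
    ImageUri = "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest"
  ),
  RoleArn = "arn:aws:iam::123456789012:role/ExampleSageMakerRole"
)

# Hypothetical guard: set to TRUE only when AWS credentials are configured.
submit <- FALSE
if (submit) {
  svc <- paws.machine.learning::sagemaker()
  # do.call() passes each list element as a named argument.
  resp <- do.call(svc$create_processing_job, required_args)
  print(resp$ProcessingJobArn)
}
```

Keeping the arguments in a list makes it easy to inspect or extend the request before it is sent.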

Arguments

ProcessingInputs

List of input configurations for the processing job.

ProcessingOutputConfig

Output configuration for the processing job.

ProcessingJobName

[required] The name of the processing job. The name must be unique within an AWS Region in the AWS account.

ProcessingResources

[required] Identifies the resources, ML compute instances, and ML storage volumes to deploy for a processing job. In distributed processing, you specify more than one instance.

StoppingCondition

The time limit for how long the processing job is allowed to run.

AppSpecification

[required] Configures the processing job to run a specified Docker container image.

Environment

Sets the environment variables in the Docker container.

NetworkConfig

Networking options for a processing job.

RoleArn

[required] The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.

Tags

(Optional) An array of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.

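ExperimentConfig

Configuration for associating the processing job with an experiment and trial, given by ExperimentName, TrialName, and TrialComponentDisplayName.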

Value

A list with the following syntax:

list(
  ProcessingJobArn = "string"
)
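
The returned value can be handled like any other R list. The sketch below uses a hand-written response in place of a real call, and it assumes the ARN ends in `processing-job/<name>`. The Region, account ID, and job name are made up.

```r
# Stand-in for the value returned by svc$create_processing_job(...).
resp <- list(
  ProcessingJobArn = "arn:aws:sagemaker:us-east-1:123456789012:processing-job/example-processing-job"
)

# Pull out the ARN, then strip everything up to the last "/" to get the
# job name (assumes the ARN layout described above).
job_arn <- resp$ProcessingJobArn
job_name <- sub("^.*/", "", job_arn)
print(job_name)
```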

Request syntax

svc$create_processing_job(
  ProcessingInputs = list(
    list(
      InputName = "string",
      AppManaged = TRUE|FALSE,
      S3Input = list(
        S3Uri = "string",
        LocalPath = "string",
        S3DataType = "ManifestFile"|"S3Prefix",
        S3InputMode = "Pipe"|"File",
        S3DataDistributionType = "FullyReplicated"|"ShardedByS3Key",
        S3CompressionType = "None"|"Gzip"
      ),
      DatasetDefinition = list(
        AthenaDatasetDefinition = list(
          Catalog = "string",
          Database = "string",
          QueryString = "string",
          WorkGroup = "string",
          OutputS3Uri = "string",
          KmsKeyId = "string",
          OutputFormat = "PARQUET"|"ORC"|"AVRO"|"JSON"|"TEXTFILE",
          OutputCompression = "GZIP"|"SNAPPY"|"ZLIB"
        ),
        RedshiftDatasetDefinition = list(
          ClusterId = "string",
          Database = "string",
          DbUser = "string",
          QueryString = "string",
          ClusterRoleArn = "string",
          OutputS3Uri = "string",
          KmsKeyId = "string",
          OutputFormat = "PARQUET"|"CSV",
          OutputCompression = "None"|"GZIP"|"BZIP2"|"ZSTD"|"SNAPPY"
        ),
        LocalPath = "string",
        DataDistributionType = "FullyReplicated"|"ShardedByS3Key",
        InputMode = "Pipe"|"File"
      )
    )
  ),
  ProcessingOutputConfig = list(
    Outputs = list(
      list(
        OutputName = "string",
        S3Output = list(
          S3Uri = "string",
          LocalPath = "string",
          S3UploadMode = "Continuous"|"EndOfJob"
        ),
        FeatureStoreOutput = list(
          FeatureGroupName = "string"
        ),
        AppManaged = TRUE|FALSE
      )
    ),
    KmsKeyId = "string"
  ),
  ProcessingJobName = "string",
  ProcessingResources = list(
    ClusterConfig = list(
      InstanceCount = 123,
      InstanceType = "ml.t3.medium"|"ml.t3.large"|"ml.t3.xlarge"|"ml.t3.2xlarge"|"ml.m4.xlarge"|"ml.m4.2xlarge"|"ml.m4.4xlarge"|"ml.m4.10xlarge"|"ml.m4.16xlarge"|"ml.c4.xlarge"|"ml.c4.2xlarge"|"ml.c4.4xlarge"|"ml.c4.8xlarge"|"ml.p2.xlarge"|"ml.p2.8xlarge"|"ml.p2.16xlarge"|"ml.p3.2xlarge"|"ml.p3.8xlarge"|"ml.p3.16xlarge"|"ml.c5.xlarge"|"ml.c5.2xlarge"|"ml.c5.4xlarge"|"ml.c5.9xlarge"|"ml.c5.18xlarge"|"ml.m5.large"|"ml.m5.xlarge"|"ml.m5.2xlarge"|"ml.m5.4xlarge"|"ml.m5.12xlarge"|"ml.m5.24xlarge"|"ml.r5.large"|"ml.r5.xlarge"|"ml.r5.2xlarge"|"ml.r5.4xlarge"|"ml.r5.8xlarge"|"ml.r5.12xlarge"|"ml.r5.16xlarge"|"ml.r5.24xlarge",
      VolumeSizeInGB = 123,
      VolumeKmsKeyId = "string"
    )
  ),
  StoppingCondition = list(
    MaxRuntimeInSeconds = 123
  ),
  AppSpecification = list(
    ImageUri = "string",
    ContainerEntrypoint = list(
      "string"
    ),
    ContainerArguments = list(
      "string"
    )
  ),
  Environment = list(
    "string"
  ),
  NetworkConfig = list(
    EnableInterContainerTrafficEncryption = TRUE|FALSE,
    EnableNetworkIsolation = TRUE|FALSE,
    VpcConfig = list(
      SecurityGroupIds = list(
        "string"
      ),
      Subnets = list(
        "string"
      )
    )
  ),
  RoleArn = "string",
  Tags = list(
    list(
      Key = "string",
      Value = "string"
    )
  ),
  ExperimentConfig = list(
    ExperimentName = "string",
    TrialName = "string",
    TrialComponentDisplayName = "string"
  )
)
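
In the syntax above, `"a"|"b"` lists the allowed values, of which one is chosen. The following sketch shows how the nested optional arguments might be assembled for a job that reads from and writes to S3. The helpers `s3_input()` and `s3_output()` are hypothetical, the bucket names and container paths are placeholders, and passing `Environment` as a named list (variable name to value) is an assumption about how the `list("string")` placeholder is meant to be filled.

```r
# Hypothetical helper: build one S3-backed entry for ProcessingInputs.
s3_input <- function(name, s3_uri, local_path) {
  list(
    InputName = name,
    S3Input = list(
      S3Uri = s3_uri,
      LocalPath = local_path,
      S3DataType = "S3Prefix",   # one of "ManifestFile" | "S3Prefix"
      S3InputMode = "File"       # one of "Pipe" | "File"
    )
  )
}

# Hypothetical helper: build one S3-backed entry for
# ProcessingOutputConfig$Outputs.
s3_output <- function(name, s3_uri, local_path) {
  list(
    OutputName = name,
    S3Output = list(
      S3Uri = s3_uri,
      LocalPath = local_path,
      S3UploadMode = "EndOfJob"  # one of "Continuous" | "EndOfJob"
    )
  )
}

optional_args <- list(
  # Unnamed outer list: each element is one input definition.
  ProcessingInputs = list(
    s3_input("raw", "s3://example-bucket/raw/", "/opt/ml/processing/input")
  ),
  ProcessingOutputConfig = list(
    Outputs = list(
      s3_output("clean", "s3://example-bucket/clean/", "/opt/ml/processing/output")
    )
  ),
  StoppingCondition = list(MaxRuntimeInSeconds = 3600),
  # Named list: variable name to value (assumption, see the note above).
  Environment = list(LOG_LEVEL = "info"),
  # Each tag is its own list with a Key and a Value.
  Tags = list(list(Key = "project", Value = "demo"))
)

# Show the top two levels of the assembled structure.
str(optional_args, max.level = 2)
```

A list like this can be combined with a list of the required arguments using `c()` and passed to `svc$create_processing_job` through `do.call()`.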

paws.machine.learning documentation built on Aug. 23, 2021, 9:14 a.m.