IPInsights: An unsupervised learning algorithm that learns the usage...

IPInsights    R Documentation

An unsupervised learning algorithm that learns the usage patterns for IPv4 addresses.

Description

It is designed to capture associations between IPv4 addresses and various entities, such as user IDs or account numbers.

Super classes

sagemaker.mlcore::EstimatorBase -> sagemaker.mlcore::AmazonAlgorithmEstimatorBase -> IPInsights

Public fields

repo_name

SageMaker repo name for the framework

repo_version

Version of the framework

MINI_BATCH_SIZE

The size of each mini-batch to use when training. If NULL, a default value will be used.

.module

Mimics the Python module

Active bindings

num_entity_vectors

The number of embeddings to train for entities accessing online resources

vector_dim

The size of the embedding vectors for both entity and IP addresses

batch_metrics_publish_interval

The period at which to publish metrics

epochs

Maximum number of passes over the training data.

learning_rate

Learning rate for the optimizer.

num_ip_encoder_layers

The number of fully-connected layers to encode IP address embedding.

random_negative_sampling_rate

The ratio of random negative samples to draw during training.

shuffled_negative_sampling_rate

The ratio of shuffled negative samples to draw during training.

weight_decay

Weight decay coefficient. Adds L2 regularization.
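
The two negative-sampling rates above may be easier to picture with a toy example. The constructor arguments describe random negative samples as randomly drawn IPv4 addresses and shuffled negative samples as IP addresses picked from within a batch. The base-R sketch below only illustrates that idea. The helper names 'random_ipv4' and 'shuffle_ips', the toy batch, and the pairing logic are assumptions made for illustration, not the algorithm's actual implementation.

```r
# Illustrative only: toy versions of the two negative-sample types.
set.seed(42)

# Toy batch of observed (entity, IP) pairs; all values are made up.
batch <- data.frame(
  entity = c("user_a", "user_b", "user_c", "user_d"),
  ip     = c("192.0.2.10", "198.51.100.7", "203.0.113.25", "192.0.2.44"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: draw n IPv4 addresses uniformly at random.
random_ipv4 <- function(n) {
  octets <- matrix(sample(0:255, 4 * n, replace = TRUE), ncol = 4)
  apply(octets, 1, paste, collapse = ".")
}

# Hypothetical helper: permute the IPs of a batch.
# A pair can occasionally keep its original IP after the permutation.
shuffle_ips <- function(ips) sample(ips)

# Random negatives: each entity paired with a randomly drawn address.
random_negatives <- data.frame(
  entity = batch$entity,
  ip     = random_ipv4(nrow(batch)),
  stringsAsFactors = FALSE
)

# Shuffled negatives: each entity paired with an IP from the same batch.
shuffled_negatives <- data.frame(
  entity = batch$entity,
  ip     = shuffle_ips(batch$ip),
  stringsAsFactors = FALSE
)
```

In this sketch each rate would correspond to how many such negative rows are generated per observed row; that reading of "ratio" is also an assumption.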

Methods

Method new()

This estimator is for IP Insights, an unsupervised algorithm that learns usage patterns of IP addresses. It is fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit' and requires CSV data to be stored in S3. After the estimator is fit, the model data is stored in S3. The model can then be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. In addition to deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.IPInsightsPredictor' object that can be used for inference calls against the trained model hosted in the SageMaker Endpoint.

IPInsights estimators are configured by setting hyperparameters; the available hyperparameters are documented below. For further information on the AWS IPInsights algorithm, consult the AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/ip-insights-hyperparameters.html
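
As a rough sketch of preparing the CSV training data mentioned above, the base-R snippet below writes (entity, IPv4 address) pairs to a local file. The headerless two-column layout (entity ID first, then IPv4 address) is an assumption taken from the AWS IP Insights documentation rather than from this page. The file name and sample values are made up, and uploading the file to S3 is left out.

```r
# Toy access log; entity IDs and IP addresses are made-up examples.
access_log <- data.frame(
  entity = c("user_a", "user_b", "user_a"),
  ip     = c("192.0.2.10", "198.51.100.7", "192.0.2.11"),
  stringsAsFactors = FALSE
)

# Write without header, row names, or quotes (assumed layout: entity,ip).
train_file <- file.path(tempdir(), "ipinsights_train.csv")
write.table(
  access_log, train_file,
  sep = ",", row.names = FALSE, col.names = FALSE, quote = FALSE
)

# Read the file back to check what was written.
written <- readLines(train_file)
```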

Usage
IPInsights$new(
  role,
  instance_count,
  instance_type,
  num_entity_vectors,
  vector_dim,
  batch_metrics_publish_interval = NULL,
  epochs = NULL,
  learning_rate = NULL,
  num_ip_encoder_layers = NULL,
  random_negative_sampling_rate = NULL,
  shuffled_negative_sampling_rate = NULL,
  weight_decay = NULL,
  ...
)
Arguments
role

(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role if it needs to access an AWS resource.

instance_count

(int): Number of Amazon EC2 instances to use for training.

instance_type

(str): Type of EC2 instance to use for training, for example, 'ml.m5.xlarge'.

num_entity_vectors

(int): Required. The number of embeddings to train for entities accessing online resources. We recommend 2x the total number of unique entity IDs.

vector_dim

(int): Required. The size of the embedding vectors for both entity and IP addresses.

batch_metrics_publish_interval

(int): Optional. The period at which to publish metrics (batches).

epochs

(int): Optional. Maximum number of passes over the training data.

learning_rate

(float): Optional. Learning rate for the optimizer.

num_ip_encoder_layers

(int): Optional. The number of fully-connected layers to encode IP address embedding.

random_negative_sampling_rate

(int): Optional. The ratio of random negative samples to draw during training. Random negative samples are randomly drawn IPv4 addresses.

shuffled_negative_sampling_rate

(int): Optional. The ratio of shuffled negative samples to draw during training. Shuffled negative samples are IP addresses picked from within a batch.

weight_decay

(float): Optional. Weight decay coefficient. Adds L2 regularization.

...

Base class keyword argument values.
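
The recommendation for 'num_entity_vectors' (twice the number of unique entity IDs) can be derived from the training data before the estimator is constructed. The base-R sketch below shows one way to do that. 'suggest_ipinsights_args' is a hypothetical helper, and the 'vector_dim' default of 128 and the role and instance values in the commented call are placeholder assumptions. The constructor call is commented out because it needs the package and AWS credentials.

```r
# Hypothetical helper: derive the two required hyperparameters from the data.
suggest_ipinsights_args <- function(entities, vector_dim = 128L) {
  n_unique <- length(unique(entities))
  list(
    num_entity_vectors = 2L * n_unique,  # 2x unique entity IDs, per the recommendation
    vector_dim         = as.integer(vector_dim)
  )
}

# Made-up entity column with three unique IDs.
entities <- c("user_a", "user_b", "user_a", "user_c")
hp <- suggest_ipinsights_args(entities)

# Assumed usage (not run here; role and instance values are placeholders):
# estimator <- IPInsights$new(
#   role = "MySageMakerRole",
#   instance_count = 1L,
#   instance_type = "ml.m5.xlarge",
#   num_entity_vectors = hp$num_entity_vectors,
#   vector_dim = hp$vector_dim
# )
```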


Method create_model()

Create a model from the latest S3 model data produced by this estimator.

Usage
IPInsights$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
Arguments
vpc_config_override

(dict[str, list[str]]): Optional override for the VpcConfig set on the model. Default: use subnets and security groups from this Estimator.
* 'Subnets' (list[str]): List of subnet ids.
* 'SecurityGroupIds' (list[str]): List of security group ids.

...

Additional kwargs passed to the IPInsightsModel constructor.

Returns

:class:'~sagemaker.amazon.IPInsightsModel': references the latest s3 model data produced by this estimator.
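
The 'vpc_config_override' argument is documented as a dict with 'Subnets' and 'SecurityGroupIds' keys. In R this presumably corresponds to a named list of character vectors. That mapping is an assumption, as are the subnet and security group IDs below, which are placeholders. The 'create_model' call is commented out because it requires a fitted estimator.

```r
# Assumed R representation of the VpcConfig override:
# a named list of character vectors (IDs are placeholders).
vpc_override <- list(
  Subnets          = c("subnet-0123456789abcdef0"),
  SecurityGroupIds = c("sg-0123456789abcdef0")
)

# Basic shape check before passing the list on.
stopifnot(
  setequal(names(vpc_override), c("Subnets", "SecurityGroupIds")),
  all(vapply(vpc_override, is.character, logical(1)))
)

# Assumed usage (not run here; 'estimator' is a fitted IPInsights object):
# model <- estimator$create_model(vpc_config_override = vpc_override)
```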


Method .prepare_for_training()

Set hyperparameters needed for training. This method will also validate 'source_dir'.

Usage
IPInsights$.prepare_for_training(
  records,
  mini_batch_size = NULL,
  job_name = NULL
)
Arguments
records

(RecordSet): The records to train this Estimator on.

mini_batch_size

(int or NULL): The size of each mini-batch to use when training. If NULL, a default value will be used.

job_name

(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.


Method clone()

The objects of this class are cloneable with this method.

Usage
IPInsights$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


DyfanJones/sagemaker-r-mlframework documentation built on March 18, 2022, 7:41 a.m.