SetTarget: Set the target variable (and by default, start the DataRobot...

View source: R/StartAutopilot.R

SetTargetR Documentation

Set the target variable (and by default, start the DataRobot Autopilot)

Description

This function sets the target variable for the project defined by project, starting the process of building models to predict the response variable target. Both of these parameters - project and target - are required and they are sufficient to start a modeling project with DataRobot default specifications for the other optional parameters.

Usage

SetTarget(
  project,
  target,
  metric = NULL,
  weights = NULL,
  partition = NULL,
  mode = AutopilotMode$Quick,
  seed = NULL,
  targetType = NULL,
  positiveClass = NULL,
  blueprintThreshold = NULL,
  responseCap = NULL,
  featurelistId = NULL,
  smartDownsampled = NULL,
  majorityDownsamplingRate = NULL,
  accuracyOptimizedBlueprints = NULL,
  offset = NULL,
  exposure = NULL,
  eventsCount = NULL,
  monotonicIncreasingFeaturelistId = NULL,
  monotonicDecreasingFeaturelistId = NULL,
  onlyIncludeMonotonicBlueprints = FALSE,
  maxWait = 600
)

Arguments

project

character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier.

target

character. String giving the name of the response variable to be predicted by all project models.

metric

character. Optional. String specifying the model fitting metric to be optimized; a list of valid options for this parameter, which depends on both project and target, may be obtained with the function GetValidMetrics.

weights

character. Optional. String specifying the name of the column from the modeling dataset to be used as weights in model fitting.

partition

partition. Optional. S3 object of class 'partition' whose elements specify a valid partitioning scheme. See help for functions CreateGroupPartition, CreateRandomPartition, CreateStratifiedPartition, CreateUserPartition and CreateDatetimePartitionSpecification.

mode

character. Optional. Specifies the autopilot mode used to start the modeling project; See AutopilotMode for valid options; AutopilotMode$Quick is default.

seed

integer. Optional. Seed for the random number generator used in creating random partitions for model fitting.

targetType

character. Optional. Used to specify the targetType to use for a project. Valid options are "Binary", "Multiclass", "Regression". Set to "Multiclass" to enable multiclass modeling. Otherwise, it can help to disambiguate, i.e. telling DataRobot how to handle a numeric target with a few unique values that could be used for either multiclass or regression. See TargetType for an easier way to keep track of the options.

positiveClass

character. Optional. Target variable value corresponding to a positive response in binary classification problems.

blueprintThreshold

integer. Optional. The maximum time (in hours) that any modeling blueprint is allowed to run before being excluded from subsequent autopilot stages.

responseCap

numeric. Optional. Floating point value, between 0.5 and 1.0, specifying a capping limit for the response variable. The default value NULL corresponds to an uncapped response, equivalent to responseCap = 1.0.

featurelistId

numeric. Specifies which feature list to use. If NULL (default), a default featurelist is used.

smartDownsampled

logical. Optional. Whether to use smart downsampling to throw away excess rows of the majority class. Only applicable to classification and zero-boosted regression projects.

majorityDownsamplingRate

numeric. Optional. Floating point value, between 0.0 and 100.0. The percentage of the majority rows that should be kept. Specify only if using smart downsampling. May not cause the majority class to become smaller than the minority class.

accuracyOptimizedBlueprints

logical. Optional. When enabled, accuracy optimized blueprints will run in autopilot for the project. These are longer-running model blueprints that provide increased accuracy over normal blueprints that run during autopilot.

offset

character. Optional. Vector of the names of the columns containing the offset of each row.

exposure

character. Optional. The name of a column containing the exposure of each row.

eventsCount

character. Optional. The name of a column specifying the events count.

monotonicIncreasingFeaturelistId

character. Optional. The id of the featurelist that defines the set of features with a monotonically increasing relationship to the target. If NULL (default), no such constraints are enforced. When specified, this will set a default for the project that can be overridden at model submission time if desired. The featurelist itself can also be passed as this parameter.

monotonicDecreasingFeaturelistId

character. Optional. The id of the featurelist that defines the set of features with a monotonically decreasing relationship to the target. If NULL (default), no such constraints are enforced. When specified, this will set a default for the project that can be overridden at model submission time if desired. The featurelist itself can also be passed as this parameter.

onlyIncludeMonotonicBlueprints

logical. Optional. When TRUE, only blueprints that support enforcing monotonic constraints will be available in the project or selected for the autopilot.

maxWait

integer. Specifies how many seconds to wait for the server to finish analyzing the target and begin the modeling process. If the process takes longer than this parameter specifies, execution will stop (but the server will continue to process the request).

Examples

## Not run: 
  projectId <- "59a5af20c80891534e3c2bde"
  SetTarget(projectId, "targetFeature")
  SetTarget(projectId, "targetFeature", metric = "LogLoss")
  SetTarget(projectId, "targetFeature", mode = AutopilotMode$Manual)
  SetTarget(projectId, "targetFeature", targetType = TargetType$Multiclass)

## End(Not run)

datarobot documentation built on May 29, 2024, 4:36 a.m.