Graph | R Documentation |
A Graph is a representation of a machine learning pipeline graph. It can be trained, and subsequently used for prediction.
A Graph is most useful when used together with Learner objects encapsulated as PipeOpLearner. In this case, the Graph produces Prediction data during its $predict() phase and can be used as a Learner itself (using the GraphLearner wrapper). However, the Graph can also be used without Learner objects to simply perform preprocessing of data, and, in principle, does not even need to handle data at all but can be used for general processes with dependency structure (although the PipeOps for this would need to be written).
R6Class.
Graph$new()
A Graph is made up of a list of PipeOps, and a data.table of edges. Both for training and prediction, the Graph performs topological sorting of the PipeOps and executes their respective $train() or $predict() functions in order, moving the PipeOp results along the edges as input to other PipeOps.
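As a rough illustration of this structure, the sketch below builds a two-step Graph and inspects its PipeOp list and edge table. It assumes the mlr3pipelines package is installed; the ids "scale" and "pca" are simply the defaults of the two PipeOps chosen for this example.

```r
library("mlr3pipelines")

# Build a small Graph with two PipeOps and one edge between them.
g = Graph$new()$
  add_pipeop(po("scale"))$
  add_pipeop(po("pca"))$
  add_edge("scale", "pca")

names(g$pipeops)      # ids of the contained PipeOps
g$edges               # data.table with src_id, src_channel, dst_id, dst_channel
g$ids(sorted = TRUE)  # ids in topological (execution) order
```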
pipeops :: named list of PipeOp
Contains all PipeOps in the Graph, named by the PipeOps' $ids.
edges :: data.table with columns src_id (character), src_channel (character), dst_id (character), dst_channel (character)
Table of connections between the PipeOps. A data.table. src_id and dst_id are $ids of PipeOps that must be present in the $pipeops list. src_channel and dst_channel must respectively be $output and $input channel names of the respective PipeOps.
is_trained :: logical(1)
Is the Graph, i.e. are all of its PipeOps, trained, and can the Graph be used for prediction?
lhs :: character
Ids of the 'left-hand-side' PipeOps that have some unconnected input channels and therefore act as Graph input layer.
rhs :: character
Ids of the 'right-hand-side' PipeOps that have some unconnected output channels and therefore act as Graph output layer.
input :: data.table with columns name (character), train (character), predict (character), op.id (character), channel.name (character)
Input channels of the Graph. For each channel lists the name, input type during training, input type during prediction, PipeOp $id of the PipeOp the channel pertains to, and channel name as the PipeOp knows it.
output :: data.table with columns name (character), train (character), predict (character), op.id (character), channel.name (character)
Output channels of the Graph. For each channel lists the name, output type during training, output type during prediction, PipeOp $id of the PipeOp the channel pertains to, and channel name as the PipeOp knows it.
packages :: character
Set of all required packages for the various methods in the Graph, a set union of all required packages of all contained PipeOp objects.
state :: named list
Get / Set the $state of each of the PipeOps contained in the Graph.
param_set :: ParamSet
Parameters and parameter constraints. Parameter values are in $param_set$values. These are the union of the $param_sets of all PipeOps in the Graph. Parameter names as seen by the Graph have the naming scheme <PipeOp$id>.<PipeOp original parameter name>.
Changing $param_set$values also propagates the changes directly to the contained PipeOps and is an alternative to changing a PipeOp's $param_set$values directly.
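The naming scheme and the propagation of values can be sketched as follows. This is a minimal example assuming mlr3pipelines is installed and that PipeOpPCA exposes a parameter called rank. (so the Graph-level name is pca.rank.); the value 2 is arbitrary.

```r
library("mlr3pipelines")

g = po("scale") %>>% po("pca")

# Graph-level parameter names are prefixed with the PipeOp id.
g$param_set$ids()

# Setting a value through the Graph ...
g$param_set$values$pca.rank. = 2

# ... is visible on the contained PipeOp as well.
g$pipeops$pca$param_set$values$rank.
```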
hash :: character(1)
Stores a checksum calculated on the Graph configuration, which includes all PipeOp hashes (and therefore their $param_set$values) and a hash of $edges.
phash :: character(1)
Stores a checksum calculated on the Graph configuration, which includes all PipeOp hashes except their $param_set$values, and a hash of $edges.
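The difference between hash and phash can be checked with a small experiment. The sketch below assumes an mlr3pipelines version that provides $phash and a PipeOpPCA parameter named rank.; following the descriptions above, changing a parameter value should alter hash but leave phash unchanged.

```r
library("mlr3pipelines")

g = po("scale") %>>% po("pca")
hash_before = g$hash
phash_before = g$phash

# Change a parameter value of one contained PipeOp via the Graph.
g$param_set$values$pca.rank. = 2

# hash covers parameter values, so it is expected to differ now.
hash_changed = !identical(g$hash, hash_before)
# phash ignores parameter values, so it is expected to stay the same.
phash_same = identical(g$phash, phash_before)

c(hash_changed = hash_changed, phash_same = phash_same)
```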
keep_results :: logical(1)
Whether to store intermediate results in the PipeOp's $.result slot, mostly for debugging purposes. Default FALSE.
man :: character(1)
Identifying string of the help page that shows with help().
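For the keep_results field described above, a short debugging sketch might look like this. It assumes mlr3 and mlr3pipelines are installed and uses the "iris" task purely as sample data.

```r
library("mlr3")
library("mlr3pipelines")

g = po("scale") %>>% po("pca")

# Ask the Graph to keep each PipeOp's intermediate output.
g$keep_results = TRUE
g$train(tsk("iris"))

# Output of the "scale" step as it was passed on to "pca".
scale_result = g$pipeops$scale$.result
scale_result
```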
ids(sorted = FALSE)
(logical(1)) -> character
Get IDs of all PipeOps. These are in the order in which the PipeOps were added if sorted is FALSE, and topologically sorted if sorted is TRUE.
add_pipeop(op, clone = TRUE)
(PipeOp | Learner | Filter | ..., logical(1)) -> self
Mutates Graph by adding a PipeOp to the Graph. This does not add any edges, so the new PipeOp will not be connected within the Graph at first.
Instead of supplying a PipeOp directly, an object that can naturally be converted to a PipeOp can also be supplied, e.g. a Learner or a Filter; see as_pipeop().
The argument given as op is cloned if clone is TRUE (default); to access a Graph's PipeOps by-reference, use $pipeops.
Note that $add_pipeop() is a relatively low-level operation; it is recommended to build graphs using %>>%.
add_edge(src_id, dst_id, src_channel = NULL, dst_channel = NULL)
(character(1), character(1), character(1) | numeric(1) | NULL, character(1) | numeric(1) | NULL) -> self
Add an edge from PipeOp src_id, and its channel src_channel (identified by its name or number as listed in the PipeOp's $output), to PipeOp dst_id's channel dst_channel (identified by its name or number as listed in the PipeOp's $input).
If the source or destination PipeOp has only one output / input channel and src_channel / dst_channel are therefore unambiguous, they can be omitted (i.e. left as NULL).
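When a PipeOp has several channels, the channel arguments are needed. The sketch below wires a small parallel Graph by hand; it assumes mlr3pipelines is installed and that PipeOpCopy names its outputs "output1", "output2" and PipeOpFeatureUnion (with innum = 2) names its inputs "input1", "input2". In practice, building the same Graph with %>>% and gunion() is usually more convenient.

```r
library("mlr3pipelines")

g = Graph$new()$
  add_pipeop(po("copy", outnum = 2))$
  add_pipeop(po("scale"))$
  add_pipeop(po("pca"))$
  add_pipeop(po("featureunion", innum = 2))

# "copy" has two output channels, so src_channel must be given.
g$add_edge("copy", "scale", src_channel = "output1")
g$add_edge("copy", "pca", src_channel = "output2")

# "featureunion" has two input channels, so dst_channel must be given.
g$add_edge("scale", "featureunion", dst_channel = "input1")
g$add_edge("pca", "featureunion", dst_channel = "input2")

g$edges
```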
chain(gs, clone = TRUE)
(list of Graphs, logical(1)) -> self
Takes a list of Graphs or PipeOps (or objects that can be automatically converted into Graphs or PipeOps, see as_graph() and as_pipeop()) as inputs and joins them in a serial Graph coming after self, as if connecting them using %>>%.
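A minimal sketch of $chain(), assuming an mlr3pipelines version that provides this method; the PipeOps used here ("scale", "pca", "nop") are arbitrary choices for illustration.

```r
library("mlr3pipelines")

# Start from a one-element Graph and append two more steps in sequence.
g = as_graph(po("scale"))
g$chain(list(po("pca"), po("nop")))

chained_ids = g$ids(sorted = TRUE)
chained_ids
```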
plot(html)
(logical(1)) -> NULL
Plot the Graph, using either the igraph package (for html = FALSE, default) or the visNetwork package for html = TRUE, producing an htmlWidget. The htmlWidget can be rescaled using visOptions.
print(dot = FALSE, dotname = "dot", fontsize = 24L)
(logical(1), character(1), integer(1)) -> NULL
Print a representation of the Graph on the console. If dot is FALSE, output is a table with one row for each contained PipeOp and columns ID ($id of PipeOp), State (short representation of $state of PipeOp), sccssors (PipeOps that take their input directly from the PipeOp on this line), and prdcssors (the PipeOps that produce the data that is read as input by the PipeOp on this line). If dot is TRUE, print a DOT representation of the Graph on the console. The DOT output can be named via the argument dotname and the fontsize can also be specified.
set_names(old, new)
(character, character) -> self
Rename PipeOps: Change ID of each PipeOp as identified by old to the corresponding item in new. This should be used instead of changing a PipeOp's $id value directly!
update_ids(prefix = "", postfix = "")
(character, character) -> self
Pre- or postfix the PipeOps' existing ids. Both prefix and postfix default to "", i.e. no changes.
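Renaming through the Graph keeps the PipeOp list and the edge table consistent. The following sketch assumes mlr3pipelines is installed; the new names "standardize" and the prefix "prep." are made up for this example.

```r
library("mlr3pipelines")

g = po("scale") %>>% po("pca")

# Rename a single PipeOp; edges are updated along with the id.
g$set_names(old = "scale", new = "standardize")
g$ids()

# Add a common prefix to every PipeOp id in the Graph.
g$update_ids(prefix = "prep.")
g$ids()
g$edges
```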
train(input, single_input = TRUE)
(any, logical(1)) -> named list
Train the Graph by traversing the Graph's edges and calling all the PipeOps' $train methods in turn. Returns a named list of outputs for each unconnected PipeOp out-channel, named according to the name column of the Graph's $output. During training, the $state member of each PipeOp will be set and the $is_trained slot of the Graph (and of each individual PipeOp) will consequently be set to TRUE.
If single_input is TRUE, the input value will be sent to each unconnected PipeOp's input channel (as listed in the Graph's $input). Typically, input should be a Task, although this is dependent on the PipeOps in the Graph. If single_input is FALSE, then input should be a list with the same length as the Graph's $input table has rows; each list item will be sent to a corresponding input channel of the Graph. If input is a named list, names must correspond to input channel names ($input$name) and inputs will be sent to the channels by name; otherwise they will be sent to the channels in the order in which they are listed in $input.
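The single_input = FALSE case can be sketched with a Graph that has two unconnected input channels. This example assumes mlr3 and mlr3pipelines are installed and that the resulting channel names follow the <id>.<channel> pattern ("scale.input", "pca.input"); check g$input$name for the actual names in a given Graph.

```r
library("mlr3")
library("mlr3pipelines")

# Two unconnected PipeOps: the Graph has two input and two output channels.
g = Graph$new()$
  add_pipeop(po("scale"))$
  add_pipeop(po("pca"))
g$input$name

task = tsk("iris")

# Send one input per channel, matched by channel name.
out = g$train(
  list(scale.input = task, pca.input = task),
  single_input = FALSE
)
names(out)
```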
predict(input, single_input = TRUE)
(any, logical(1)) -> list of any
Predict with the Graph by calling all the PipeOps' $predict methods. Input and output, as well as the function of the single_input argument, are analogous to $train().
help(help_type)
(character(1)) -> help file
Displays the help file of the concrete PipeOp instance. help_type is one of "text", "html", "pdf" and behaves as the help_type argument of R's help().
Other mlr3pipelines backend related: PipeOp, PipeOpTargetTrafo, PipeOpTaskPreproc, PipeOpTaskPreprocSimple, mlr_graphs, mlr_pipeops, mlr_pipeops_updatetarget
library("mlr3")
library("mlr3pipelines")  # provides Graph, PipeOpScale and PipeOpPCA
g = Graph$new()$
add_pipeop(PipeOpScale$new(id = "scale"))$
add_pipeop(PipeOpPCA$new(id = "pca"))$
add_edge("scale", "pca")
g$input
g$output
task = tsk("iris")
trained = g$train(task)
trained[[1]]$data()
task$filter(1:10)
predicted = g$predict(task)
predicted[[1]]$data()