listOMLDataSets: List the first 5000 OpenML data sets.


Description

The returned data.frame contains the data set id “data.id”, the “status” (“active”, “deactivated”, “in_preparation”), and data qualities describing each data set. Note that by default only the first 5000 data sets are returned (because of the default argument “limit = 5000”).

Usage

listOMLDataSets(number.of.instances = NULL, number.of.features = NULL,
  number.of.classes = NULL, number.of.missing.values = NULL,
  tag = NULL, data.name = NULL, limit = 5000, offset = NULL,
  status = "active", verbosity = NULL)

Arguments

number.of.instances

[numeric(1) | numeric(2)]
If not NULL, restricts the results to data sets with the given number of instances or, if a vector of length 2 is passed, to those whose number of instances lies in the given range.

number.of.features

[numeric(1) | numeric(2)]
If not NULL, restricts the results to data sets with the given number of features or, if a vector of length 2 is passed, to those whose number of features lies in the given range.

number.of.classes

[numeric(1) | numeric(2)]
If not NULL, restricts the results to data sets with the given number of classes or, if a vector of length 2 is passed, to those whose number of classes lies in the given range.

number.of.missing.values

[numeric(1) | numeric(2)]
If not NULL, restricts the results to data sets with the given number of missing values or, if a vector of length 2 is passed, to those whose number of missing values lies in the given range.

tag

[character]
If not NULL, only entries with the corresponding tags are listed.

data.name

[character(1)]
Name of the data set.

limit

[numeric(1)]
Optional. The maximum number of entries to return. Without specifying offset, it returns the first 'limit' entries. Setting limit = NULL returns all available entries.

offset

[numeric(1)]
Optional. The position in the result list to start from. Positions are zero-based indices and do not refer to data set IDs. The offset is ignored when no limit is given.

status

[character]
Subsets the results according to the status. Possible values are {"active", "deactivated", "in_preparation", "all"}. Default is "active".

verbosity

[integer(1)]
Print verbose output on console? Possible values are:
0: normal output,
1: info output,
2: debug output.
Default is set via setOMLConfig.
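
The limit and offset arguments described above can be combined to walk through the listing in fixed-size pages. The following is a minimal sketch and not part of the OpenML package: fetch_all_pages and its parameters fetch and page.size are hypothetical names, and the stub stands in for the server so that the code runs offline. The package also lists chunkOMLlist among its listing functions, which may already cover this use case; its interface is not documented on this page.

```r
# Hypothetical helper (not part of OpenML): collect all pages of a listing.
# `fetch` is any function taking `limit` and `offset` and returning a data.frame.
fetch_all_pages <- function(fetch, page.size = 1000) {
  pages <- list()
  offset <- 0  # zero-based position in the listing, not a data set ID
  repeat {
    page <- fetch(limit = page.size, offset = offset)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < page.size) break  # short page: nothing left to fetch
    offset <- offset + page.size
  }
  do.call(rbind, pages)
}

# Offline stand-in for the server: 25 fake entries.
fake <- data.frame(data.id = 1:25)
stub <- function(limit, offset) {
  rows <- seq_len(nrow(fake))
  fake[rows > offset & rows <= offset + limit, , drop = FALSE]
}

all.sets <- fetch_all_pages(stub, page.size = 10)
nrow(all.sets)
```

With the package installed and configured, the stub could presumably be replaced by function(limit, offset) listOMLDataSets(limit = limit, offset = offset).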

Value

[data.frame].

Note

This function is memoised: if you call it twice in a running R session, the first call queries the server and stores the result in memory, while the second and all subsequent calls return the cached result of the first call. You can reset the cache by calling forget on the function manually.
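
To illustrate what memoisation implies in practice, here is a minimal base-R sketch of a cached function with a manual reset. It is an illustrative stand-in, not the OpenML implementation; make_memoised, slow_query and counter are hypothetical names. For the real function, the reset presumably goes through the forget function of the memoise package, e.g. memoise::forget(listOMLDataSets), but that is an assumption based on the note above.

```r
# Illustrative stand-in for a memoised function; NOT how OpenML implements it.
make_memoised <- function(f) {
  cache <- list()
  list(
    call = function(key) {
      k <- as.character(key)
      # Only evaluate f on a cache miss; otherwise reuse the stored result.
      if (is.null(cache[[k]])) cache[[k]] <<- f(key)
      cache[[k]]
    },
    forget = function() cache <<- list()  # drop all cached results
  )
}

# Count how often the "server" is actually queried.
counter <- new.env()
counter$n <- 0
slow_query <- function(limit) {
  counter$n <- counter$n + 1
  seq_len(limit)
}

m <- make_memoised(slow_query)
first <- m$call(3)
second <- m$call(3)          # served from the cache, no new query
calls.before <- counter$n
m$forget()                   # reset the cache
third <- m$call(3)           # queries again
calls.after <- counter$n
```

The practical consequence is that data sets added on the server after the first call will not appear in later calls within the same session until the cache is reset.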

See Also

Other listing functions: chunkOMLlist, listOMLDataSetQualities, listOMLEstimationProcedures, listOMLEvaluationMeasures, listOMLFlows, listOMLRuns, listOMLSetup, listOMLStudies, listOMLTaskTypes, listOMLTasks

Other data set-related functions: OMLDataSetDescription, OMLDataSet, convertMlrTaskToOMLDataSet, convertOMLDataSetToMlr, deleteOMLObject, getOMLDataSet, tagOMLObject, uploadOMLDataSet

Examples

## Not run:
datasets = listOMLDataSets()
tail(datasets)
## End(Not run)
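
The filter arguments can be combined in a single call. The sketch below is a hedged illustration rather than an official example: find_small_binary_sets and its lister parameter are hypothetical names, and the listing function is passed in as an argument so that the wrapper can be defined and inspected without a server connection. Only argument names from the Usage section are used; whether range endpoints are inclusive is not stated on this page.

```r
# Hypothetical wrapper (not part of OpenML). `lister` is expected to be
# a function with the same arguments as listOMLDataSets.
find_small_binary_sets <- function(lister) {
  lister(
    number.of.instances = c(100, 1000),  # length-2 vector: a range
    number.of.classes = 2,               # single value: exact match
    number.of.missing.values = 0,
    status = "active",
    limit = 100
  )
}

# Offline check: a stub that simply echoes the arguments it receives.
passed <- find_small_binary_sets(function(...) list(...))
names(passed)

# With the package installed and an API key configured, one would call:
# small <- find_small_binary_sets(OpenML::listOMLDataSets)
# head(small)
```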

Example output

Loading required package: mlr
Loading required package: ParamHelpers
Please use the 'setOMLConfig' or 'saveOMLConfig' function to set the API key.
You can generate the API key from your OpenML account at http://www.openml.org/u#!api

OpenML documentation built on Sept. 21, 2019, 5:02 p.m.