Looks in several locations for the files name.train.csv and name.test.csv.
If they are found, it loads (or downloads) and parses them, and returns a liquidData object.
The files may also be gzipped, in which case they are named name.train.csv.gz and name.test.csv.gz.
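The lookup described above can be sketched in base R. This is only an illustration under stated assumptions: `find_data_files` is a hypothetical helper, not a liquidSVM function, it checks local directories only (no download from a URL), and the order in which it tries plain and gzipped files is a guess rather than the package's documented behaviour.

```r
# Hypothetical helper (not part of liquidSVM): return the paths of the train
# and test files from the first directory in `loc` that holds both,
# or NULL if no directory does.
find_data_files <- function(name, loc = c(".", "~/liquidData")) {
  for (dir in loc) {
    for (ext in c(".csv", ".csv.gz")) {
      train <- file.path(dir, paste0(name, ".train", ext))
      test  <- file.path(dir, paste0(name, ".test", ext))
      if (file.exists(train) && file.exists(test)) {
        return(list(train = train, test = test))
      }
    }
  }
  NULL
}

# Demo: write a tiny gzipped data set into a temporary directory.
demo_dir <- file.path(tempdir(), "liquidData-demo")
dir.create(demo_dir, showWarnings = FALSE)
toy <- data.frame(y = c(1, -1, 1), x1 = c(0.1, 0.2, 0.3))
for (part in c("train", "test")) {
  con <- gzfile(file.path(demo_dir, paste0("toy.", part, ".csv.gz")), "w")
  write.table(toy, con, sep = ",", row.names = FALSE, col.names = FALSE)
  close(con)
}

# The first location does not exist, so the second one is used.
found <- find_data_files("toy", loc = c(file.path(demo_dir, "missing"), demo_dir))

# read.csv() decompresses gzipped files transparently.
train <- read.csv(found$train, header = FALSE)
```

Searching the locations in order means a local copy takes precedence over anything listed later in `loc`.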
liquidData(name, factor_cols, header = FALSE, loc = c(".",
"~/liquidData", system.file("data", package = "liquidSVM"),
"../../../data", "https://www.isa.uni-stuttgart.de/liquidData"),
prob = NULL, testSize = NULL, trainSize = NULL,
stratified = NULL)
ttsplit(data, target = NULL, testProb = 0.2, testSize = NULL,
stratified = NULL)
sample.liquidData(liquidData, prob = 0.2, trainSize = NULL,
testSize = NULL, stratified = NULL)
## S3 method for class 'liquidData'
print(x, ...)
name: name of the data set. If not given, a list of the available names in loc is returned.
factor_cols: list of column numbers that are factors (or list of header names, if the data files have headers)
header: whether the data files have headers
loc: vector of locations where the data should be searched for
prob: probability of a sample being put into the test set
testSize: size of the test set. If stratified, this will only be approximately fulfilled.
trainSize: size of the train set. If stratified, this will only be approximately fulfilled.
stratified: whether sampling should be done separately in every bin defined by the unique values of the target column. Can also be the index or name of the column in […]
data: the given data set
target: optional name or index of the target variable. If both this and […]
testProb: probability of a sample being put into the test set
liquidData: the given liquidData
x: the model to print
...: other arguments to print.default
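To illustrate what stratified sampling means here, the following base R sketch draws the test set separately within each value of the target column. `stratified_test_index` is a hypothetical helper written for this example and is not how liquidSVM necessarily implements it. Rounding per bin is one reason a requested size can only be met approximately.

```r
# Hypothetical helper (not liquidSVM code): pick roughly `prob` of the rows
# for the test set, separately within each level of the column `target`.
stratified_test_index <- function(data, target, prob = 0.2) {
  bins <- split(seq_len(nrow(data)), data[[target]])
  idx <- lapply(bins, function(rows) {
    n_test <- round(prob * length(rows))
    # sample positions, then map back to row numbers
    rows[sample.int(length(rows), n_test)]
  })
  sort(unlist(idx, use.names = FALSE))
}

set.seed(1)
test_idx <- stratified_test_index(iris, "Species", prob = 0.2)
test  <- iris[test_idx, ]
train <- iris[-test_idx, ]

table(test$Species)  # 10 rows from each of the three species
```

Without stratification, a plain random draw of 30 rows from iris would usually give unequal counts per species.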
If name is specified, a liquidData object: an environment with $train and $test data sets as well as $name and, optionally, $target, the name of the target variable.
If no name is specified, a character vector of the available names in loc.
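Because the returned object is an environment, it has reference semantics in R. The sketch below builds a stand-in environment by hand with the fields listed above. It is not produced by liquidSVM, and a real liquidData object may carry additional attributes. The function `shrink` is hypothetical and only demonstrates that changes made inside a function are visible to the caller.

```r
# Hand-built stand-in with the documented fields (assumption: a real
# liquidData object may contain more than this).
d <- new.env()
d$name   <- "iris-demo"
d$target <- "Species"
d$train  <- iris[1:120, ]
d$test   <- iris[121:150, ]

# Hypothetical function: modifies the environment in place, no copy is made.
shrink <- function(obj, n) {
  obj$train <- head(obj$train, n)
  invisible(obj)
}

shrink(d, 50)
nrow(d$train)  # 50: the caller's object was changed
nrow(d$test)   # 30: untouched
```

With a list or data frame, the same call would have left the caller's copy unchanged.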
banana <- liquidData('banana-mc')
## to get a smaller sample
liquidData('banana-mc',prob=0.2)
## if you disable stratified then there is some variance in the group sizes:
liquidData('banana-mc',prob=0.2, stratified=FALSE)
## Not run:
## to download a file from our web directory
liquidData("gisette")
## To get a list of available names:
liquidData()
## End(Not run)
## to produce a liquidData from some dataset
ttsplit(iris)
# the following will be stratified
ttsplit(iris,'Species')
# specify a testSize:
ttsplit(trees, testSize=10)
## example for sample.liquidData
banana <- liquidData('banana-mc')
sample.liquidData(banana, prob=0.1)
# this is equivalent to
liquidData('banana-mc', prob=0.1)
## example for print
banana <- liquidData("banana-mc")
print(banana)