liquidData: Loads or downloads training and testing data
In liquidSVM: A Fast and Versatile SVM Package


This looks in several locations for the files name.train.csv and name.test.csv. If it finds them, it loads (or downloads) and parses them, and returns a liquidData object. The files can also be gzipped, in which case they are named name.train.csv.gz and name.test.csv.gz.
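The lookup described above can be pictured with a small base-R sketch. It is only an illustration under stated assumptions, not the package's implementation: `find_data_files` is a hypothetical helper, it checks local directories only (the web location in `loc` is ignored), and it assumes the plain and gzipped variants are tried in that order.

```r
# Hypothetical helper, NOT part of liquidSVM: it only illustrates the idea of
# scanning a vector of local directories for <name>.train.csv / <name>.test.csv
# (or their .gz variants) and returning the first matching pair.
find_data_files <- function(name, loc = c(".", "~/liquidData")) {
  for (d in loc) {
    for (ext in c(".csv", ".csv.gz")) {
      train <- file.path(d, paste0(name, ".train", ext))
      test  <- file.path(d, paste0(name, ".test", ext))
      if (file.exists(train) && file.exists(test)) {
        return(list(train = train, test = test))
      }
    }
  }
  NULL  # no matching pair in any of the given directories
}

# Usage: write a tiny headerless data set to a temporary directory, then find it.
tmp <- tempfile("liquidData-demo")
dir.create(tmp)
write.table(iris[1:10, 1:4], file.path(tmp, "demo.train.csv"),
            sep = ",", row.names = FALSE, col.names = FALSE)
write.table(iris[11:15, 1:4], file.path(tmp, "demo.test.csv"),
            sep = ",", row.names = FALSE, col.names = FALSE)

found <- find_data_files("demo", loc = tmp)
train <- read.csv(found$train, header = FALSE)  # header = FALSE mirrors the default above
```

Returning NULL when nothing is found keeps the sketch simple; the real function additionally handles downloading and parsing, which are not modelled here.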

liquidData(name, factor_cols, header = FALSE, loc = c(".",
  "~/liquidData", system.file("data", package = "liquidSVM"),
  "../../../data", "https://www.isa.uni-stuttgart.de/liquidData"),
  prob = NULL, testSize = NULL, trainSize = NULL,
  stratified = NULL)

ttsplit(data, target = NULL, testProb = 0.2, testSize = NULL,
  stratified = NULL)

sample.liquidData(liquidData, prob = 0.2, trainSize = NULL,
  testSize = NULL, stratified = NULL)

## S3 method for class 'liquidData'
print(x, ...)

`name`	name of the data set. If not given, the names available in `loc` are returned instead.
`factor_cols`	list of column numbers that are factors (or list of header names, if `header=TRUE`)
`header`	whether the data files have a header row
`loc`	vector of locations where the data should be searched for
`prob`	probability of sample being put into test set
`testSize`	size of the test set. If stratified, this will only be approximately fulfilled.
`trainSize`	size of the train set. If stratified, this will only be approximately fulfilled.
`stratified`	whether sampling should be done separately in every bin defined by the unique values of the target column. Can also be the index or name of the column in `data` that should be used to define the bins.
`data`	the given data set
`target`	optional name or index of the target variable. If both this and `stratified` are not specified there will be no stratification.
`testProb`	probability of sample being put into test set
`liquidData`	the given liquidData
`x`	the liquidData object to print
`...`	other arguments to print.default
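To make the `stratified` argument concrete, here is a minimal base-R sketch of stratified sampling. It is an assumption-laden illustration, not liquidSVM's code: `stratified_test_idx` is a hypothetical helper, and it draws a fixed fraction from each bin rather than sampling each row independently with probability `prob`.

```r
set.seed(1)

# Hypothetical helper, NOT part of liquidSVM: draw a fraction `prob` of the
# row indices separately within every bin defined by the values of `target`.
stratified_test_idx <- function(target, prob = 0.2) {
  groups <- split(seq_along(target), target)
  unlist(lapply(groups, function(idx) {
    idx[sample.int(length(idx), size = round(prob * length(idx)))]
  }), use.names = FALSE)
}

test_idx <- stratified_test_idx(iris$Species, prob = 0.2)
test  <- iris[test_idx, ]
train <- iris[-test_idx, ]

table(test$Species)  # 10 rows per species, because each class has 50 rows
```

Sampling within each bin keeps the class proportions of the test set equal to those of the full data, which is why a requested `testSize` or `trainSize` can only be met approximately when stratification is on.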

If name is specified, a liquidData object: an environment with $train and $test datasets as well as $name and optionally $target, the name of the target variable. If no name is specified, a character vector of the names available in loc.

See also: ttsplit

banana <- liquidData('banana-mc')

## to get a smaller sample
liquidData('banana-mc',prob=0.2)
## if you disable stratified then there is some variance in the group sizes:
liquidData('banana-mc',prob=0.2, stratified=FALSE)

## Not run: 
## to download a file from our web directory

liquidData("gisette")

## To get a list of available names:
liquidData()

## End(Not run)
## to produce a liquidData object from an existing data set
ttsplit(iris)
# the following will be stratified
ttsplit(iris,'Species')

# specify a testSize:
ttsplit(trees, testSize=10)
## example for sample.liquidData
banana <- liquidData('banana-mc')
sample.liquidData(banana, prob=0.1)
# this is equivalent to
liquidData('banana-mc', prob=0.1)
## example for print
banana <- liquidData("banana-mc")
print(banana)