h2o.splitFrame: Split an H2O Data Set


View source: R/frame.R

Description

Split an existing H2O data set according to user-specified ratios. The number of subsets is always one more than the number of given ratios. Note that this does not give an exact split: to stay efficient on big data, H2O uses a probabilistic splitting method rather than an exact one. For example, when specifying a split of 0.75/0.25, H2O will produce two subsets whose expected sizes are 0.75/0.25 of the rows rather than exactly 0.75/0.25. On small data sets, the sizes of the resulting splits will deviate from the expected value more than on big data, where they will be very close to exact.
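To make the effect of data set size concrete, here is a minimal base R sketch of a probabilistic split. It only illustrates the general idea (one independent uniform draw per row, bucketed by cumulative ratios) and is not taken from the H2O source; probabilistic_split is a hypothetical helper and does not require an H2O cluster.

```r
# Illustration only: probabilistic_split is a hypothetical helper,
# not part of the h2o package and not H2O's actual implementation.
probabilistic_split <- function(n_rows, ratios, seed = 42) {
  stopifnot(all(ratios > 0), sum(ratios) < 1)
  set.seed(seed)
  u <- runif(n_rows)                   # one uniform draw per row
  breaks <- c(0, cumsum(ratios), 1)    # bucket boundaries on [0, 1]
  # Assign each row to the bucket that its draw falls into.
  bucket <- findInterval(u, breaks, rightmost.closed = TRUE)
  # Count rows per bucket; there is one more bucket than ratios.
  tabulate(bucket, nbins = length(ratios) + 1)
}

small <- probabilistic_split(150, 0.75)   # iris-sized data
large <- probabilistic_split(1e6, 0.75)   # "big data" sized

small / sum(small)   # realized fractions, small data
large / sum(large)   # realized fractions, large data
```

Under these assumptions, the realized fractions for the large data set should sit much closer to 0.75/0.25 than those for the small one, because the relative spread of the counts shrinks as the number of rows grows.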

Usage

h2o.splitFrame(data, ratios = 0.75, destination_frames, seed = -1)

Arguments

data

An H2OFrame object representing the data set to split.

ratios

A numeric value or array indicating the ratio of total rows contained in each split. The ratios must sum to less than 1.

destination_frames

An array of frame IDs whose length equals the number of ratios specified plus one.

seed

Random seed.
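The relationship between ratios and the number of splits can be sketched in plain R. The snippet assumes that the final split receives whatever fraction the ratios leave over, which is why they must sum to less than 1; expected_fractions is a hypothetical helper, not an h2o function.

```r
# Illustration only: expected_fractions is a hypothetical helper,
# not part of the h2o package.
expected_fractions <- function(ratios) {
  stopifnot(is.numeric(ratios), all(ratios > 0), sum(ratios) < 1)
  # Assumption: the last split takes the fraction left over by the ratios.
  c(ratios, 1 - sum(ratios))
}

fractions <- expected_fractions(c(0.2, 0.5))
length(fractions)   # number of splits = number of ratios + 1
fractions           # expected share of rows in each split
```

By the same count, a destination_frames argument for these two ratios would need three frame IDs.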

Value

Returns a list of split H2OFrames.

Examples

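library(h2o)
h2o.init()  # start or connect to a local H2O cluster
# Import the iris data set bundled with the h2o package
irisPath <- system.file("extdata", "iris.csv", package = "h2o")
iris.hex <- h2o.importFile(path = irisPath)
# Two ratios yield three splits
iris.split <- h2o.splitFrame(iris.hex, ratios = c(0.2, 0.5))
# Inspect the first split
head(iris.split[[1]])
summary(iris.split[[1]])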

h2o documentation built on Sept. 25, 2018, 5:07 p.m.