train_test_split: Split data into training and test sets

Description

Randomly splits a feature matrix or data.frame and its corresponding response vector into training and test subsets.

Usage

train_test_split(X, y, test_size = 0.2, seed = NULL)

Arguments

X

A matrix or data.frame of features.

y

A vector of responses (numeric or factor). Its length must equal the number of rows of X.

test_size

Proportion of observations to assign to the test set: a single number strictly between 0 and 1. Default is 0.2 (an 80/20 train/test split).

seed

An optional integer random seed for reproducibility. If NULL (the default), the current RNG state is used, so repeated calls will generally produce different splits.
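The effect of seed can be illustrated with base R alone. The sketch below does not call train_test_split; it assumes the function draws row indices with something like sample.int(), which this page does not confirm, and shows that reseeding with the same value reproduces the same draw.

```r
# Base R only: sample.int() stands in for the random row selection
# that a train/test split needs (an assumption, not the package source).
n <- 150

set.seed(42)
first <- sample.int(n, size = 45)

set.seed(42)
second <- sample.int(n, size = 45)

# Same seed, same draw.
identical(first, second)  # TRUE

# Without reseeding, the next draw continues from the advanced RNG state,
# which is the situation seed = NULL describes.
third <- sample.int(n, size = 45)
```

Passing a fixed seed is therefore the usual way to make a split repeatable across sessions.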

Value

A named list with four elements:

X_train

Training features (same type as X).

X_test

Test features (same type as X).

y_train

Training response.

y_test

Test response.
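A minimal sketch of how a function with this interface could be written in base R follows. The name split_sketch is hypothetical, and the rounding rule for the test-set size and the use of set.seed() are assumptions; the actual unifiedml implementation may differ.

```r
# Hypothetical re-implementation for illustration; not the package source.
split_sketch <- function(X, y, test_size = 0.2, seed = NULL) {
  stopifnot(is.matrix(X) || is.data.frame(X))
  stopifnot(length(y) == nrow(X))
  stopifnot(is.numeric(test_size), length(test_size) == 1,
            test_size > 0, test_size < 1)

  # Assumption: a non-NULL seed is applied with set.seed().
  if (!is.null(seed)) set.seed(seed)

  n <- nrow(X)
  n_test <- round(n * test_size)  # assumption: round to nearest integer

  # Logical mask marking the rows that go to the test set.
  is_test <- seq_len(n) %in% sample.int(n, size = n_test)

  # drop = FALSE keeps X a matrix or data.frame even with a single column.
  list(
    X_train = X[!is_test, , drop = FALSE],
    X_test  = X[is_test, , drop = FALSE],
    y_train = y[!is_test],
    y_test  = y[is_test]
  )
}

s <- split_sketch(iris[, 1:4], iris$Species, test_size = 0.3, seed = 42)
dim(s$X_train)  # 105 4
names(s)        # "X_train" "X_test" "y_train" "y_test"
```

Using a logical mask rather than negative indices keeps the training subset correct even in the edge case where the rounded test size is zero.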

Examples

# matrix input
X <- as.matrix(iris[, 1:4])  # iris[, 1:4] alone is a data.frame, so convert
y <- iris$Species
d <- unifiedml::train_test_split(X, y, test_size = 0.3, seed = 42)
dim(d$X_train)  # 105 x 4
dim(d$X_test)   #  45 x 4

# data.frame input
d2 <- unifiedml::train_test_split(iris[, 1:4], iris$Species, test_size = 0.2)
is.data.frame(d2$X_train)  # TRUE


unifiedml documentation built on May 5, 2026, 9:06 a.m.