sset: Cheaper subset

sset    R Documentation

Description

Cheaper alternative to [ that consistently subsets data frame rows, always returning a data frame. There are explicit methods for enhanced data frames like tibbles, data.tables and sf.

Usage

sset(x, ...)

## S3 method for class 'data.frame'
sset(x, i = NULL, j = NULL, ...)

## S3 method for class 'tbl_df'
sset(x, i = NULL, j = NULL, ...)

## S3 method for class 'POSIXlt'
sset(x, i = NULL, j = NULL, ...)

## S3 method for class 'data.table'
sset(x, i = NULL, j = NULL, ...)

## S3 method for class 'sf'
sset(x, i = NULL, j = NULL, ...)

Arguments

x

Vector or data frame.

...

Further parameters passed to [.

i

A logical vector or a vector of indices.

j

Column indices, names or logical vector.

Details

sset is an S3 generic. You can either write methods for sset or [.
sset will fall back on using [ when no suitable method is found.

In more detail, when sset() is used on a data frame, a new list is always allocated through new_list().

Difference to base R

When i is a logical vector, it is passed directly to which_().
This has two consequences: NA values in i are ignored, and i is not recycled, so it is good practice to make sure the logical vector has the same length as x. To return NA values, use sset(x, NA_integer_).
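
The two consequences can be seen in a minimal base R sketch. It assumes that base which() is a fair stand-in for cheapr's which_() for this purpose; the variable names are illustrative only. The final sset() calls run only if cheapr is installed.

```r
x <- c(10, 20, 30, 40)
i <- c(TRUE, NA, FALSE, TRUE)

# Base R keeps an NA placeholder for each NA in a logical index
base_na <- x[i]              # 10 NA 40

# Converting the index with which() first drops the NA positions
which_na <- x[which(i)]      # 10 40

# Base R recycles a short logical index to the length of x ...
base_recycled <- x[TRUE]     # 10 20 30 40

# ... whereas which() turns it into the single position 1
which_short <- x[which(TRUE)]  # 10

# With cheapr installed, the same inputs can be passed to sset()
if (requireNamespace("cheapr", quietly = TRUE)) {
  print(cheapr::sset(x, i))
  print(cheapr::sset(x, NA_integer_))
}
```
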

ALTREP range subsetting

When i is an ALTREP compact sequence, commonly created with e.g. 1:10, seq_len, seq_along or seq.int, sset internally uses a range-based subsetting method, which is faster and does not allocate i into memory.
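
As a rough illustration, the sketch below builds the same index twice: once as a compact sequence and once as an ordinary integer vector. It assumes R 3.5.0 or later, where 1:n is stored compactly; the names i_compact and i_plain are illustrative only. The two indices hold identical values, so only the internal representation, and therefore the code path sset can take, differs.

```r
x <- rnorm(10^4)

# 1:1000 is stored as an ALTREP compact sequence (R >= 3.5.0)
i_compact <- 1:1000

# c() builds an ordinary integer vector holding the same values
i_plain <- c(1L, 2:1000)

# Same values, so base subsetting returns the same elements
same_index <- identical(i_compact, i_plain)
same_subset <- identical(x[i_compact], x[i_plain])

# With cheapr installed, both forms can be passed to sset()
if (requireNamespace("cheapr", quietly = TRUE)) {
  print(identical(cheapr::sset(x, i_compact), cheapr::sset(x, i_plain)))
}
```
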

Value

A new vector, data frame, list, matrix or other R object.

Examples

library(cheapr)
library(bench)

# Selecting columns
sset(airquality, j = "Temp")
sset(airquality, j = 1:2)

# Selecting rows
sset(iris, 1:5)

# Rows and columns
sset(iris, 1:5, 1:5)
sset(iris, iris$Sepal.Length > 7, c("Species", "Sepal.Length"))

# Comparison against base
x <- rnorm(10^4)

mark(x[1:10^3], sset(x, 1:10^3))
mark(x[x > 0], sset(x, x > 0))

df <- data.frame(x = x)

mark(df[df$x > 0, , drop = FALSE],
     sset(df, df$x > 0),
     check = FALSE) # Row names are different


cheapr documentation built on April 4, 2025, 4:25 a.m.