DataFrame-comparison: DataFrame comparison methods

DataFrame-comparison    R Documentation

DataFrame comparison methods


The DataFrame class provides methods to compare across rows of the DataFrame, including ordering and matching. Each DataFrame is effectively treated as a vector of rows.


## S4 method for signature 'DataFrame'
sameAsPreviousROW(x)

## S4 method for signature 'DataFrame,DataFrame'
match(x, table, nomatch = NA_integer_, incomparables = NULL, ...)

## S4 method for signature 'DataFrame'
order(..., na.last = TRUE, decreasing = FALSE, method = c("auto", 
    "shell", "radix"))

## S4 method for signature 'DataFrame,DataFrame'
pcompare(x, y)

## S4 method for signature 'DataFrame,DataFrame'
e1 == e2

## S4 method for signature 'DataFrame,DataFrame'
e1 <= e2


x, table, y, e1, e2

A DataFrame object.

nomatch, incomparables

See ?base::match.


...

For match, further arguments to pass to match.

For order, one or more DataFrame objects.

decreasing, na.last, method

See ?base::order.
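As a rough sketch of how these arguments behave, the following passes nomatch to match and decreasing to order. It assumes the S4Vectors package is installed; the records and query objects and their column names (id, version) are hypothetical and chosen only for illustration.

```r
library(S4Vectors)

# Hypothetical records; every row is distinct.
records <- DataFrame(
    id=c(2L, 1L, 2L, 1L),
    version=c(2L, 1L, 1L, 2L)
)
query <- DataFrame(id=c(1L, 9L), version=c(1L, 9L))

# Rows of 'query' that are absent from 'records' get 0 instead of NA.
hits <- match(query, records, nomatch=0L)

# Row order from largest to smallest, comparing 'id' first, then 'version'.
rev_order <- order(records, decreasing=TRUE)
```

Here hits holds one position per row of query, and rev_order can be used to subset records, e.g., records[rev_order, ].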


The treatment of a DataFrame as a “vector of rows” is useful in many cases, e.g., when each row is a record that needs to be ordered or matched. The methods provided here allow the use of all methods described in ?"Vector-comparison", including sorting, matching, de-duplication, and so on.
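The row-wise view can be sketched with a sort followed by a manual de-duplication. This is an illustration only, assuming S4Vectors is installed; the records object is hypothetical, and the de-duplication step is one possible approach built from order and sameAsPreviousROW, not necessarily how the package does it internally.

```r
library(S4Vectors)

# Hypothetical records with one repeated row.
records <- DataFrame(
    A=c("b", "a", "b", "a"),
    B=c(1L, 1L, 2L, 1L)
)

# Sort the rows: 'A' is compared first, then 'B' breaks ties.
sorted <- records[order(records), ]

# After sorting, identical rows are adjacent, so dropping each row that
# equals its predecessor leaves one copy of every distinct row.
deduped <- sorted[!sameAsPreviousROW(sorted), ]
```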

Careful readers will notice this behaviour differs from the usual semantics of a data.frame, which acts as a list-like vector of columns. This discrepancy rarely causes problems, as it is not particularly common to compare columns of a data.frame in the first place.
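The difference in semantics can be seen by running the same comparison on a base data.frame and on a DataFrame. This is a small sketch assuming S4Vectors is installed; the df and DF objects and their columns are hypothetical.

```r
library(S4Vectors)

df <- data.frame(id=c(1L, 2L, 3L), score=c(10L, 20L, 30L))
DF <- DataFrame(df)

# Base data.frame: '==' compares cell by cell and returns a matrix
# with one column per data.frame column.
cellwise <- df == df

# DataFrame: '==' compares whole rows and returns one value per row.
rowwise <- DF == DF
```

The first result has the dimensions of the table, while the second has one entry per row.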

Note that a match method for DataFrame objects is explicitly defined to avoid calling the corresponding method for List objects, which would yield the (undesired) list-like semantics. The same rationale is behind the explicit definition of <= and == despite the availability of pcompare.
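A short sketch of the parallel comparison operators, assuming S4Vectors is installed. The x and y objects and their columns (major, minor) are hypothetical. Rows are assumed to be compared column by column, with earlier columns taking precedence, in line with the behaviour of order.

```r
library(S4Vectors)

x <- DataFrame(major=c(1L, 1L, 2L), minor=c(0L, 5L, 1L))
y <- DataFrame(major=c(1L, 1L, 1L), minor=c(0L, 7L, 9L))

# One logical value per row, not per column.
is_equal <- x == y
is_le <- x <= y

# pcompare() gives a negative, zero or positive value per row;
# sign() reduces it to -1, 0 or 1.
cmp <- sign(pcompare(x, y))
```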


For sameAsPreviousROW: see ?sameAsPreviousROW for the return value.

For match: see ?match for the return value.

For order: see ?order for the return value.

For pcompare, == and <=: see ?pcompare for the return value.


Aaron Lun


# Mocking up a DataFrame.
DF <- DataFrame(
    A=sample(LETTERS, 100, replace=TRUE),
    B=sample(5, 100, replace=TRUE)
)

# Matching:
match(DF, DF[1:10,])

# Ordering, alone and with other vectors:
order(DF, runif(nrow(DF)))

# Parallel comparison:
DF == DF

Bioconductor/S4Vectors documentation built on Feb. 11, 2025, 11:31 a.m.