as.keyed: Create and Manipulate Keyed Data Frames

Description

The class keyed is a subclass of data.frame with a key attribute. The key is a vector of column names which, taken together, should provide enough information to uniquely distinguish each row. Specific functions and methods take advantage of this information.

Usage

## S3 method for class 'keyed'
x[i, j, drop]
## S3 method for class 'keyed'
aggregate(x, by=x[,setdiff(key(x),across),drop=FALSE], FUN, across=character(0), ...)
## S3 method for class 'data.frame'
as.keyed(x, key=character(0), ...)
dupKeys(x, ...)
key(x, ...)
key(x) <- value
## S3 method for class 'keyed'
merge(x, y, ...)
naKeys(x, ...)
unsorted(x, ...)
## S3 method for class 'keyed.summary'
print(x, ...)
## S3 method for class 'keyed'
sort(x, decreasing = FALSE, ...)
## S3 method for class 'keyed'
summary(object, ...)
## S3 method for class 'keyed'
transform(`_data`, ...)
## S3 method for class 'keyed'
uniKey(x,key=NULL,...)

Arguments

x

a (keyed) data.frame

i

first index

j

second index

drop

whether to drop unused dimensions

by

a named list of grouping vectors (e.g., a data.frame), each as long as nrow(x), whose interaction defines the aggregates (groups)

FUN

an aggregating function

across

column names in key(x) across which to aggregate; see details

...

extra arguments, usually ignored, but passed to FUN in aggregate

key

a character vector of column names in x that should uniquely distinguish each row

value

a key (character vector of column names)

y

the right argument in the merge

decreasing

(coercible to) logical; length 1

object

a keyed data.frame

_data

a keyed data.frame

Details

The generic as.keyed is the usual way of creating a keyed object. The method as.keyed.data.frame calls key<-. The function key reports an object's key. A keyed data.frame can be re-keyed by a subsequent call. Generally, a data.frame should be keyed on columns that actually exist, but this is not enforced. The method as.data.frame.keyed removes the key and reverts the class.
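As a rough illustration of the structure described above, the following base R sketch builds an object with class c('keyed','data.frame') and a key attribute by hand, then reverts it. It does not use metrumrg; the attribute name "key" and the manual construction are assumptions made for illustration, and as.keyed remains the intended interface.

```r
# Hand-built stand-in for a keyed data.frame (illustration only; assumes the
# key is stored in an attribute named "key").
df <- data.frame(id = c(1, 2, 3), y = c(0.1, 0.2, 0.3))
k <- structure(df, key = "id", class = c("keyed", "data.frame"))

key_cols <- attr(k, "key")          # the key: a character vector of column names
is_df <- inherits(k, "data.frame")  # still usable wherever a data.frame is expected

# Reverting: drop the key and restore the plain data.frame class.
plain <- k
attr(plain, "key") <- NULL
class(plain) <- "data.frame"
```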

In aggregate.keyed, the default behavior is to aggregate by the key columns, i.e., to eliminate duplicate keys by aggregation. by can be specified arbitrarily, but must be a named list (e.g., a data.frame) with each element as long as nrow(x). Each element in by will displace any like-named element in x, and names(by) will serve as the key of the result. If by has length zero (as it does by default when across is key(x)), the entire data set is aggregated into a one-row data.frame.

across is a convenience argument to aggregate.keyed. If specified, it must be a subset of (or all of) key(x). Columns indicated by across are dropped from x and from the default by value, and aggregation proceeds irrespective of those columns.
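The effect of across can be approximated in base R. The sketch below is not the aggregate.keyed implementation; it assumes a key of Subject and Time on the built-in Theoph data and mimics the documented behavior: the across column is dropped, the remaining key columns form by, and the other columns are aggregated.

```r
# Base R approximation of aggregating across one key column (no metrumrg needed).
x <- as.data.frame(Theoph)       # strip the groupedData classes
key <- c("Subject", "Time")      # assumed key
across <- "Time"                 # key column to aggregate across

# Default grouping: the key columns that are not in 'across'.
by <- x[, setdiff(key, across), drop = FALSE]

# Columns to aggregate: everything except 'across' and the grouping columns.
vals <- x[, setdiff(names(x), c(across, names(by))), drop = FALSE]

res <- aggregate(vals, by = by, FUN = mean)   # one row per Subject
```

Here names(by) plays the role of the result's key, so res has no duplicate Subject values.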

The function naKeys detects rows for which one or more key fields are NA.

The function dupKeys detects all rows for which there is another row (earlier or later) with an identical key. Consequently it never flags exactly one row, as duplicated can: it flags the duplicates as well as the rows of which they are duplicates. It is recommended to test for NAs before testing for duplicates.
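The two checks described above can be imitated in base R. This is a sketch on a made-up data.frame, not the code behind naKeys or dupKeys; the column names and values are hypothetical.

```r
# Hypothetical data with one NA key and one repeated key.
df <- data.frame(
  id   = c(1, 1, 2, 2, NA),
  time = c(0, 1, 0, 0, 0),
  y    = c(5.1, 4.8, 6.0, 6.2, 3.3)
)
key <- c("id", "time")

# Rows with NA in any key column (the condition naKeys is described as detecting).
na_rows <- !complete.cases(df[key])

# Rows whose key occurs more than once, first occurrences included
# (the condition dupKeys is described as detecting).
dup_rows <- duplicated(df[key]) | duplicated(df[key], fromLast = TRUE)

# For contrast, duplicated() alone flags only the later copies.
later_only <- duplicated(df[key])
```

In this data, dup_rows marks both rows 3 and 4, whereas later_only marks row 4 alone.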

The keyed method for unsorted detects rows that would move on sort.
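One plausible reading of "rows that would move on sort" is sketched below in base R: a row is flagged when its position after ordering by the key differs from its current position. This interpretation and the example data are assumptions; the package's method may differ in detail (for instance, in how ties are handled).

```r
# Hypothetical data in which the first two rows are out of key order.
df <- data.frame(id = c(2, 1, 3), time = c(0, 0, 0))
key <- c("id", "time")

# Row indices in key-sorted order.
o <- do.call(order, unname(as.list(df[key])))

# Position each original row would occupy after sorting.
pos <- order(o)

# A row "moves" if its sorted position differs from its current one.
moved <- pos != seq_len(nrow(df))
```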

Methods for merge and transform are key-friendly. The method for summary is key-centric.

uniKey creates a character vector (class uniKey) by pasting the key columns with \r. (Keys containing \r are unsupported.) The as.character method substitutes a space character for \r.
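The pasting step can be sketched in base R as follows. This only mirrors the description above (join key columns with \r, then substitute a space for display); it does not assign the uniKey class, and the data are hypothetical.

```r
# Hypothetical data keyed on id and time.
df <- data.frame(id = c(1, 1, 2), time = c(0, 1, 0))
key <- c("id", "time")

# Join the key columns row-wise with a carriage return as delimiter.
uk <- do.call(paste, c(unname(as.list(df[key])), sep = "\r"))

# Human-readable form: replace the delimiter with a space.
readable <- gsub("\r", " ", uk, fixed = TRUE)
```

Because \r is the delimiter, a key value that itself contains \r could make two distinct keys paste to the same string, which is why such values are unsupported.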

Value

Most functions and methods documented here return objects with class c('keyed','data.frame').

key<- and methods for summary and print are used for side effects.

uniKey.keyed returns a character vector as long as nrow(x), class uniKey.

naKeys, dupKeys, and unsorted return logical vectors as long as nrow(x).

Note

Values in key columns should not contain \r, which is used as a delimiter in dupKeys and uniKey.

Author(s)

Tim Bergsma

References

http://metrumrg.googlecode.com

Examples

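library(metrumrg)
# Key Theoph on Subject and Time, then sort by that key
a <- sort(as.keyed(Theoph, key = c('Subject', 'Time')))
summary(a)
# Aggregate across Time, leaving Subject as the key
aggregate(a, across = 'Time', FUN = mean)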

metrumrg documentation built on May 2, 2019, 5:55 p.m.