Break a data frame into components static on variants of a proposed key.
## S3 method for class 'digest'
as.best(x, ...)
## S3 method for class 'data.frame'
as.digest(x, key = character(0), strict = TRUE, ...)
## S3 method for class 'digest'
as.digest(x, ...)
## S3 method for class 'keyed'
as.digest(x, key = match.fun("key")(x), strict = TRUE, ...)
## S3 method for class 'nm'
as.digest(x, key = match.fun("key")(x), ...)
## S3 method for class 'nm'
as.keyed(x, key = match.fun("key")(x), ...)
## S3 method for class 'digest'
head(x, ...)
x | object of dispatch
key | a vector of column names in x
strict | passed to
... | passed to or from other functions
Well-constructed data tables typically admit a set of columns (a key), the interaction of which uniquely
distinguishes all rows. The columns may be ordered from most general to most specific, in which
case they may be thought of as an object hierarchy. The hierarchy accounts for structural
redundancy of identifier variables across rows. When exploring data, it may be useful to
remove such redundancy to focus on singular relationships within the data (e.g., with static).
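As a minimal sketch of what it means for a key to distinguish all rows, the base R snippet below tests a proposed key against a small data frame. The data and the helper `is_unique_on` are hypothetical, introduced only for illustration; they are not part of the package.

```r
# Hypothetical example data: one row per subject and time point.
dat <- data.frame(
  SUBJ = c(1, 1, 2, 2),
  TIME = c(0, 1, 0, 1),
  SEX  = c("F", "F", "M", "M"),
  DV   = c(0.0, 1.2, 0.0, 0.9)
)

# Hypothetical helper (not a package function): TRUE if no two rows
# share the same combination of values in the key columns.
is_unique_on <- function(x, key) {
  # An empty key can only distinguish a table of at most one row.
  if (length(key) == 0) return(nrow(x) <= 1)
  !anyDuplicated(x[, key, drop = FALSE])
}

full_key_ok <- is_unique_on(dat, c("SUBJ", "TIME"))  # full key
subj_only_ok <- is_unique_on(dat, "SUBJ")            # general level only
```

Here the pair SUBJ, TIME distinguishes every row, while SUBJ alone does not; SEX repeats across the rows of each subject, which is the kind of structural redundancy described above.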
digest recursively cleaves a data frame using appropriate subsets of a key.
The original data frame and any dynamic residuals are cleaved using increasingly longer
left subsets (empty; 1; 1,2; 1,2,3; etc.) of the proposed key. Effectively, this is a search
for columns that are static on (i.e., are attributes of) various objects and sub-objects.
The static results of cleaving, if any, are further explored (if possible) with increasingly
shorter right subsets (e.g., 1,2,3; 2,3; 3) to detect any columns that are super-keyed,
i.e., are still strictly attributes of some sub-object, without appeal to more general
hierarchical levels.
digest returns a list of keyed data frames, such that each original
non-key column appears in exactly one data frame, together with the smallest necessary
set of key columns, and all siblings (like-keyed non-key columns). If the proposed key
completely distinguishes all rows, the result consists only of static data frames.
Otherwise, the last data frame is dynamic. For columns that are constant in the data,
irrespective of the proposed key, the key of the sub-result has length zero.
The resulting key for a dynamic sub-result is the last key tried (possibly different from
the proposed key, as elements may be removed from consideration if they are themselves
static on some prior key). Elements are named with their keys, pasted together with dots.
If the key is character(0), the name is a single dot; the last element is named with two
dots if it is dynamic on the proposed key.
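The left-subset search described above can be illustrated with a simplified base R sketch. It is not the package's implementation: the helpers `static_cols` and `cleave_left` and the example data are hypothetical, and the sketch omits the right-subset (super-key) pass and the removal of key elements that are themselves static on a prior key.

```r
# Hypothetical helper: names of candidate columns that take a single
# value within every combination of the key columns.
static_cols <- function(x, key, candidates) {
  if (length(candidates) == 0) return(character(0))
  if (length(key)) {
    group <- interaction(x[, key, drop = FALSE], drop = TRUE)
  } else {
    group <- factor(rep(1, nrow(x)))  # empty key: all rows in one group
  }
  is_static <- vapply(candidates, function(col) {
    all(tapply(x[[col]], group, function(v) length(unique(v)) == 1))
  }, logical(1))
  candidates[is_static]
}

# Hypothetical sketch: try left subsets (empty; 1; 1,2; ...) of the key,
# assigning each non-key column to the shortest key it is static on.
cleave_left <- function(x, key) {
  remaining <- setdiff(names(x), key)
  out <- list()
  for (n in 0:length(key)) {
    k <- key[seq_len(n)]
    found <- static_cols(x, k, remaining)
    if (length(found)) {
      nm <- if (n == 0) "." else paste(k, collapse = ".")
      out[[nm]] <- unique(x[, c(k, found), drop = FALSE])
      remaining <- setdiff(remaining, found)
    }
  }
  # Anything left over is dynamic on the full proposed key.
  if (length(remaining)) out[[".."]] <- x[, c(key, remaining), drop = FALSE]
  out
}

dat <- data.frame(
  STUDY = "A",
  SUBJ  = c(1, 1, 2, 2),
  TIME  = c(0, 1, 0, 1),
  SEX   = c("F", "F", "M", "M"),
  DV    = c(0.0, 1.2, 0.0, 0.9)
)
res <- cleave_left(dat, c("SUBJ", "TIME"))
```

In this example STUDY is constant and lands in the element named with a single dot, SEX is an attribute of SUBJ, and DV is static only on the full key, so no dynamic element is produced.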
as.digest and as.best.digest return an object of class digest:
a list of keyed data frames, with names suggesting their keys ('.' for character(0), '..' for a dynamic data frame).
digest is an alias for the generic as.digest.
Tim Bergsma
http://metrumrg.googlecode.com
as.keyed
static
index