blk.singleValue | R Documentation |
Extract a single value for each block in a block-format data set.
blk.singleValue(x, id, ind = NULL, select = c("first", "last"), fill = NA)
blk.repeatValue(x, id, id2=id, ind = NULL, select = c("first", "last"), fill = NA)
x |
A vector in |
id |
A valid |
id2 |
A valid |
ind |
A vector of logicals used to filter the values of |
select |
When more than one value exists, which one to select. See details. |
fill |
A value to use when no other value is appropriate. See details. |
These functions extract a single value per block in a block-format
data set. If the values of x in each block are not unique, then a
specific value needs to be determined. The indicator vector ind can be
used to select the rows that contain values of interest. If more than
one value remains, then select is used to choose either the first or
the last (according to the ordering of x). If, for a given block,
there is no value at all, either because none of the rows matched the
ind criteria or because the block is of size zero (see block-format),
then the value of fill is used for that block.
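The interplay of ind, select and fill described above can be illustrated with a small base R sketch. The helper single_value_sketch is hypothetical and not part of the package; it only approximates the documented semantics and assumes id is (or can be coerced to) a factor.

```r
# Hypothetical helper (NOT the package's implementation): approximates the
# documented behaviour of ind, select and fill using base R only.
single_value_sketch <- function(x, id, ind = NULL,
                                select = c("first", "last"), fill = NA) {
  select <- match.arg(select)
  id <- as.factor(id)
  if (is.null(ind)) ind <- rep(TRUE, length(x))
  keep <- !is.na(ind) & ind
  # Start with 'fill' for every block, then overwrite where a value exists
  out <- rep(fill, nlevels(id))
  names(out) <- levels(id)
  for (lev in levels(id)) {
    pos <- which(keep & id == lev)   # rows of this block that pass the filter
    if (length(pos) > 0) {
      out[[lev]] <- if (select == "first") x[pos[1]] else x[pos[length(pos)]]
    }
  }
  out
}

id  <- factor(c("a", "a", "a", "b", "b", "c"), levels = c("a", "b", "c", "d"))
x   <- c(10, 20, 30, 40, 50, 60)
ind <- c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
res <- single_value_sketch(x, id, ind, select = "last")
res
```

In this sketch, block "c" receives fill because none of its rows match ind, and block "d" receives fill because it has size zero.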
Both blk.singleValue and blk.repeatValue determine a unique value for
each block. The difference between them is how many times each value
is repeated in the result. The vector returned by blk.singleValue has
length equal to the number of blocks in id, i.e. each value appears
exactly once. For blk.repeatValue, each value is repeated the
appropriate number of times so that the result is a vector in block
format with respect to id2. Thus, blk.repeatValue can effectively be
used to perform a simple “left outer join” (or “merge”) operation on a
single variable (see example).
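The difference in result length, and the join-like behaviour, can be sketched in base R with match(). This is an analogue for illustration only, under the assumption that the first value per block is wanted and that fill is NA; it does not call the package functions.

```r
# Base R analogue (an illustration, not the package's code)
id  <- factor(c(1, 1, 2, 2, 2, 3))
x   <- c("a", "b", "c", "d", "e", "f")
id2 <- factor(c(1, 1, 1, 2, 3, 3, 4))   # target IDs; level "4" is absent from id

# One value per block of id: match() returns the first position of each level
single <- x[match(levels(id), as.character(id))]

# Repeat each block's value along id2, like a left outer join on the ID
repeated <- single[match(as.character(id2), levels(id))]

length(single)    # number of blocks in id
length(repeated)  # same length as id2
repeated          # the row for ID "4" is NA because it has no match in id
```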
Neither levels(id) nor levels(id2) needs to be a subset of the other.
For levels of id2 (resp. id) that are not levels of id (resp. id2),
the corresponding blocks are assumed to be of size zero in id
(resp. id2).
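One way to picture the "size zero" convention is to tabulate both ID vectors over the union of their levels. The following base R sketch uses made-up IDs and is only an illustration of the idea, not of how the package handles it internally.

```r
# Illustration with made-up IDs: neither set of levels contains the other
id  <- factor(c("A", "A", "B", "C"))
id2 <- factor(c("B", "B", "C", "D", "D"))

all_levels  <- union(levels(id), levels(id2))
size_in_id  <- table(factor(id,  levels = all_levels))
size_in_id2 <- table(factor(id2, levels = all_levels))

# "D" has size zero in id; "A" has size zero in id2
rbind(id = size_in_id, id2 = size_in_id2)
```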
It is also possible to use blk.repeatValue in a different way, without
specifying id. In that case, x must have length equal to
nlevels(id2), i.e. it must contain a unique value for each possible
value of id2, and the correspondence of values to IDs is taken from
their respective ordering. Then, each value in x is repeated the
number of times that the corresponding ID appears in id2. This is
equivalent to blk.repeatValue(x, id=asID(levels(id2)), id2, ...).
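When x holds exactly one value per level of id2, the repetition described above amounts to indexing x by the integer codes of the factor. A minimal base R sketch of that idea, with hypothetical data and without using the package:

```r
# Base R analogue: one value per level of id2, matched by level order
id2 <- factor(c("s1", "s1", "s3", "s3", "s3"), levels = c("s1", "s2", "s3"))
x   <- c("A", "B", "C")

stopifnot(length(x) == nlevels(id2))   # the documented precondition

# Each value is repeated as often as its ID appears in id2;
# "B" never appears because level "s2" has no rows in id2
repeated <- x[as.integer(id2)]
repeated
```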
blk.singleValue returns a vector containing one value for each level
of id.
blk.repeatValue returns a vector in block-format with respect to id2.
Benjamin Rich <mail@benjaminrich.net>
block-format
tapply
merge
duplicated
# EXAMPLE 1
require(nlme)
data(Phenobarb)
dat <- Phenobarb[1:56,] # First 4 subjects
dat$id <- asID(dat$Subject)
attach(dat)
# A single row per subject
data.frame(id=levels(id),
           Wt=blk.singleValue(Wt, id),
           Apgar=blk.singleValue(Apgar, id),
           final.dose=blk.singleValue(dose, id, ind=(!is.na(dose)), select="last"))
# Repeat a single value on each row for each subject
cbind(dat, data.frame(
    first.dose=blk.repeatValue(dose, id, ind=(!is.na(dose))),
    final.dose=blk.repeatValue(dose, id, ind=(!is.na(dose)), select="last")
))
detach(dat)
### Merging a time-fixed covariate (simple left outer join)
### -------------------------------------------------------
# Suppose subjects 1 and 2 are Male, and Subject 4 is Female, but the
# gender of subject 3 is not specified.
gender <- data.frame(
    id=factor(c(1, 2, 4), levels=levels(dat$id)), # Note: keeping the same factor levels helps
    gender=c("Male", "Male", "Female"))
gender
# Now, 'merge' the gender with the rest of the data.
# Since subject 3 is absent, it gets the value of 'fill', i.e. NA.
dat$gender <- blk.repeatValue(gender$gender, gender$id, dat$id)
dat
# Still returns 4 values:
blk.singleValue(gender$gender, gender$id)
### The other way of using blk.repeatValue (without specifying id)
### --------------------------------------------------------------
letter <- LETTERS[1:nlevels(dat$id)] # Exactly one value per id
cbind(dat, letter=blk.repeatValue(letter, id2=dat$id))
cbind(gender, letter=blk.repeatValue(letter, id2=gender$id))
# EXAMPLE 2
id <- gl(4, 4)
x <- LETTERS[1:16]
y <- Sys.time() + 1:16
data.frame(
    id = levels(id),
    first.x = blk.singleValue(x, id),
    first.y = blk.singleValue(y, id))
data.frame(
    id = id,
    x = x,
    first.x = blk.repeatValue(x, id),
    y = y,
    first.y = blk.repeatValue(y, id))
target.id <- gl(4, 6)
data.frame(
    id = target.id,
    first.x = blk.repeatValue(x, id, target.id),
    first.y = blk.repeatValue(y, id, target.id))