Aggregate datasets with constraints on missing values
dat |
A data frame containing the data to be aggregated. |
subunits |
A data frame with subunit information. See ‘Details’. |
units |
A data frame with unit information. See ‘Details’. |
aggregatemissings |
Optional: A symmetrical n x n matrix with information on how missing values should be aggregated. If no matrix is given, the default will be used. See 'Examples'. |
rename |
Logical indicating whether units with only one subunit should be renamed to their unit name. Default is |
recodedData |
Logical indicating whether colnames in |
suppressErr |
Logical indicating whether aggregated cells with |
recodeErr |
Character vector of length 1 indicating to which |
verbose |
Logical. If |
aggregateData aggregates units in data frames with special consideration of missing values. The aggregation of missing values is specified in the argument aggregatemissings. The rownames and colnames of this n x n matrix correspond to the missing codes in the data (see collapseMissings for supported missing values). Additionally, the values vc (for valid code) and err (for error) are used. If aggregatemissings is a data frame, it will be coerced to a matrix with the first column of the data frame being transformed into the rownames of the matrix. A warning will be given if the matrix is not symmetrical.
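The coercion described above can be pictured with a small sketch. It is not the package's internal code: the three-code table amDf and the names amDf and amMat are hypothetical, and the symmetry test is one plausible way to perform the check, assuming row and column codes are listed in the same order.

```r
# Hypothetical miniature of an aggregatemissings table stored as a data frame;
# the first column holds the row labels.
amDf <- data.frame(
  code = c("vc", "mvi", "err"),
  vc   = c("vc", "mvi", "err"),
  mvi  = c("mvi", "mvi", "err"),
  err  = c("err", "err", "err"),
  stringsAsFactors = FALSE
)

# Turn the first column into rownames and keep the rest as a character matrix.
amMat <- as.matrix(amDf[, -1])
rownames(amMat) <- amDf[[1]]

# Assumed symmetry check: compare the cell values with their transpose.
# In a symmetric table the order of two subunits does not change the result.
symmetric <- identical(unname(amMat), t(unname(amMat)))
if (!symmetric) message("aggregatemissings is not symmetrical")

print(amMat)
```

Comparing the values without their dimnames keeps the check independent of how rows and columns are labelled.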
aggregateData
combines the subunits one by one, i.e. it aggregates the first two subunits of a unit, then adds the third subunit to the new aggregated variable and continues in this manner until all subunits are aggregated. In every step during the process a value of the first variable (e.g., the aggregated variable from the previous step) is matched with the rownames of aggregatemissings
and the corresponding value of the second variable (e.g., the next subitem to be aggregated) is matched with the colnames of aggregatemissings
. The new value of the aggregated variable will therefore be the value in aggregatemissings[firstVar, secondVar]
.If the value in the final aggregated variable is vc
, either the mean or the sum of subunits will be calculated. The rule given in units$unitAggregateRule
determines which one will be chosen, with SUM
being the default if column units$unitAggregateRule
is empty.
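The stepwise lookup described above can be sketched for a single case as follows. This is an illustration under assumptions, not the implementation of aggregateData: the 3 x 3 table is a hypothetical excerpt, and status and aggregateOne are helper names invented for this sketch. In this excerpt the final status is vc only when every subunit is valid; how the actual function computes the sum or mean when some subunits are missing is not covered here.

```r
# Hypothetical 3 x 3 excerpt of an aggregation table.
codes <- c("vc", "mvi", "err")
am <- matrix(c("vc",  "mvi", "err",
               "mvi", "mvi", "err",
               "err", "err", "err"),
             nrow = 3, byrow = TRUE, dimnames = list(codes, codes))

# Map a raw value to its status: known codes stay, anything else counts as "vc".
status <- function(x) if (x %in% rownames(am)) x else "vc"

# Combine the subunit values of one case, one subunit at a time.
aggregateOne <- function(values, rule = "SUM") {
  # Each step looks up am[firstVar, secondVar], as described in the text.
  agg <- Reduce(function(first, second) am[first, second],
                vapply(values, status, character(1)))
  if (agg != "vc") return(agg)
  nums <- as.numeric(values)
  if (rule == "MEAN") mean(nums) else sum(nums)
}

print(aggregateOne(c("1", "0", "1")))           # all subunits valid
print(aggregateOne(c("1", "mvi", "0")))         # one subunit missing
print(aggregateOne(c("1", "0"), rule = "MEAN")) # mean instead of sum
```

Because the lookup table is symmetric, folding the subunits from left to right gives the same status as any other order in this excerpt.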
The user can specify combinations of missing values that cannot occur simultaneously in one unit by setting the respective cell in aggregatemissings to err. For example, it is unlikely that one subunit is not administered (missing by design, mbd) and another subunit of the same unit was intentionally left blank by the person working on the test booklet (missing by intention, mbi). By default, this combination of missing values therefore produces an error (err) in the aggregated variable. If the aggregation produces err at any point, a warning is given. Values of err can be recoded to a different value by specifying the arguments suppressErr and recodeErr.
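As a rough picture of the recoding step, the sketch below replaces err in an already aggregated variable. The vector aggregated is made up, and the way suppressErr and recodeErr interact is an assumption based on the description above and on the example call further down; the sketch reports the count with message() where aggregateData gives a warning.

```r
# Hypothetical aggregated variable after all subunits have been combined.
aggregated <- c("2", "err", "mvi", "err")

# Stand-ins for the arguments of the same name; their exact semantics in
# aggregateData are assumed here, not confirmed.
suppressErr <- TRUE
recodeErr   <- "mci"

# Report how many cells ended up as err.
nErr <- sum(aggregated == "err")
if (nErr > 0) message(nErr, " aggregated cell(s) contain err")

# Replace err with the requested code only if suppression is switched on.
recoded <- aggregated
if (suppressErr) recoded[recoded == "err"] <- recodeErr
print(recoded)
```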
Examples of data frames subunits and units can be found via data(inputList).
A data frame with aggregated units and, if rename = TRUE, renamed subunits.
Missings are only correctly aggregated if their values correspond to the values in aggregatemissings. aggregateData does not check for value types or whether codes are valid. Use of checkData and recodeData before using aggregateData is therefore strongly recommended.
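A quick manual check of this precondition might look like the sketch below. It does not replace checkData or recodeData: the data frame dat, its column names, and the value "missing" are invented for illustration, and the code list is taken from the dimnames of the matrix in the example that follows.

```r
# Codes that appear as rownames/colnames of the aggregation matrix.
codes <- c("vc", "mvi", "mnr", "mci", "mbd", "mir", "mbi", "err")

# Hypothetical recoded data with one value ("missing") that is not a known code.
dat <- data.frame(I1a = c("1", "0", "mvi"),
                  I1b = c("1", "mbi", "missing"),
                  stringsAsFactors = FALSE)

# Collect all distinct values and separate numbers from character codes.
vals <- unique(unlist(dat, use.names = FALSE))
isNumber <- !is.na(suppressWarnings(as.numeric(vals)))

# Any non-numeric value that is not a known code would be aggregated incorrectly.
unknown <- setdiff(vals[!isNumber], codes)
print(unknown)
```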
Nicole Haag, Anna Lenski
data(inputDat)
data(inputList)
dat1 <- inputDat[[1]] # get first dataset from inputDat

# recode data
datRec <- recodeData(dat1, inputList$values, inputList$subunits)

# define matrix for missing aggregation (note: this is the default matrix)
am <- matrix(c(
  "vc" , "mvi", "vc" , "mci", "err", "vc" , "vc" , "err",
  "mvi", "mvi", "err", "mci", "err", "err", "err", "err",
  "vc" , "err", "mnr", "mci", "err", "mir", "mnr", "err",
  "mci", "mci", "mci", "mci", "err", "mci", "mci", "err",
  "err", "err", "err", "err", "mbd", "err", "err", "err",
  "vc" , "err", "mir", "mci", "err", "mir", "mir", "err",
  "vc" , "err", "mnr", "mci", "err", "mir", "mbi", "err",
  "err", "err", "err", "err", "err", "err", "err", "err" ),
  nrow = 8, ncol = 8, byrow = TRUE)

# row and column names are the missing codes plus vc and err
dimnames(am) <- list(
  c("vc" ,"mvi", "mnr", "mci", "mbd", "mir", "mbi", "err"),
  c("vc" ,"mvi", "mnr", "mci", "mbd", "mir", "mbi", "err"))
print(am)

# aggregate subunits to units, recoding err to mci
datAggr <- aggregateData(datRec, inputList$subunits, inputList$units,
  aggregatemissings = am, rename = TRUE, recodedData = TRUE,
  suppressErr = TRUE, recodeErr = "mci", verbose = TRUE)