View source: R/nroDestratify.R
nroDestratify | R Documentation |
Removes differences in value distribution within subsets of data points.
nroDestratify(data, labels)
data |
A matrix or a data frame with M rows. |
labels |
A vector of M subset labels. |
Only non-binary numerical columns are processed, the rest will be excluded from the results.
The de-stratification algorithm is based on ranked data: the distribution of each subset will be mapped to the pooled distribution over all subsets by matching subset-specific ranking with ranking of all values.
A matrix of de-stratified values. The output also includes the attribute 'incomplete' that lists those columns where (some of) the values were set to missing due to processing failures.
# Import data.
fname <- system.file("extdata", "finndiane.txt", package = "Numero")
dataset <- read.delim(file = fname)
# Remove sex differences for creatinine.
creat <- nroDestratify(dataset$CREAT, dataset$MALE)
# Compare creatinine distributions.
men <- which(dataset$MALE == 1)
women <- which(dataset$MALE == 0)
print(summary(dataset[men,"CREAT"]))
print(summary(dataset[women,"CREAT"]))
print(summary(creat[men]))
print(summary(creat[women]))
# Remove sex differences (produces warnings for binary traits).
ds <- nroDestratify(dataset, dataset$MALE)
# Compare HDL2C distributions.
print(summary(dataset[men,"HDL2C"]))
print(summary(dataset[women,"HDL2C"]))
print(summary(ds[men,"HDL2C"]))
print(summary(ds[women,"HDL2C"]))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.