View source: R/ImputationTests.R
ImputationTests | R Documentation |
'ImputationTests' calculates various measures and applies goodness-of-fit statistical tests to check the quality of the imputed fuzzy values.
ImputationTests(
  trueData,
  imputedData,
  imputedMask,
  trapezoidal = TRUE,
  cutsNumber = 100,
  K = 50,
  ...
)
trueData | Name of the input matrix (or data frame, or list) with the true values of the variables. |
imputedData | Name of the input matrix (or data frame) with the imputed values. |
imputedMask | Matrix (or data frame) with logical values where TRUE indicates the cells with the imputed values. |
trapezoidal | Logical value depending on the type of fuzzy values (triangular or trapezoidal ones) in the dataset. |
cutsNumber | Number of cuts for the epistemic bootstrap tests. |
K | Value of |
... | Additional parameters passed to other functions. |
The procedure uses other functions embedded in this package to check the quality of the imputed fuzzy values
by comparing them with the original ones.
This procedure calculates the number of non-FNs for each variable and the error matrix (using ErrorMatrix), computes various statistical measures (with StatisticalMeasures), applies epistemic goodness-of-fit tests (using ApplyStatisticalTests), and evaluates the fuzzy measures (with CalculateFuzzyMeasures).
Therefore, this function can be applied directly as a one-click benchmark tool.
To properly distinguish the real values from their imputed counterparts, the additional matrix imputedMask should be provided.
In this matrix, the logical value TRUE marks the cells with the imputed values.
Otherwise, FALSE should be used.
All of the input datasets can be given as matrices or data frames.
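As a minimal sketch of what such a mask looks like, the base R snippet below builds one with is.na() from a small toy matrix. The values and the object names (trueToy, toyNA, toyMask) are made up for illustration and are not part of the package; the row/column layout is only an assumption for the example.

```r
# Toy numeric matrix; values are invented purely for illustration
trueToy <- matrix(c( 0.0,  0.2, 0.5, 0.9,
                     1.0,  1.1, 1.4, 1.6,
                    -0.5, -0.1, 0.3, 0.4,
                     2.0,  2.5, 2.6, 3.0),
                  ncol = 4, byrow = TRUE)

# Copy with two cells set to NA, standing in for data with missing values
toyNA <- trueToy
toyNA[2, 3] <- NA
toyNA[4, 1] <- NA

# The mask: TRUE where a value was missing (and would be imputed), FALSE otherwise
toyMask <- is.na(toyNA)

# Sanity checks: the mask is logical and has the same shape as the data
stopifnot(is.logical(toyMask),
          identical(dim(toyMask), dim(trueToy)))

# Number and positions of the cells flagged as imputed
sum(toyMask)
which(toyMask, arr.ind = TRUE)
```

Building the mask from the matrix that still contains the NAs (before imputation) keeps it aligned cell by cell with both the true and the imputed data.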
To get an overall comparison of the methods, summary(object, ...) can be used on the output object of this function.
The values diff reported there are equal to the differences between the p-values of the respective tests for the true and imputed parts.
The output is an S3 object of the class impTest, given as a list with the following components:
trueValues - the true input values (the same as trueData),
mask - the masked (NA) values (the same as imputedMask),
nonFNNumbers - the vector with the numbers of non-FN samples for each variable (with the overall mean),
errorMatrix - the output from the function ErrorMatrix,
statisticalMeasures - the output from the function StatisticalMeasures,
statisticalTests - the output from the function ApplyStatisticalTests,
fuzzyMeasures - the output from the function CalculateFuzzyMeasures.
MethodsComparison for the imputation benchmark for all methods, summary.impTest.
# seed PRNG
set.seed(1234)
# load the necessary library
library(FuzzySimRes)
# generate sample of trapezoidal fuzzy numbers with FuzzySimRes library
list1<-SimulateSample(20,originalPD="rnorm",parOriginalPD=list(mean=0,sd=1),
incrCorePD="rexp", parIncrCorePD=list(rate=2),
suppLeftPD="runif",parSuppLeftPD=list(min=0,max=0.6),
suppRightPD="runif", parSuppRightPD=list(min=0,max=0.6),
type="trapezoidal")
# convert fuzzy data into a matrix
matrix1 <- FuzzyNumbersToMatrix(list1$value)
# check starting values
head(matrix1)
# add some NAs to the matrix
matrix1NA <- IntroducingNA(matrix1,percentage = 0.1)
head(matrix1NA)
# impute missing values
matrix1DImp <- ImputationDimp(matrix1NA)
# find cells with NAs
matrix1Mask <- is.na(matrix1NA)
# check the quality of the imputed values
ImputationTests(matrix1,matrix1DImp,matrix1Mask,trapezoidal=TRUE)
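The call above prints the result directly. As a hedged sketch of how the returned object could be stored and inspected instead, the snippet below repeats the same setup and then accesses the components named in the description of the output. It assumes the documented functions are provided by the FuzzyImputationTest package (the package name is not stated on this page); the name testResult is introduced here for illustration only.

```r
# Assumption: the documented functions come from the FuzzyImputationTest package
library(FuzzyImputationTest)
library(FuzzySimRes)

set.seed(1234)

# same setup as in the example above
list1 <- SimulateSample(20, originalPD = "rnorm", parOriginalPD = list(mean = 0, sd = 1),
                        incrCorePD = "rexp", parIncrCorePD = list(rate = 2),
                        suppLeftPD = "runif", parSuppLeftPD = list(min = 0, max = 0.6),
                        suppRightPD = "runif", parSuppRightPD = list(min = 0, max = 0.6),
                        type = "trapezoidal")
matrix1 <- FuzzyNumbersToMatrix(list1$value)
matrix1NA <- IntroducingNA(matrix1, percentage = 0.1)
matrix1DImp <- ImputationDimp(matrix1NA)
matrix1Mask <- is.na(matrix1NA)

# keep the result instead of only printing it
testResult <- ImputationTests(matrix1, matrix1DImp, matrix1Mask, trapezoidal = TRUE)

# the class and the names of the list components
class(testResult)
names(testResult)

# individual components can be accessed with $
testResult$nonFNNumbers
testResult$errorMatrix

# overall comparison, as described for summary(object, ...)
summary(testResult)
```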