NARemoveMultiBlock: NARemoveMultiBlock

View source: R/NARemoveMultiBlock.R

NARemoveMultiBlock R Documentation

NARemoveMultiBlock

Description

Removes or imputes the NA values in a MultiBlock object.

Usage

NARemoveMultiBlock(
  MB = MB,
  blocks = NULL,
  minfrac = 0.5,
  method = c("none", "zero", "median", "discard", "fixed.value", "fixed.value.all",
    "fixed.noise", "random.noise", "QRILC")[8],
  constant = 0,
  factor.NA = 0.5,
  sd.noise = 0.3,
  seed.number = NULL,
  tune.sigma = 1,
  showWarning = TRUE
)

Arguments

MB

The MultiBlock object.

blocks

The blocks to which the NA imputation is applied. It can be a vector of integers or a vector of block names (Optional).

minfrac

Minimum fraction of samples that must hold a finite, non-NA value for a variable to be kept. A variable that does not reach this fraction is discarded. The default value is 0.5.
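The filtering rule described for minfrac can be sketched on a plain matrix in base R. This is only an illustration under stated assumptions: keep_by_minfrac is a hypothetical helper, not a function of the package, and it assumes that a variable whose fraction equals minfrac exactly is kept.

```r
# Hypothetical helper (not part of the package): keep the columns whose
# fraction of finite entries (neither NA nor infinite) reaches `minfrac`.
keep_by_minfrac <- function(x, minfrac = 0.5) {
  frac_ok <- colMeans(is.finite(x))  # is.finite() is FALSE for NA, NaN, Inf
  x[, frac_ok >= minfrac, drop = FALSE]
}

# Columns: (1, NA, 3, 4), (NA, NA, NA, 8), (1, 2, Inf, 4)
x_minfrac <- matrix(c(1, NA, 3, 4,
                      NA, NA, NA, 8,
                      1, 2, Inf, 4), nrow = 4)
filtered <- keep_by_minfrac(x_minfrac, minfrac = 0.5)
```

In this sketch the second column has only one usable value out of four, so it is the only one dropped.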

method

The method used to impute the NA values. 'none' applies no transformation other than discarding variables according to minfrac. 'zero' replaces all NA values with 0. 'median' replaces the NA values with the median of the column. 'discard' deletes the columns containing NA values. 'fixed.value' replaces the NA values with a constant given by the user. 'fixed.value.all' adds a constant to all the elements (NA and non-NA) in the multi-block. 'fixed.noise' and 'random.noise' replace the NA values with noise. 'QRILC' imputes values using the Quantile Regression Imputation of Left-Censored data method. The default is 'random.noise'.
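To make the difference between the simpler methods concrete, the following base R sketch applies the 'zero', 'median' and 'discard' rules, as described above, to a plain matrix. It is not the package's implementation, and the object names are chosen only for this illustration.

```r
# Columns: (1, NA, 3), (4, 5, 6), (NA, 8, 10)
x <- matrix(c(1, NA, 3,
              4, 5, 6,
              NA, 8, 10), nrow = 3)

# 'zero': every NA becomes 0
x_zero <- x
x_zero[is.na(x_zero)] <- 0

# 'median': every NA becomes the median of the non-NA values in its column
x_median <- apply(x, 2, function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
})

# 'discard': columns containing at least one NA are removed
x_discard <- x[, colSums(is.na(x)) == 0, drop = FALSE]
```

Only the second column has no NA, so it is the only one that survives the 'discard' rule, whereas the other two rules keep all three columns.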

constant

For the 'fixed.value' method, the value by which the NAs are replaced. For the 'fixed.value.all', the value added to each element in the multi-block. The default number is 0.

factor.NA

For the 'fixed.noise' and the 'random.noise' methods. For the 'fixed.noise' method, this is the factor by which the minimal non-NA value of the column is multiplied. For the 'random.noise' method, the minimal value of each column is multiplied by this factor to determine the mean of the normal distribution. The default value is 0.5.
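As a rough illustration of the 'fixed.noise' rule, the sketch below replaces each NA with the column minimum multiplied by a factor. It assumes the replacement value is exactly that product and works on a plain matrix; factor_na is a local variable standing in for factor.NA.

```r
# Columns: (2, NA, 6) and (10, 4, NA)
x_noise <- matrix(c(2, NA, 6,
                    10, 4, NA), nrow = 3)
factor_na <- 0.5  # stands in for the factor.NA argument

# Each NA becomes factor_na times the smallest non-NA value of its column
x_fixed <- apply(x_noise, 2, function(col) {
  col[is.na(col)] <- factor_na * min(col, na.rm = TRUE)
  col
})
```

With these numbers the NA in the first column becomes 0.5 * 2 = 1 and the NA in the second column becomes 0.5 * 4 = 2.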

sd.noise

For the 'random.noise' method, this is the factor used to define the standard deviation (SD) of the normal distribution from which the random noise is drawn. The SD equals sd.noise multiplied by the mean of the random noise. The default value is 0.3.

seed.number

For the 'random.noise' method, this is the seed to create the random numbers (Optional).
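The way factor.NA, sd.noise and seed.number interact in the 'random.noise' method can be sketched in base R as follows. impute_random_noise is a hypothetical helper written for this illustration, not the package's code; it assumes the noise is drawn per column with rnorm() and takes the absolute value of the mean for the SD so that the sketch also works for negative minima.

```r
# Hypothetical helper (not part of the package): replace NAs with draws from
# N(mean = factor_na * column minimum, sd = sd_noise * |mean|).
impute_random_noise <- function(x, factor_na = 0.5, sd_noise = 0.3,
                                seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # makes the draws reproducible
  for (j in seq_len(ncol(x))) {
    missing <- is.na(x[, j])
    if (any(missing)) {
      mu <- factor_na * min(x[, j], na.rm = TRUE)
      x[missing, j] <- rnorm(sum(missing), mean = mu, sd = sd_noise * abs(mu))
    }
  }
  x
}

x_raw <- matrix(c(2, NA, 6,
                  10, 4, NA), nrow = 3)
imputed_a <- impute_random_noise(x_raw, seed = 1)
imputed_b <- impute_random_noise(x_raw, seed = 1)
```

Calling the helper twice with the same seed gives identical results, which is the purpose of a seed argument; without a seed, the imputed values change from call to call.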

tune.sigma

For the 'QRILC' method. A scalar used to tune the standard deviation (if the complete data distribution is not Gaussian). The default value is tune.sigma = 1, and it corresponds to the case where the complete data distribution is Gaussian.

showWarning

If TRUE, a warning is issued when any variable in the multi-block contains only NAs.

Value

The MultiBlock object after the NA values have been handled.

Examples

b1 <- matrix(rnorm(500), 10, 50)
b2 <- matrix(rnorm(800), 10, 80)
# Set three samples of the first three variables of b2 to NA:
b2[c(2, 3, 5), c(1, 2, 3)] <- NA
# Build the multi-block from both data blocks:
mb <- BuildMultiBlock(b1, b2)
# Replace every NA with 0:
mb <- NARemoveMultiBlock(mb, method = 'zero')

f-puig/R.ComDim documentation built on Feb. 20, 2024, 6:49 a.m.