InfRemoveMultiBlock: InfRemoveMultiBlock

InfRemoveMultiBlock    R Documentation

InfRemoveMultiBlock

Description

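Remove infinite values (Inf and -Inf) from a MultiBlock structure, by discarding variables with too few usable values and, optionally, imputing the remaining infinite values with noise.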

Usage

InfRemoveMultiBlock(
  MB = MB,
  blocks = NULL,
  minfrac = 0.5,
  method = c("none", "fixed.noise", "random.noise")[3],
  factor.NA = 0.5,
  sd.noise = 0.3,
  seed.number = NULL,
  showWarning = TRUE
)

Arguments

MB

The MultiBlock structure.

blocks

The blocks to which the Inf imputation is applied. It can be a vector of integers or a vector of block names. Optional.

minfrac

Minimum fraction of samples in which a variable must have a value that is neither NA nor infinite. A variable that does not reach this fraction is discarded. The default value is 0.5.
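
The minfrac filter described above can be sketched in base R. This is only an illustration of the documented behaviour, not the package's code: `keep_by_minfrac` is a hypothetical helper, it works on a plain matrix rather than a MultiBlock, and it assumes that a variable sitting exactly at the threshold is kept.

```r
# Hypothetical helper (not part of R.ComDim): keep the columns whose fraction
# of finite entries (neither NA, NaN, Inf nor -Inf) is at least minfrac.
keep_by_minfrac <- function(x, minfrac = 0.5) {
  frac_finite <- colMeans(is.finite(x))  # fraction of usable values per column
  # Assumption: ">=" keeps a column that exactly reaches the threshold.
  x[, frac_finite >= minfrac, drop = FALSE]
}

x <- cbind(a = c(1, 2, 3, 4),        # 4/4 finite
           b = c(Inf, Inf, Inf, 4),  # 1/4 finite
           c = c(1, NA, -Inf, 4))    # 2/4 finite
kept <- keep_by_minfrac(x, minfrac = 0.5)
colnames(kept)  # "a" "c"
```

With minfrac = 0.5, column b is dropped because only one of its four samples is finite.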

method

The method used to impute the Inf values. 'none' applies no transformation other than discarding variables according to minfrac. 'fixed.noise' and 'random.noise' replace the Inf values with noise estimated from the minimum and maximum finite (non-NA, non-Inf) values in each column. With 'fixed.noise', the replacement value is constant within a column; with 'random.noise', the replacement values are drawn from two normal distributions, one for Inf and one for -Inf.

factor.NA

Used in the 'fixed.noise' and 'random.noise' methods. For the 'fixed.noise' approach, -Inf is replaced by the minimum finite value of the column multiplied by this number, and Inf is replaced by the maximum finite value multiplied by the inverse of this number. For the 'random.noise' approach, these two values are used as the means of the normal distributions from which the replacements for -Inf and Inf, respectively, are drawn. The default value is 0.5.
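
As a worked illustration of the 'fixed.noise' rule, the sketch below applies it to a single numeric column in base R. `impute_fixed` is a hypothetical name introduced here, and the code is an interpretation of the description above rather than the package's own implementation.

```r
# Hypothetical helper (not part of R.ComDim): 'fixed.noise' rule on one column.
impute_fixed <- function(col, factor.NA = 0.5) {
  finite_vals <- col[is.finite(col)]      # ignores NA, NaN, Inf and -Inf
  is_neg <- is.infinite(col) & col < 0    # positions holding -Inf
  is_pos <- is.infinite(col) & col > 0    # positions holding Inf
  col[is_neg] <- min(finite_vals) * factor.NA  # minimum times factor.NA
  col[is_pos] <- max(finite_vals) / factor.NA  # maximum times 1 / factor.NA
  col
}

impute_fixed(c(2, 4, -Inf, Inf, 8), factor.NA = 0.5)
# 2 4 1 16 8
```

Here the finite range is 2 to 8, so -Inf becomes 2 * 0.5 = 1 and Inf becomes 8 / 0.5 = 16. Note that this scaling only pushes the replacements outside the observed range when the column's finite values are positive.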

sd.noise

For the 'random.noise' method, this is the factor used to define the standard deviation (SD) of the normal distribution of the random noise. The SD is equal to sd.noise multiplied by the mean of the random noise. The default value is 0.3.
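
The way factor.NA, sd.noise and seed.number interact in the 'random.noise' method can be sketched as follows. `impute_random` is a hypothetical base R helper operating on one column, not the package's code; taking the absolute value of the mean when computing the SD is an assumption made here so that the SD stays non-negative.

```r
# Hypothetical helper (not part of R.ComDim): 'random.noise' rule on one column.
impute_random <- function(col, factor.NA = 0.5, sd.noise = 0.3,
                          seed.number = NULL) {
  if (!is.null(seed.number)) set.seed(seed.number)  # reproducible draws
  finite_vals <- col[is.finite(col)]
  mean_neg <- min(finite_vals) * factor.NA  # mean of the noise replacing -Inf
  mean_pos <- max(finite_vals) / factor.NA  # mean of the noise replacing Inf
  is_neg <- is.infinite(col) & col < 0
  is_pos <- is.infinite(col) & col > 0
  # SD = sd.noise * mean; abs() is an assumption to keep the SD non-negative.
  col[is_neg] <- rnorm(sum(is_neg), mean = mean_neg,
                       sd = sd.noise * abs(mean_neg))
  col[is_pos] <- rnorm(sum(is_pos), mean = mean_pos,
                       sd = sd.noise * abs(mean_pos))
  col
}

col <- c(2, 4, -Inf, Inf, 8, Inf)
r1 <- impute_random(col, seed.number = 42)
r2 <- impute_random(col, seed.number = 42)
identical(r1, r2)  # TRUE: the same seed gives the same replacements
```

Unlike the fixed approach, the two Inf entries in this column receive different replacement values, both drawn around 8 / 0.5 = 16 with an SD of 0.3 * 16 = 4.8.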

seed.number

For the 'random.noise' method, this is the seed used to generate the random numbers. Optional.

showWarning

If TRUE, a warning is issued when any variable in the multi-block contains only NAs.

Value

The multi-block, with variables discarded according to minfrac and infinite values imputed according to method.

Examples

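# Two data blocks with the same 10 samples
b1 <- matrix(rnorm(500), 10, 50)
b2 <- matrix(rnorm(800), 10, 80)
# Introduce infinite values in the first three variables of b2
b2[c(2,3,5), c(1,2,3)] <- Inf
# Build multi-block by adding one data block at a time:
mb <- BuildMultiBlock(b1, newSamples = paste0('sample_',1:10))
mb <- BuildMultiBlock(b2, growingMB = mb)
# Impute the Inf values with one of the documented methods
mb <- InfRemoveMultiBlock(mb, method = 'fixed.noise')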

f-puig/R.ComDim documentation built on Feb. 20, 2024, 6:49 a.m.