set.missing: Cope with possible missing data in the ultrasound profile

View source: R/MPMcore.R

set.missing    R Documentation

Cope with possible missing data in the ultrasound profile

Description

This function imputes missing data in the ultrasound profile, creating a new profile with imputed missing values. If no missing values are found, it will simply send a message and return the input profile.

Usage

set.missing(
  v,
  ref = NULL,
  levels = NULL,
  con = 1:2,
  cat = 3:14,
  missing = -1,
  na = NA,
  asNumeric = TRUE,
  ...
)

Arguments

v

An ultrasound profile generated by new.profile, i.e., a list of two objects, containing a vector of length 14 (list$ultasound), corresponding to the input ultrasound profile (14 ultrasound variables or "features"), and a vector of missing value indices (list$missing).

ref

A data.frame representing the reference dataset. The ultrasound profile will be attached to the reference dataset before the imputation. This argument is required to impute missing features.

levels

A list of length 14, holding the levels of each ultrasound variable. Levels are needed for categorical variables (factors); for continuous variables, the corresponding entry should be set to the nominal value 0. The default levels object mpm.levels can be used.

con

Vector of the indices corresponding to continuous variables in the list$ultasound vector (default = 1:2).

cat

Vector of the indices corresponding to categorical variables in the list$ultasound vector (default = 3:14).

missing

Value used to mark missing data (default = -1).

na

Value used for "not available" data (default = NA). This value is substituted for the missing marker within the ultrasound vector before the imputation.

asNumeric

Logical value; if TRUE, every value in the ultrasound vector is converted to class "numeric". This argument is used only if ref = NULL and levels = NULL.

...

Currently ignored.
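
The roles of missing, na, levels, con, cat and ref can be illustrated with a minimal base R sketch. It is not the package's internal code: it uses a toy profile with 3 features instead of 14, and the names toy_levels, con_idx, cat_idx, row and stacked, as well as all reference values, are hypothetical. It only shows the marker being replaced by NA, the features being typed according to their levels, and the profile being attached to a reference data.frame.

```r
# Toy setup (hypothetical, 3 features instead of 14): one continuous
# feature followed by two categorical ones.
toy_levels <- list(0, c(0, 1), c(1, 2, 3))  # 0 marks the continuous slot
con_idx <- 1                                # plays the role of 'con'
cat_idx <- 2:3                              # plays the role of 'cat'
missing <- -1
na <- NA

x <- c(10.0, 1, -1)                # third feature carries the marker
miss_idx <- which(x == missing)    # positions flagged as missing
x[miss_idx] <- na                  # swap the marker for NA

# One-row data frame: numeric for continuous, factor for categorical
row <- as.data.frame(setNames(as.list(x), c("f1", "f2", "f3")))
for (j in con_idx) row[[j]] <- as.numeric(row[[j]])
for (j in cat_idx) row[[j]] <- factor(row[[j]], levels = toy_levels[[j]])

# Toy reference set (made-up values) with matching column types
ref <- data.frame(
  f1 = c(8.2, 12.5),
  f2 = factor(c(0, 1), levels = toy_levels[[2]]),
  f3 = factor(c(2, 3), levels = toy_levels[[3]])
)

# Attach the profile as the last row; an imputation method would then
# fill the NA in f3 using the information in the reference rows.
stacked <- rbind(ref, row)
str(stacked)
```

Declaring the factor levels explicitly keeps the attached row compatible with the reference columns even when the observed value is NA.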

Details

Automatic imputation is necessary to improve RFC-based (malignancy prediction) and RBM-based (metastatic risk evaluation) estimations. Imputation is currently forbidden for short axis and cortical thickness (i.e., the first two ultrasound features), since they have a critical role in the prediction, estimation and signature detection processes. Hence, their actual values must be entered for a reliable prediction. Although permitted, imputation is discouraged for the following three features: nodal core sign (i.e., hilum presence), perinodal hyperechogenic ring (i.e., the presence of inflammatory stroma), and cortical interruption (i.e., extracapsular spread). These features define a strongly metastatic profile with possible multiple metastases (i.e., the "MET" signature), and they can hardly be imputed from the other ultrasound variables.
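
A user-side check for the restriction on the first two features could look like the sketch below. The helper can_impute is hypothetical and not part of morphonode; it only tests whether the protected positions (assumed here to be 1:2, matching the default con) hold actual values. How set.missing itself reacts to a missing short axis or cortical thickness is not shown here.

```r
# Hypothetical helper (not a morphonode function): TRUE only if none of
# the protected positions is NA or carries the missing-data marker.
can_impute <- function(x, protected = 1:2, missing = -1) {
  p <- x[protected]
  !any(is.na(p) | p == missing)
}

# Feature vector from the Examples section: short axis and cortical
# thickness are present, so imputing the other features is allowed.
ok <- c(10.0, 6.3, 1, 0, 0, 0, -1, 1, 2, 2, 3, 1, -1, -1)

# Same vector with cortical thickness (position 2) marked as missing.
bad <- replace(ok, 2, -1)

print(can_impute(ok))
print(can_impute(bad))
```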

Value

An ultrasound profile with imputed missing values.

Author(s)

Fernando Palluzzi <fernando.palluzzi@gmail.com>

See Also

See set.rfcdata for random forest (RFC) data preparation and set.rbmdata for robust binomial model (RBM) data preparation.

Examples


# Create an ultrasound profile with missing values

u <- new.profile(c(10.0, 6.3, 1, 0, 0, 0, -1, 1, 2, 2, 3, 1, -1, -1))
print(u)

# Fix missing values with the default simulated dataset as reference
# (ultrasound features only: mpm.us attributes 2 to 15).
# Default levels are provided by the mpm.levels object.

v <- set.missing(u, ref = mpm.us[, 2:15], levels = mpm.levels)
print(v)


Morphonodepredictivemodel/morphonode documentation built on Feb. 15, 2023, 4:51 a.m.