DRPCA: Distributed Robust Principal Component Analysis (DRPCA) for...

View source: R/DRPCA.R

DRPCA R Documentation

Distributed Robust Principal Component Analysis (DRPCA) for Handling Missing Data

Description

This function performs DRPCA to handle missing data: it divides the dataset into D blocks, applies Robust Principal Component Analysis (RPCA) to each block, and then combines the results. It also calculates several evaluation metrics, including the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Relative Error (RE), and Generalized Cross-Validation (GCV), using different hierarchical clustering methods.
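
The divide, impute, and combine flow described above can be illustrated with a minimal sketch. It is not the package's implementation: impute_block() and blockwise_impute() are hypothetical helpers, and column-mean imputation stands in for RPCA purely to mark where the per-block method would be applied.

```r
# Hypothetical stand-in for a per-block imputer: fills each NA with its
# column mean. This is NOT RPCA; it only marks where RPCA would be applied.
impute_block <- function(block) {
  for (j in seq_len(ncol(block))) {
    miss <- is.na(block[, j])
    if (any(miss)) block[miss, j] <- mean(block[, j], na.rm = TRUE)
  }
  block
}

# Split the rows of X into D contiguous blocks, impute each block
# separately, and write the results back into one matrix.
blockwise_impute <- function(X, D) {
  block_id <- cut(seq_len(nrow(X)), breaks = D, labels = FALSE)
  out <- X
  for (d in seq_len(D)) {
    rows <- which(block_id == d)
    out[rows, ] <- impute_block(X[rows, , drop = FALSE])
  }
  out
}

set.seed(1)
X <- matrix(rnorm(40), nrow = 10)
X[c(2, 7), 1] <- NA          # one missing entry in each of the two blocks
X_hat <- blockwise_impute(X, D = 2)
```

Each block only sees its own rows, so the missing entry in row 2 is filled from rows 1 to 5 and the one in row 7 from rows 6 to 10. How DRPCA itself forms and combines the blocks is not specified here.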

Usage

DRPCA(data0, data.sample, data.copy, mr, km, D)

Arguments

data0

The original dataset containing the response variable and features.

data.sample

The dataset used for sampling, which may contain missing values.

data.copy

A copy of the original dataset, used for comparison or validation.

mr

Indices of the rows with missing values that need to be predicted.

km

The number of clusters for k-means clustering.

D

The number of blocks to divide the data into.

Value

A list containing:

XDRPCA

The imputed dataset.

RMSEDRPCA

The Root Mean Squared Error.

MAEDRPCA

The Mean Absolute Error.

REDRPCA

The Relative Error.

GCVDRPCA

The Generalized Cross-Validation (GCV) value for the DRPCA imputation.

timeDRPCA

The DRPCA algorithm execution time.
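
As a rough guide to reading the error components, the sketch below computes RMSE, MAE, and a relative error over a set of masked entries. The helper imputation_errors() and its mask argument are hypothetical, and the formulas are common textbook definitions assumed for illustration; the exact expressions used inside DRPCA, in particular for the relative error, are not stated on this page.

```r
# Error metrics over the entries flagged in `mask` (assumed definitions,
# not taken from the package source).
imputation_errors <- function(truth, imputed, mask) {
  err <- imputed[mask] - truth[mask]
  list(
    RMSE = sqrt(mean(err^2)),                           # root mean squared error
    MAE  = mean(abs(err)),                              # mean absolute error
    RE   = sqrt(sum(err^2)) / sqrt(sum(truth[mask]^2))  # assumed relative error
  )
}

truth   <- matrix(c(1, 2, 3, 4), nrow = 2)
imputed <- matrix(c(1, 2, 3, 6), nrow = 2)
mask    <- matrix(c(FALSE, FALSE, FALSE, TRUE), nrow = 2)
errs <- imputation_errors(truth, imputed, mask)
# errs$RMSE is 2, errs$MAE is 2, errs$RE is 0.5
```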

See Also

RPCA for the original RPCA function.

Examples

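library(DTSR)  # DRPCA() is provided by the DTSR package

# Create a sample dataset with missing values
set.seed(123)
n <- 100  # Number of rows
p <- 10   # Number of columns
D <- 2    # Number of blocks
data.sample <- matrix(rnorm(n * p), nrow = n)
# Linear indexing: the sampled indices fall in 1..90, so this sets
# p - 2 = 8 entries of the first column to NA
data.sample[sample(1:(n - 10), (p - 2))] <- NA
data.copy <- data.sample
data0 <- data.frame(data.sample, response = rnorm(n))
mr <- sample(1:n, 10)  # Sample rows for evaluation
km <- 3  # Number of clusters
result <- DRPCA(data0, data.sample, data.copy, mr, km, D)
# Print the results
print(result$XDRPCA)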

DTSR documentation built on April 3, 2025, 11:35 p.m.
