| DRPCA | R Documentation |
This function performs DRPCA (distributed RPCA) to impute missing data: it divides the dataset into D blocks, applies Robust Principal Component Analysis (RPCA) to each block, and then combines the results. It also calculates evaluation metrics, including the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Relative Error (RE), and Generalized Cross-Validation (GCV), using different hierarchical clustering methods.
DRPCA(data0, data.sample, data.copy, mr, km, D)
data0 |
The original dataset containing the response variable and features. |
data.sample |
The dataset used for sampling, which may contain missing values. |
data.copy |
A copy of the original dataset, used for comparison or validation. |
mr |
Indices of the rows with missing values that need to be predicted. |
km |
The number of clusters for k-means clustering. |
D |
The number of blocks to divide the data into. |
A list containing:
XDRPCA |
The imputed dataset. |
RMSEDRPCA |
The Root Mean Squared Error. |
MAEDRPCA |
The Mean Absolute Error. |
REDRPCA |
The Relative Error. |
GCVDRPCA |
The Generalized Cross-Validation (GCV) value for the DRPCA imputation. |
timeDRPCA |
The DRPCA algorithm execution time. |
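The error metrics listed above can be illustrated with a minimal sketch. The definitions below are common textbook forms evaluated on the missing entries only; they are assumptions for illustration, and the package may compute RMSEDRPCA, MAEDRPCA, and REDRPCA differently. The helper name `imputation_errors` is hypothetical and not part of the package. GCV is omitted because its exact form is not stated here.

```r
# Hypothetical helper (not part of the package): compare imputed values
# against the known truth on the entries that were missing.
# ASSUMPTION: RMSE, MAE, and RE use the standard definitions below.
imputation_errors <- function(truth, imputed, miss) {
  diff <- imputed[miss] - truth[miss]
  list(
    RMSE = sqrt(mean(diff^2)),                           # root mean squared error
    MAE  = mean(abs(diff)),                              # mean absolute error
    RE   = sqrt(sum(diff^2)) / sqrt(sum(truth[miss]^2))  # relative error
  )
}

# Tiny worked example: one missing entry, true value 4, imputed as 6.
truth   <- matrix(c(1, 2, 3, 4), nrow = 2)
imputed <- matrix(c(1, 2, 3, 6), nrow = 2)
miss    <- matrix(c(FALSE, FALSE, FALSE, TRUE), nrow = 2)
errs <- imputation_errors(truth, imputed, miss)
```

In a real run, `truth` would play the role of data.copy, `imputed` the role of XDRPCA, and `miss` would mark the entries that were set to NA in data.sample.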
RPCA for the original RPCA function.
# Create a sample dataset with missing values
set.seed(123)
n <- 100
p <- 10
D <- 2
data.sample <- matrix(rnorm(n * p), nrow = n)
# Randomly select 10 rows to have missing values
mr <- sort(sample(1:n, 10))
missing_cols <- sample(1:p, 2)
data.copy <- data.sample
data.sample[mr, missing_cols] <- NA
data0 <- data.frame(data.sample, response = rnorm(n))
km <- 3
result <- DRPCA(data0, data.sample, data.copy, mr, km, D)
print(result$XDRPCA)
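The divide, impute, and combine idea from the description can be sketched without the package. This is only an illustration under stated assumptions: rows are split into D contiguous blocks, and `impute_block` is a hypothetical stand-in that fills missing values with block-wise column means in place of the RPCA step. It does not reproduce how DRPCA partitions the data or imputes values internally.

```r
# Hypothetical stand-in for the per-block RPCA step:
# fill each column's missing values with that column's mean within the block.
impute_block <- function(block) {
  for (j in seq_len(ncol(block))) {
    miss <- is.na(block[, j])
    block[miss, j] <- mean(block[!miss, j])
  }
  block
}

set.seed(123)
n <- 100
p <- 10
D <- 2
X <- matrix(rnorm(n * p), nrow = n)
X[sample(1:n, 10), c(2, 5)] <- NA

# ASSUMPTION: blocks are contiguous groups of rows of (near) equal size.
block_id <- rep(seq_len(D), each = ceiling(n / D), length.out = n)

# Impute each block independently, then stack the blocks back together.
# Because the blocks are contiguous, rbind restores the original row order.
blocks <- lapply(split(seq_len(n), block_id), function(idx) {
  impute_block(X[idx, , drop = FALSE])
})
X_imputed <- do.call(rbind, blocks)
```

Observed entries are left unchanged, so `X_imputed` differs from `X` only where `X` was NA.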