| sample_na_loc | R Documentation |
Samples indices for NA injection into a matrix while maintaining row/column
missing value budgets and avoiding zero-variance columns.
sample_na_loc(
  obj,
  n_cols = NULL,
  n_rows = 2L,
  num_na = NULL,
  n_reps = 1L,
  rowmax = 0.9,
  colmax = 0.9,
  na_col_subset = NULL,
  max_attempts = 100
)
obj |
A numeric matrix with samples in rows and features in columns. |
n_cols |
Integer. The number of columns to receive injected NA values. |
n_rows |
Integer. The target number of NA values to inject into each selected column. |
num_na |
Integer. Total number of missing values to inject per
repetition. If supplied, |
n_reps |
Integer. Number of repetitions for random NA injection
(default 1L). |
rowmax, colmax |
Numbers between 0 and 1. NA injection cannot create rows/columns with a higher proportion of missing values than these thresholds. |
na_col_subset |
Optional integer or character vector restricting which
columns of obj are eligible for NA injection. |
max_attempts |
Integer. Maximum number of resampling attempts per
repetition before giving up due to row-budget exhaustion (default 100). |
The function uses a greedy stochastic search for valid NA locations. It
ensures that:
Total missingness per row and column does not exceed rowmax and colmax.
At least two distinct observed values are preserved in every column to ensure the column maintains non-zero variance.
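The two guarantees above can be checked after the fact for any candidate set of coordinates. The following is a minimal base R sketch, not the package's internal code: `check_na_locs` is a hypothetical helper introduced here for illustration, and it assumes the thresholds apply to the proportion of missing values after injection.

```r
# Hypothetical helper, not part of the package: checks the two documented
# guarantees for a candidate set of NA coordinates.
check_na_locs <- function(obj, locs, rowmax = 0.9, colmax = 0.9) {
  injected <- obj
  injected[locs] <- NA  # a two-column matrix indexes (row, col) pairs

  # Proportion of missing values per row and per column after injection
  row_ok <- all(rowMeans(is.na(injected)) <= rowmax)
  col_ok <- all(colMeans(is.na(injected)) <= colmax)

  # Each column must keep at least two distinct observed values
  n_distinct <- apply(injected, 2, function(x) length(unique(x[!is.na(x)])))
  var_ok <- all(n_distinct >= 2)

  row_ok && col_ok && var_ok
}

set.seed(1)
mat <- matrix(runif(100), nrow = 10)

# A few scattered locations stay well within both budgets
ok_locs <- cbind(row = c(1, 2, 3), col = c(1, 1, 2))
check_na_locs(mat, ok_locs)

# Blanking nine of ten entries in column 1 leaves a single observed value,
# so the column would have zero variance
bad_locs <- cbind(row = 1:9, col = rep(1, 9))
check_na_locs(mat, bad_locs)
```

A sampler built on this idea would discard a candidate that fails the check and draw again, which is presumably the role max_attempts plays in bounding the search.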
A list of length n_reps. Each element is a two-column integer
matrix (row, col) representing the coordinates of the sampled NA
locations.
mat <- matrix(runif(100), nrow = 10)
# Sample 5 `NA` across 5 columns (1 per column)
locs <- sample_na_loc(mat, n_cols = 5, n_rows = 1)
locs
# Inject the `NA` from the first repetition
mat[locs[[1]]] <- NA
mat
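When n_reps is greater than 1, each list element can be applied to its own copy of the data. The sketch below uses a hand-written list as a stand-in for the return value (the coordinates are made up for illustration) so that it runs without the package installed.

```r
set.seed(42)
mat <- matrix(runif(100), nrow = 10)

# Stand-in for the output of sample_na_loc(mat, ..., n_reps = 2): a list of
# two-column (row, col) integer matrices. These coordinates are made up.
locs <- list(
  cbind(row = c(1L, 4L), col = c(2L, 2L)),
  cbind(row = c(3L, 7L), col = c(5L, 9L))
)

# Inject each repetition into its own copy, leaving the original untouched
masked <- lapply(locs, function(loc) {
  out <- mat
  out[loc] <- NA
  out
})

# Count the NA values in each masked copy
na_counts <- vapply(masked, function(m) sum(is.na(m)), integer(1))
na_counts
```

Keeping the original matrix intact makes it possible to compare imputed values against the true ones at the same coordinates, for example with `mat[locs[[1]]]`.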