impute_missing | R Documentation |
Impute missing values from bootstrapped subsampling
impute_missing(E, data, nk)
E |
4D array of clusterings from |
data |
data matrix with samples as rows and genes/features as columns |
nk |
cluster size to extract data for (single value) |
The default output from consensus_cluster will almost always contain NA
entries because each replicate chooses a random subset (with replacement) of
all samples. Missing values should first be imputed using impute_knn().
Not all missing values are guaranteed to be imputed by KNN; see class::knn()
for details. Any remaining missing values are therefore imputed using
majority voting.
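The majority-voting step described above can be illustrated with a small base R sketch. It is not the package's implementation: `majority_vote_fill` is a hypothetical helper, the toy matrix is invented for illustration, and the tie-breaking rule (smallest label wins) is an arbitrary assumption.

```r
# Hypothetical helper (not part of any package): fill NA entries in each row
# with that row's most frequent non-missing cluster label.
majority_vote_fill <- function(m) {
  for (i in seq_len(nrow(m))) {
    votes <- m[i, !is.na(m[i, ])]
    if (length(votes) == 0) next  # no information for this sample; leave NA
    counts <- table(votes)
    # Ties go to the smallest label, an arbitrary choice for this sketch
    winner <- as.integer(names(counts)[which.max(counts)])
    m[i, is.na(m[i, ])] <- winner
  }
  m
}

# Toy assignments: 4 samples (rows) x 5 clusterings (columns)
toy <- matrix(c( 1,  1, NA,  1,  2,
                 2, NA,  2,  2,  1,
                NA,  3,  3, NA,  3,
                 1,  2,  2, NA,  2),
              nrow = 4, byrow = TRUE)
filled <- majority_vote_fill(toy)
```

Each row is treated as one sample, so a missing assignment is replaced by the label that sample received most often across the other clusterings.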
If the flattened matrix consists of more than one repetition, i.e. it
is not a column vector, the function returns a matrix of clusterings
with complete cases imputed using majority voting, and relabelled, for
the chosen k.
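To make "flattened matrix" concrete, here is a minimal base R sketch of turning one cluster-size slice of a 4D array into a samples-by-clusterings matrix. The dimension order (samples, repetitions, algorithms, k), the toy array `E_toy`, and its dimnames are assumptions made for illustration; the sketch does not reproduce the function's internals and skips the relabelling step.

```r
# Toy 4D array; the dimension order (samples, reps, algorithms, k) is an
# assumption made for this illustration.
set.seed(1)
E_toy <- array(sample(c(1:3, NA), 5 * 2 * 3 * 2, replace = TRUE),
               dim = c(5, 2, 3, 2),
               dimnames = list(paste0("s", 1:5), paste0("R", 1:2),
                               c("hc", "km", "sc"), c("3", "4")))

# Pick the slice for one cluster size, then flatten reps x algorithms
# into columns so each row is a sample and each column is one clustering.
slice <- E_toy[, , , "4"]
flat <- matrix(slice, nrow = dim(slice)[1],
               dimnames = list(dimnames(slice)[[1]], NULL))
dim(flat)
```

With a single repetition and a single algorithm the flattened result would have only one column, which is the column-vector case mentioned above.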
Aline Talhouk
Other imputation functions:
impute_knn()
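library(diceR)  # provides consensus_cluster(), impute_missing() and hgsc
data(hgsc)
# Subset to 100 samples and 50 features to keep the example fast
dat <- hgsc[1:100, 1:50]
E <- consensus_cluster(dat, nk = 3:4, reps = 10,
                       algorithms = c("hc", "km", "sc"), progress = FALSE)
sum(is.na(E))  # missing assignments before imputation
E_imputed <- impute_missing(E, dat, 4)
sum(is.na(E_imputed))  # missing assignments remaining after imputation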