bootstrap | R Documentation
Bootstrap resampling approach to estimate the confidence intervals for the cluster prototypes.
bootstrap(data, k, H, mtimes = 50, lr = 0.01, ncore = 2)
data
Data matrix or data frame.
k
The number of prototypes/clusters.
H
Matrix, input
mtimes
Integer, number of bootstrap samples. Default number is 50.
lr
Optimisation learning rate in ssmf().
ncore
The number of cores to use for parallel execution.
Create bootstrap samples of size n by sampling from the data set with replacement and repeat the steps M times.
The m^{th} bootstrap sample is denoted as
X^{{\ast}(m)}=(x_1^{{\ast}(m)}, x_2^{{\ast}(m)},\ldots,x_n^{{\ast}(m)}),
where each x_i^{{\ast}(m)} is a random sample (with replacement) from the data set.
Then, apply the SSMF algorithm to each bootstrap sample and calculate the m^{th} bootstrap replicate of the prototypes matrix, which is denoted as W^{{\ast}(m)}.
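The resampling step above can be sketched in base R. This is only an illustration under stated assumptions: `estimate_prototypes()` and `draw_replicates()` are hypothetical names, and the column-mean estimator is a placeholder standing in for the SSMF fit, not the package's implementation.

```r
# Hypothetical stand-in for the SSMF fit (assumption): returns a 1 x p
# "prototype" matrix made of the column means of the sample.
estimate_prototypes <- function(x) matrix(colMeans(x), nrow = 1)

# Draw M bootstrap samples of size n (rows sampled with replacement) and
# compute one replicate W*(m) per sample.
draw_replicates <- function(x, M = 50) {
  x <- as.matrix(x)
  n <- nrow(x)
  lapply(seq_len(M), function(m) {
    idx <- sample.int(n, size = n, replace = TRUE)  # row indices of X*(m)
    estimate_prototypes(x[idx, , drop = FALSE])     # replicate W*(m)
  })
}

set.seed(1)
x <- matrix(rnorm(60), nrow = 20, ncol = 3)  # toy data, 20 rows x 3 columns
W_star <- draw_replicates(x, M = 5)          # list of 5 replicates
```

Each element of `W_star` plays the role of one W^{{\ast}(m)}; with the real algorithm, the placeholder estimator would be replaced by a call that fits SSMF to the resampled rows.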
The estimated standard deviation of the M bootstrap replicates can be calculated by
sd(W^{\ast}) = \sqrt{\frac{1}{M-1} \sum_{m=1}^{M} [W^{{\ast}(m)}-\overline{W}^{\ast}]^2},
where \overline{W}^{\ast}=\frac{1}{M} \sum_{m=1}^{M} W^{{\ast}(m)}. Therefore, the 95% CIs for the prototypes can be calculated by
(\overline{W}^{\ast}-t_{(0.025, M-1)} \cdot sd(W^{\ast}),\ \overline{W}^{\ast}+t_{(0.975, M-1)} \cdot sd(W^{\ast})),
where t_{(0.025, M-1)} and t_{(0.975, M-1)} are the quantiles of the Student's t distribution with (M-1) degrees of freedom, corresponding to a 95% confidence level.
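As a rough illustration of the formulas above, the following base R sketch computes the element-wise mean, standard deviation and t-based interval from an array of replicates. The function name `ci_from_replicates()` and the simulated `reps` array are assumptions made for the example; the output names only mirror the Value section. The sketch uses the upper 0.975 quantile for both bounds, which relies on the symmetry of the t distribution; treat that reading of the interval formula as an assumption.

```r
# reps: array of dimension (rows x cols x M), one slice per bootstrap
# replicate W*(m). Hypothetical helper, not part of the package.
ci_from_replicates <- function(reps, level = 0.95) {
  M <- dim(reps)[3]
  W_bar <- apply(reps, c(1, 2), mean)  # element-wise mean over replicates
  W_sd  <- apply(reps, c(1, 2), sd)    # element-wise sd, denominator M - 1
  t_crit <- qt(1 - (1 - level) / 2, df = M - 1)  # e.g. 0.975 quantile
  list(W.est = W_bar,
       lower = W_bar - t_crit * W_sd,
       upper = W_bar + t_crit * W_sd)
}

set.seed(1)
# Simulated placeholder replicates: 2 prototypes, 3 variables, M = 50.
reps <- array(rnorm(2 * 3 * 50, mean = 5), dim = c(2, 3, 50))
ci <- ci_from_replicates(reps)
```

Note that the interval here is built from the standard deviation of the replicates, as in the formula above, so its width reflects the spread of the bootstrap replicates themselves.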
W.est
The W
matrix estimated by bootstrap.
lower
Lower bound of confidence intervals.
upper
Upper bound of confidence intervals.
Wenxuan Liu
Stine, R. (1989). An Introduction to Bootstrap Methods: Examples and Ideas. Sociological Methods & Research, 18(2-3), 243-291. <doi:10.1177/0049124189018002003>
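# Example: fit SSMF first, then bootstrap CIs for the prototypes
data <- SimulatedDataset
k <- 4
fit <- ssmf(data = data, k = k)
# Pass the fitted H matrix; mtimes, lr and ncore keep their defaults
bootstrap(data = data, k = k, H = fit$H)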