Initialize factor models "W" and "H" by sampling values from A, simulating comparable random distributions, or specifying uniform distribution bounds
A | input dgCMatrix
k | rank
W.method | One of c("sample", "random", "bounded")
H.method | One of c("sample", "random", "bounded")
n.sd | number of standard deviations from the mean over which to sample values in A (excluding zeros). Applies only to method = c("sample", "random")
bounds | values are sampled between these bounds if method = "bounded"
seed | random seed, NULL by default. Always remember to set it.
Methods support positive, non-zero initializations only.
random: samples values from a random normal distribution with the same mean as non-zero values in A and the same standard deviation
sample: samples non-zero values in A within n.sd standard deviations of the mean of non-zero values in A
bounded: samples values from a random uniform distribution bounded between values specified in "bounds"
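The three methods can be sketched in base R as follows. This is only an illustration of the descriptions above, not the package's internal code: the vector `a_nonzero` stands in for the non-zero values of A, the names `n`, `n_sd`, and `init_*` are hypothetical, and the use of `abs()` to keep the normal draws positive is an assumption.

```r
# Illustrative sketch only; not the package's implementation.
set.seed(42)

# Hypothetical stand-in for the non-zero values of a sparse matrix A.
a_nonzero <- rexp(1000, rate = 0.5)
mu <- mean(a_nonzero)
sigma <- sd(a_nonzero)

n <- 50     # number of values to draw, e.g. nrow(A) * k
n_sd <- 1   # plays the role of n.sd

# "random": normal draws matching the mean and sd of the non-zero values.
# Assumption: abs() keeps the initialization positive.
init_random <- abs(rnorm(n, mean = mu, sd = sigma))

# "sample": resample non-zero values within n_sd standard deviations of the mean.
in_range <- a_nonzero[abs(a_nonzero - mu) <= n_sd * sigma]
init_sample <- sample(in_range, n, replace = TRUE)

# "bounded": uniform draws between the two bounds.
bounds <- c(0.4, 0.6)
init_bounded <- runif(n, min = bounds[1], max = bounds[2])
```

Each vector could then be reshaped with `matrix()` into the dimensions needed for W or H.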
Values in W and H will automatically be scaled to approximate the distribution of A upon multiplication. Possible scenarios:
W.method and H.method = "sample" or "random": W and H are the square roots of their respective sampled values
W.method and H.method = "bounded" or another given distribution: no normalization is applied
W.method = "sample" or "random" and H.method = "bounded" or other given distribution (or the reverse): H is assigned first and is held constant, values in W are raised to the power of log(mean(A)/mean(H))/log(mean(W)) such that mean(W) * mean(H) approximates mean(A)
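The scaling rules above can be checked numerically. The sketch below uses made-up values (`mean_A`, the matrix sizes, and the distributions are all assumptions) and only mirrors the formulas stated in this section. Because the mean of a power is not the power of the mean, the products only approximate mean(A).

```r
# Illustrative sketch only; values and dimensions are hypothetical.
set.seed(1)
mean_A <- 5   # assumed mean of the non-zero values in A

# Scenario 1: both factors sampled, then square-rooted.
w_raw <- matrix(abs(rnorm(400, mean = mean_A, sd = 1)), ncol = 4)
h_raw <- matrix(abs(rnorm(200, mean = mean_A, sd = 1)), nrow = 4)
W_sqrt <- sqrt(w_raw)
H_sqrt <- sqrt(h_raw)
approx_sqrt <- mean(W_sqrt) * mean(H_sqrt)   # roughly mean_A

# Scenario 3: H is bounded and held constant, W is rescaled by a power.
H <- matrix(runif(200, min = 0.4, max = 0.6), nrow = 4)
W <- matrix(abs(rnorm(400, mean = mean_A, sd = 1)), ncol = 4)
p <- log(mean_A / mean(H)) / log(mean(W))
W_scaled <- W^p
approx_power <- mean(W_scaled) * mean(H)     # roughly mean_A
```

By construction `mean(W)^p * mean(H)` equals `mean_A`; the rescaled product `mean(W^p) * mean(H)` differs slightly, which is why the text says "approximates".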
Basic usage:
For initialization of most models, "random" should suffice and is the recommended method for fast convergence to an accurate and robust local minimum
For one-sided Bernoulli factorization, initialize the Bernoulli matrix with bounds, for example bounds = c(0.4, 0.6), and the non-bounded matrix with "random"
For one-sided multinomial factorization, initialize the multinomial matrix with the multinomial distribution or with bounds corresponding to the limits of the distribution, and the non-bounded matrix with "random"
For models where "A" comprises many different values, "sample" may be a good option because it is truer to the original distribution than a simple normal "random" method
If these initialization methods do not suffice, LSMF also takes matrices as initializations for "W" and "H".
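To illustrate the remark about "sample" being truer to the original distribution, the sketch below compares the two approaches on skewed, discrete data. The data, the value of `n_sd`, and the use of `abs()` are assumptions made for the example; it is not the package's code.

```r
# Illustrative sketch only; data and parameters are hypothetical.
set.seed(7)

# Skewed, discrete stand-in for the non-zero values of A.
observed <- c(1, 2, 10)
a_nonzero <- sample(observed, 500, replace = TRUE, prob = c(0.6, 0.3, 0.1))
mu <- mean(a_nonzero)
sigma <- sd(a_nonzero)
n_sd <- 2   # plays the role of n.sd

# "sample": draws only values that occur in A, trimmed to n_sd sds of the mean.
kept <- a_nonzero[abs(a_nonzero - mu) <= n_sd * sigma]
init_sample <- sample(kept, 100, replace = TRUE)

# "random": normal draws that match mean and sd but not the shape of A.
init_random <- abs(rnorm(100, mean = mu, sd = sigma))

sort(unique(init_sample))   # a subset of the observed values
range(init_random)          # continuous values, generally not found in A
```

The "sample" draws stay on the values observed in A, while the "random" draws fill in a continuous range around the mean.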
list of matrices "W" and "H"