singR | R Documentation |
This function combines all steps from the SING paper.
singR(
dX,
dY,
n.comp.X = NULL,
n.comp.Y = NULL,
df = 0,
rho_extent = c("small", "medium", "large"),
Cplus = TRUE,
tol = 1e-10,
stand = FALSE,
distribution = "JB",
maxiter = 1500,
individual = FALSE,
whiten = c("sqrtprec", "eigenvec", "none"),
restarts.dbyd = 0,
restarts.pbyd = 20
)
dX |
original dataset for decomposition, matrix of n x px. |
dY |
original dataset for decomposition, matrix of n x py. |
n.comp.X |
the number of non-Gaussian components in dataset X. If NULL, the number is estimated using ICtest::FOBIasymp. |
n.comp.Y |
the number of non-Gaussian components in dataset Y. If NULL, the number is estimated using ICtest::FOBIasymp. |
df |
degrees of freedom. Defaults to 0, which is the value to use with "JB". If df > 0, a density for the loadings is estimated using a tilted Gaussian (non-parametric density estimate). |
rho_extent |
Controls the similarity of the scores in the two datasets. Accepts either a numeric value or one of three character options. "small", "medium", and "large" are defined from the JB statistic. Try "small" first and check whether the loadings are equal, then try the others if needed. A numeric input is multiplied by JBall to obtain rho. |
Cplus |
whether to use C code (faster) in curvilinear search. |
tol |
difference tolerance in curvilinear search. |
stand |
whether to standardize. If TRUE, the column and row means are set to 0 and the column standard deviations to 1. If FALSE, only the row means are set to 0. |
distribution |
"JB" or "tiltedgaussian"; "JB" is much faster. In SING, this refers to the "density" formed from the vector of loadings. "tiltedgaussian" with large df can potentially model more complicated patterns. |
maxiter |
the maximum number of iterations for the curvilinear search. |
individual |
whether to return the individual non-Gaussian components. Defaults to FALSE. |
whiten |
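whitening method used in lngca, one of "sqrtprec", "eigenvec", or "none". "eigenvec" uses the n left eigenvectors divided by sqrt(px-1). "sqrtprec" uses the square root of the n x n "precision" matrix. The usage above lists "sqrtprec" first, which makes it the default if the argument is resolved with match.arg. |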
restarts.dbyd |
default = 0. These are d x d initial matrices padded with zeros, which results in initializations from the principal subspace. Can speed up convergence but may miss low variance non-Gaussian components. |
restarts.pbyd |
default = 20. Generates p x d random orthogonal matrices. Use a large number for large datasets. Note: it is recommended that you run lngca twice with different seeds and compare the results, which should be similar when a sufficient number of restarts is used. In practice, stability with large datasets and a large number of components can be challenging. |
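The descriptions of stand, whiten, restarts.dbyd, and restarts.pbyd above can be illustrated with a minimal base R sketch. It does not call singR or lngca and is not the package's implementation. The sizes n, px, and d are hypothetical, and taking p equal to n (the dimension of the whitened data) is an assumption made only for this illustration.

```r
# Base R only; an illustration of the ideas, not the code used inside singR.
set.seed(1)
n  <- 5     # rows of the data matrix (hypothetical size)
px <- 200   # columns of the data matrix (hypothetical size)
d  <- 2     # hypothetical number of components to initialize
X  <- matrix(rnorm(n * px), nrow = n, ncol = px)

# stand = FALSE as described above: only the row means are set to 0.
Xc <- X - rowMeans(X)

# whiten = "sqrtprec" as described above: multiply by the inverse square
# root of the n x n covariance matrix of the rows.
covX <- tcrossprod(Xc) / (px - 1)
eig  <- eigen(covX, symmetric = TRUE)
W    <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)
Z    <- W %*% Xc
covZ <- tcrossprod(Z) / (px - 1)   # numerically close to the identity

# restarts.pbyd style: a random p x d matrix with orthonormal columns,
# obtained here from a QR decomposition (assumption: p = n).
p <- n
W_pbyd <- qr.Q(qr(matrix(rnorm(p * d), nrow = p, ncol = d)))

# restarts.dbyd style: a random d x d orthogonal matrix padded with zeros,
# so the start lies in the span of the first d coordinates.
W_dbyd <- rbind(qr.Q(qr(matrix(rnorm(d * d), nrow = d, ncol = d))),
                matrix(0, nrow = p - d, ncol = d))
```

Both initial matrices have orthonormal columns. The zero-padded one can only start in the leading d-dimensional subspace, which is consistent with the note that such starts may miss low variance non-Gaussian components.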
Function outputs a list including the following:
Sjx
variable loadings for joint NG components in dataset X with matrix rj x px.
Sjy
variable loadings for joint NG components in dataset Y with matrix rj x py.
Six
variable loadings for individual NG components in dataset X with matrix riX x px.
Siy
variable loadings for individual NG components in dataset Y with matrix riY x py.
Mix
scores of individual NG components in X with matrix n x riX.
Miy
scores of individual NG components in Y with matrix n x riY.
est.Mjx
Estimated subject scores for joint components in dataset X with matrix n x rj.
est.Mjy
Estimated subject scores for joint components in dataset Y with matrix n x rj.
est.Mj
Average of est.Mjx and est.Mjy as the subject scores for joint components in both datasets with matrix n x rj.
C_plus
whether to use the C version of the curvilinear search.
rho_extent
the weight of rho in the search.
df
degrees of freedom: 0 when using JB, > 0 when using tiltedgaussian.
# load the package and get the simulated example data
library(singR)
data(exampledata)
# use JB stat to compute with singR
output_JB <- singR(dX = exampledata$dX, dY = exampledata$dY,
                   df = 0, rho_extent = "small", distribution = "JB",
                   individual = TRUE)
# use tiltedgaussian distribution to compute with singR.
# tiltedgaussian may be more accurate but is considerably slower,
# and is not recommended for large datasets.
output_tilted <- singR(dX = exampledata$dX, dY = exampledata$dY,
                       df = 5, rho_extent = "small",
                       distribution = "tiltedgaussian", individual = TRUE)
# use pmse to measure difference from the truth
pmse(M1 = t(output_JB$est.Mj), M2 = t(exampledata$mj), standardize = TRUE)
pmse(M1 = t(output_tilted$est.Mj), M2 = t(exampledata$mj), standardize = TRUE)