semPower.powerRegression: semPower.powerRegression

View source: R/convenienceFunctions.R

semPower.powerRegression    R Documentation

semPower.powerRegression

Description

Convenience function for performing power analysis on slope(s) in a latent regression of the form Y = XB. This requires the lavaan package.

Usage

semPower.powerRegression(
  type,
  comparison = "restricted",
  slopes = NULL,
  corXX = NULL,
  nullEffect = "slope = 0",
  nullWhich = NULL,
  nullWhichGroups = NULL,
  standardized = TRUE,
  ...
)

Arguments

type

type of power analysis, one of 'a-priori', 'post-hoc', 'compromise'.

comparison

comparison model, one of 'saturated' or 'restricted' (the default). This determines the df for power analyses. 'saturated' provides power to reject the hypothesized model when compared to the saturated model, so the df equal those of the hypothesized model. 'restricted' provides power to reject the hypothesized model when compared to an otherwise identical model that just omits the restrictions defined in nullEffect, so the df equal the number of restrictions.

slopes

vector of slopes (or a single number for a single slope) of the k predictors for Y. A list of slopes for multigroup models.

corXX

correlation(s) between the k predictors (X). Either NULL for uncorrelated predictors, a single number (for k = 2 predictors), or a matrix. Can also be a list of matrices for multigroup models, providing the correlations by group (otherwise, the same correlations are used in all groups).

nullEffect

defines the hypothesis of interest, must be one of 'slope = 0' (the default) to test whether a slope is zero, 'slopeX = slopeZ' to test for the equality of slopes, or 'slopeA = slopeB' to test for the equality of slopes across groups. Define the slopes to set to equality in nullWhich.

nullWhich

single number indicating which slope is hypothesized to equal zero when nullEffect = 'slope = 0', or indicating which slope to restrict to equality across groups when nullEffect = 'slopeA = slopeB', or vector defining the slopes to restrict to equality when nullEffect = 'slopeX = slopeZ'. Can also contain more than two slopes, e.g. c(1, 2, 3) to constrain the first three slopes to equality.

nullWhichGroups

for nullEffect = 'slopeA = slopeB', vector indicating the groups for which equality constraints should be applied, e.g. c(1, 3) to constrain the relevant parameters of the first and the third group. If NULL, all groups are constrained to equality.

standardized

whether all parameters should be standardized (TRUE, the default). If FALSE, all regression relations are unstandardized.

...

mandatory further parameters related to the specific type of power analysis requested, see semPower.aPriori(), semPower.postHoc(), and semPower.compromise(), and parameters specifying the factor model. The first factor is treated as Y and the subsequent factors as the predictors X_k. See details.

Details

This function performs a power analysis to reject various hypotheses arising in SEM models involving a simple regression relation of the form Y = b_1*X_1 + ... + b_k*X_k between the factors:

  • nullEffect = 'slope = 0': Tests the hypothesis that the slope for a predictor is zero.

  • nullEffect = 'slopeX = slopeZ': Tests the hypothesis that two or more slopes are equal to each other.

  • nullEffect = 'slopeA = slopeB': Tests the hypothesis that the slope for a predictor is equal in two or more groups (always assuming metric invariance).
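
To make the regression relation concrete, the following base R sketch works out what a set of standardized slopes and predictor correlations implies for the correlations between Y and the predictors and for the residual variance of Y. It assumes that all factors have unit variance; the object names corXY, r2, and resVar are illustrative and are not part of semPower.

```r
# Standardized slopes b_1, b_2 and the predictor correlation,
# mirroring the slopes and corXX arguments
slopes <- c(.2, .3)
corXX  <- matrix(c(1, .4,
                   .4, 1), ncol = 2, byrow = TRUE)

# Implied correlations between Y and each predictor: corXX %*% b
corXY <- as.vector(corXX %*% slopes)

# Proportion of variance in Y explained by the predictors: b' corXX b
r2 <- as.numeric(t(slopes) %*% corXX %*% slopes)

# Residual variance of Y, given a total variance of 1 (assumption)
resVar <- 1 - r2
```

A combination of slopes and corXX is only admissible as a standardized solution if the explained variance stays below 1, so that the residual variance is positive.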

For hypotheses regarding mediation effects, see semPower.powerMediation(). For hypotheses in autoregressive models, see semPower.powerAutoreg().

Beyond the arguments explicitly contained in the function call, additional arguments are required specifying the factor model and the requested type of power analysis.

Additional arguments related to the definition of the factor model:

  • Lambda: The factor loading matrix (with the number of columns equaling the number of factors).

  • loadings: Can be used instead of Lambda: Defines the primary loadings for each factor in a list structure, e. g. loadings = list(c(.5, .4, .6), c(.8, .6, .6, .4)) defines a two factor model with three indicators loading on the first factor by .5, .4, and .6, and four indicators loading on the second factor by .8, .6, .6, and .4.

  • nIndicator: Can be used instead of Lambda: Used in conjunction with loadM. Defines the number of indicators by factor, e. g., nIndicator = c(3, 4) defines a two factor model with three and four indicators for the first and second factor, respectively. nIndicator can also be a single number to define the same number of indicators for each factor.

  • loadM: Can be used instead of Lambda: Used in conjunction with nIndicator. Defines the loading either for all indicators (if a single number is provided) or separately for each factor (if a vector is provided), e. g. loadM = c(.5, .6) defines the loadings of the first factor to equal .5 and those of the second factor to equal .6.

So either Lambda, or loadings, or nIndicator and loadM need to be defined. If the model contains observed variables only, use Lambda = diag(x) where x is the number of variables.
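
The three ways of defining the factor model describe the same kind of object. As a rough illustration, the base R sketch below expands a loadings list, and an nIndicator/loadM pair, into a loading matrix with one column per factor. The helper buildLambda is hypothetical and not part of semPower, and the assumption that all cross-loadings are zero follows from the description of "primary loadings" above rather than from the package source.

```r
# Primary loadings per factor, as they would be passed to `loadings`
loadings <- list(c(.5, .4, .6), c(.8, .6, .6, .4))

# Hypothetical helper (not part of semPower): expand the list into a
# loading matrix with one column per factor and zero cross-loadings
buildLambda <- function(loadings) {
  nInd <- lengths(loadings)
  Lambda <- matrix(0, nrow = sum(nInd), ncol = length(loadings))
  rowEnd <- cumsum(nInd)
  rowStart <- rowEnd - nInd + 1
  for (f in seq_along(loadings)) {
    Lambda[rowStart[f]:rowEnd[f], f] <- loadings[[f]]
  }
  Lambda
}

Lambda <- buildLambda(loadings)

# The nIndicator / loadM shorthand: the same loading repeated per factor
nIndicator <- c(3, 4)
loadM <- c(.5, .6)
LambdaM <- buildLambda(Map(rep, loadM, nIndicator))
```

Here Lambda has 7 rows (indicators) and 2 columns (factors), which matches the requirement that the number of columns equals the number of factors.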

Note that the first factor acts as the criterion Y, the subsequent factors as predictors X_1 to X_k.

Additional arguments related to the requested type of power analysis:

  • alpha: The alpha error probability. Required for type = 'a-priori' and type = 'post-hoc'.

  • Either beta or power: The beta error probability and the statistical power (1 - beta), respectively. Only for type = 'a-priori'.

  • N: The sample size. Always required for type = 'post-hoc' and type = 'compromise'. For type = 'a-priori' and multiple group analysis, N is a list of group weights.

  • abratio: The ratio of alpha to beta. Only for type = 'compromise'.
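
The quantities alpha, beta, power, N, and abratio are tied together through the noncentral chi-square distribution. The base R sketch below shows that relation for a single-group post-hoc analysis, assuming the common approximation ncp = (N - 1) * F0. The value of F0 (the population minimum of the fit function under H0) and the other inputs are made up for illustration; this is not a reproduction of the package internals.

```r
# Illustrative inputs (assumptions, not values taken from this page)
F0    <- 0.02   # population minimum of the fit function under H0
df    <- 1      # e.g. one restriction when comparison = 'restricted'
N     <- 500
alpha <- .05

# Noncentrality parameter under the alternative
ncp <- (N - 1) * F0

# Critical chi-square for the chosen alpha
critChi <- qchisq(1 - alpha, df = df)

# Beta error: probability of staying below the critical value under H1
beta  <- pchisq(critChi, df = df, ncp = ncp)
power <- 1 - beta

# Ratio of alpha to beta, the quantity fixed in a compromise analysis
abratio <- alpha / beta
```

An a-priori analysis searches for the N at which power reaches the requested level, whereas a compromise analysis fixes N and abratio and searches for the critical value.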

If a simulated power analysis (simulatedPower = TRUE) is requested, optional arguments can be provided as a list to simOptions:

  • nReplications: The targeted number of simulation runs. Defaults to 250, but larger numbers greatly improve accuracy at the expense of increased computation time.

  • minConvergenceRate: The minimum convergence rate required, defaults to .5. The maximum actual simulation runs are increased by a factor of 1/minConvergenceRate.

  • type: specifies whether the data should be generated from a population assuming multivariate normality ('normal'; the default), or based on an approach generating non-normal data ('IG', 'mnonr', 'RK', or 'VM'). The approaches generating non-normal data require additional arguments detailed below.

  • missingVars: vector specifying the variables containing missing data (defaults to NULL).

  • missingVarProp: can be used instead of missingVars: The proportion of variables containing missing data (defaults to zero).

  • missingProp: The proportion of missingness for variables containing missing data (defaults to zero), either a single value or a vector giving the probabilities for each variable.

  • missingMechanism: The missing data mechanism, one of MCAR (the default), MAR, or NMAR.

  • nCores: The number of cores to use for parallel processing. Defaults to 1 (= no parallel processing). This requires the doSNOW package.
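
As an illustration of how these options fit together, the following sketch builds a simOptions list using only the option names listed above; the specific values are arbitrary. The last line computes the upper bound on simulation runs implied by the description of minConvergenceRate. The name maxRuns is illustrative and not a semPower object.

```r
# Options for a simulated power analysis, to be passed as
# simOptions = simOptions together with simulatedPower = TRUE
simOptions <- list(
  nReplications      = 500,      # targeted number of converged runs
  minConvergenceRate = .8,       # require at least 80% convergence
  type               = 'normal', # multivariate normal data
  missingVarProp     = .5,       # half of the variables contain missings
  missingProp        = .1,       # 10% missingness on those variables
  missingMechanism   = 'MCAR',
  nCores             = 1         # no parallel processing
)

# Upper bound on the number of runs: nReplications * 1/minConvergenceRate
maxRuns <- simOptions$nReplications / simOptions$minConvergenceRate
```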

type = 'IG' implements the independent generator approach (IG, Foldnes & Olsson, 2016) specifying third and fourth moments of the marginals, and thus requires that skewness (skewness) and excess kurtosis (kurtosis) for each variable are provided as vectors. This requires the covsim package.

type = 'mnonr' implements the approach suggested by Qu, Liu, & Zhang (2020) and requires provision of Mardia's multivariate skewness (skewness) and kurtosis (kurtosis), where skewness must be non-negative and kurtosis must be at least 1.641 skewness + p (p + 0.774), where p is the number of variables. This requires the mnonr package.
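
The lower bound on Mardia's kurtosis can be checked before starting a simulation. The sketch below reads the bound as 1.641 * skewness + p * (p + 0.774); the helper minMardiaKurtosis is hypothetical and not part of semPower or mnonr, and the skewness value is arbitrary.

```r
# Hypothetical helper: smallest admissible Mardia kurtosis for a given
# multivariate skewness and number of variables p
minMardiaKurtosis <- function(skewness, p) {
  stopifnot(skewness >= 0, p >= 1)
  1.641 * skewness + p * (p + 0.774)
}

p        <- 12   # e.g. 3 + 5 + 4 indicators
skewness <- 10   # arbitrary illustrative value
minKurt  <- minMardiaKurtosis(skewness, p)

# A requested kurtosis is only admissible if it is at least minKurt
kurtosis   <- 200
admissible <- kurtosis >= minKurt
```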

type = 'RK' implements the approach suggested by Ruscio & Kaczetow (2008) and requires provision of the population distributions of each variable (distributions). distributions must be a list (if all variables shall be based on the same population distribution) or a list of lists. Each component must specify the population distribution (e.g. rchisq) and additional arguments (list(df = 2)).
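
One possible reading of this structure is sketched below in base R: each specification pairs a random-number generator with a list of its additional arguments, and a list of such specifications covers the variables one by one. The exact layout semPower expects is an assumption based on the description above, so the package documentation should be checked before use. The do.call line only demonstrates that each specification can be sampled from.

```r
# A single specification: generator function plus additional arguments
# (assumed layout, following the description above)
spec <- list(rchisq, list(df = 2))

# A list of lists: one specification per variable
distributions <- list(
  list(rchisq, list(df = 2)),
  list(rnorm,  list(mean = 0, sd = 1)),
  list(runif,  list(min = 0, max = 1))
)

# Draw five values from each population distribution
set.seed(1)
draws <- lapply(distributions, function(d) {
  do.call(d[[1]], c(list(n = 5), d[[2]]))
})
```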

type = 'VM' implements the third-order polynomial method (Vale & Maurelli, 1983) specifying third and fourth moments of the marginals, and thus requires that skewness (skewness) and excess kurtosis (kurtosis) for each variable are provided as vectors. This requires the semTools package.

Value

a list. Use the summary method to obtain formatted results. Beyond the results of the power analysis and a number of effect size measures, the list contains the following components:

Sigma

the population covariance matrix. A list for multiple group models.

mu

the population mean vector or NULL when no meanstructure is involved. A list for multiple group models.

SigmaHat

the H0 model implied covariance matrix. A list for multiple group models.

muHat

the H0 model implied mean vector or NULL when no meanstructure is involved. A list for multiple group models.

modelH0

lavaan H0 model string.

modelH1

lavaan H1 model string or NULL when the comparison refers to the saturated model.

simRes

detailed simulation results when a simulated power analysis (simulatedPower = TRUE) was performed.

See Also

semPower.genSigma() semPower.aPriori() semPower.postHoc() semPower.compromise()

Examples

## Not run: 
# latent regression of the form `Y = .2*X1 + .3*X2`, where X1 and X2 correlate by .4
# obtain required N to reject the hypothesis that the slope of X1 is zero 
# with a power of 95% on alpha = 5%,   
# where Y is measured by 3 indicators loading by .5 each,
# X1 by 5 indicators loading by .6 each, and
# X2 by 4 indicators loading by .7 each. 
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = c(.2, .3), corXX = .4, 
                                     nullWhich = 1, 
                                     nIndicator = c(3, 5, 4), 
                                     loadM = c(.5, .6, .7),
                                     alpha = .05, beta = .05)
# show summary
summary(powerReg)
# optionally use lavaan to verify the model was set-up as intended
lavaan::sem(powerReg$modelH1, sample.cov = powerReg$Sigma, 
            sample.nobs = powerReg$requiredN, sample.cov.rescale = FALSE)
lavaan::sem(powerReg$modelH0, sample.cov = powerReg$Sigma, 
            sample.nobs = powerReg$requiredN, sample.cov.rescale = FALSE)

# same as above, but determine power with N = 500 on alpha = .05 
powerReg <- semPower.powerRegression(type = 'post-hoc',
                                     slopes = c(.2, .3), corXX = .4, 
                                     nullWhich = 1, 
                                     nIndicator = c(3, 5, 4), 
                                     loadM = c(.5, .6, .7),
                                     alpha = .05, N = 500)
                                     
# same as above, but determine the critical chi-square with N = 500 so that alpha = beta 
powerReg <- semPower.powerRegression(type = 'compromise',
                                     slopes = c(.2, .3), corXX = .4, 
                                     nullWhich = 1, 
                                     nIndicator = c(3, 5, 4), 
                                     loadM = c(.5, .6, .7),
                                     abratio = 1, N = 500)
                                     
# same as above, but ask for the required N to detect that the slope of X2 is zero
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = c(.2, .3), corXX = .4, 
                                     nullWhich = 2, 
                                     nIndicator = c(3, 5, 4), 
                                     loadM = c(.5, .6, .7),
                                     alpha = .05, beta = .05)

# same as above, but define unstandardized slopes
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = c(.2, .3), corXX = .4,
                                     nullWhich = 2, 
                                     standardized = FALSE,
                                     nIndicator = c(3, 5, 4), 
                                     loadM = c(.5, .6, .7),
                                     alpha = .05, beta = .05)
                                     
# same as above, but compare to the saturated model
# (rather than to the less restricted model)
powerReg <- semPower.powerRegression(type = 'a-priori', 
                                     comparison = 'saturated',
                                     slopes = c(.2, .3), corXX = .4, 
                                     nullWhich = 2, 
                                     nIndicator = c(3, 5, 4), 
                                     loadM = c(.5, .6, .7),
                                     alpha = .05, beta = .05)
                                     
# same as above, but provide a reduced loading matrix defining
# three indicators with loadings of .7, .6, .5 on the first factor (Y),
# four indicators with loadings of .5, .6, .4, .8 on the second factor (X1), and
# three indicators with loadings of .8, .7, .8 on the third factor (X2).
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = c(.2, .3), corXX = .4, 
                                     nullWhich = 2, 
                                     loadings = list(
                                       c(.7, .6, .5), 
                                       c(.5, .6, .4, .8),
                                       c(.8, .7, .8)),
                                     alpha = .05, beta = .05)
                              
# latent regression of the form `Y = .2*X1 + .3*X2 + .4*X3`, 
# providing the predictor intercorrelation matrix,
# and ask for the required N to detect that the first slope differs from zero.
corXX <- matrix(c(
  #   X1    X2    X3
  c(1.00, 0.20, 0.30),  # X1
  c(0.20, 1.00, 0.10),  # X2
  c(0.30, 0.10, 1.00)   # X3
), ncol = 3, byrow = TRUE)
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = c(.2, .3, .4), corXX = corXX, 
                                     nullWhich = 1,
                                     nIndicator = c(4, 3, 5, 4),
                                     loadM = c(.5, .5, .6, .7),
                                     alpha = .05, beta = .05)

# same as above, but ask for the required N to detect that 
# the slope for X1 (b = .2) and the slope for X2 (b = .3) differ from each other
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = c(.2, .3, .4), corXX = corXX, 
                                     nullEffect = 'slopeX = slopeZ', 
                                     nullWhich = c(1, 2),
                                     nIndicator = c(4, 3, 5, 4),
                                     loadM = c(.5, .5, .6, .7),
                                     alpha = .05, beta = .05)
                                     
# same as above, but ask for the required N to reject the hypothesis that 
# all three slopes are equal to each other
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = c(.2, .3, .4), corXX = corXX, 
                                     nullEffect = 'slopeX = slopeZ', 
                                     nullWhich = c(1, 2, 3),
                                     nIndicator = c(4, 3, 5, 4),
                                     loadM = c(.5, .5, .6, .7),
                                     alpha = .05, beta = .05)
    
# get required N to detect that 
# the slope for X2 in group 1 (b2 = .3) differs from the slope for X2 in group 2 (b2 = .0). 
# The remaining slopes are equal in both groups (b1 = .2, b3 = .4).
# The measurement model is identical in both groups:
# The criterion (Y) is measured by 4 indicators loading by .5 each, 
# Predictors X1 and X3 are both measured by 5 indicators loading by .6 each,
# Predictor X2 is measured by 3 indicators loading by .7 each.
# Both groups are sized equally (N = list(1, 1)).
powerReg <- semPower.powerRegression(type = 'a-priori',
                                     slopes = list(c(.2, .3, .4), 
                                                   c(.2, .0, .4)), 
                                     corXX = corXX, 
                                     nullEffect = 'slopeA = slopeB', 
                                     nullWhich = 2,
                                     nIndicator = c(4, 5, 3, 5),
                                     loadM = c(.5, .6, .7, .6),
                                     alpha = .05, beta = .05, 
                                     N = list(1, 1))

# request a simulated post-hoc power analysis with 500 replications 
# to detect that the slope of X1 differs from zero.
set.seed(300121)
powerReg <- semPower.powerRegression(type = 'post-hoc',
                                     slopes = c(.2, .1), 
                                     nullWhich = 1,
                                     nIndicator = c(4, 3, 3), loadM = .5,
                                     alpha = .05, N = 500, 
                                     simulatedPower = TRUE, 
                                     simOptions = list(nReplications = 500))

## End(Not run)

semPower documentation built on Sept. 30, 2024, 9:24 a.m.