pvals_nc_chisq: Calculate high precision p-values from Jackstraw data under...

View source: R/pvals_nc_chisq.R

pvals_nc_chisqR Documentation

Calculate high precision p-values from Jackstraw data under non-central chi squared distribution for null statistics

Description

This function takes a Jackstraw object for convenience, estimates the non-centrality parameter (NCP) from the null statistics (using ncp_est()), then uses this non-central chi squared model to calculate p-values for the observed statistics. The goal is to be able to calculate small p-values with arbitrary precision, which is normally not possible with the ordinary Jackstraw functions since the minimum empirical p-value is determined by the inverse of the number of null samples (1 /( s * B ) where sandB' are Jackstraw parameters (see jackstraw_lfa())). In contrast, large p-values values are typically similar between empirical and NCP model versions, and in fact this must hold if the null model is correctly specified. This model works well empirically for jackstraw_lfa() and jackstraw_alstructure(), whose statistics are deviances that would ordinarily be central chi-squared distributed were the test factors not estimated from the same data. This model is not appropriate for data from jackstraw_pca(), whose statistics have a roughly F distribution, and possibly many other functions from this package.

Usage

pvals_nc_chisq(out, df, null.stat = out$null.stat, obs.stat = out$obs.stat)

Arguments

out

The object returned by jackstraw_lfa() and jackstraw_alstructure(). In particular, a list with at least the following two named elements: null.stat containing the null statistics on which the NCP null model is estimated, and obs.stat containing the observed or actual statistics for which p-values are calculated.

df

The degrees of freedom of the test used to calculate the statistics. Note that for ordinary jackstraw runs this is r-1, since the null model/intercept is counted among the r latent variables, but alternate values can be provided for more complex tests.

null.stat

Alternate way to provide these values if the input is not a Jackstraw object, so out is not available.

obs.stat

Alternate way to provide these values if the input is not a Jackstraw object, so out is not available.

Value

A list with two named values:

  • p.value: the vector of p-values calculated for the input vector obs.stat

  • ncp: the numeric vector of two values of NCP estimates calculated from null.stat, namely the maximum likelihood (MLE) and method-of-moments estimates (see ncp_est()). Only MLE is used to calculate p-values.

See Also

ncp_est() for more notes on estimating non-centrality parameters only.

jackstraw_lfa(), jackstraw_alstructure()

Examples

# instead of running jackstraw here, we simulate toy data
df <- 1
ncp_true <- 1
# null data are truly non-central chi squared distributed
# observed data are the same but with huge power reflected in a larger NCP
out <- list(
    null.stat = rchisq( 100, df, ncp_true ),
    obs.stat = rchisq( 11, df, ncp_true + 10 )
)

# this calculates new p-values with much higher precision for highly significant cases
out2 <- pvals_nc_chisq( out, df )
# these are the desired p-values
out2$p.value
# and NCP estimates from two methods
out2$ncp

## Not run: 
# This is the more typical usage in practice
# first run Jackstraw-LFA, which returns empirical p-values with limited precision and raw stats
out <- jackstraw_lfa( dat, r, FUN = function(x) lfa( x, r ) )
# then fit NCP model and use it to calculate high-precision p-values!
out2 <- pvals_nc_chisq( out, r-1 )

# compare p-values!  In linear scale they should agree very well
plot( out$p.value, out2$p.value, pch = '.' )
# but in log scale it's clear the NCP values can take on much smaller values
plot( out$p.value, out2$p.value, pch = '.', log = 'xy' )

## End(Not run)


ncchung/jackstraw documentation built on Dec. 22, 2024, 8:33 p.m.