P_lambda: Calculation of the Lambda value

View source: R/script_v12-3_package.R

P_lambda    R Documentation

Calculation of the Lambda value

Description

The lambda value measures the inflation of the p-values relative to the distribution expected under the null hypothesis (uniformly distributed p-values).

Usage

P_lambda(p)

Arguments

p

a numeric vector of p-values

Details

The function removes any missing values from p, and then returns:

median(qchisq(p, df=1, lower.tail=FALSE)) / qchisq(0.5, 1)
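The computation above can be sketched as a small stand-alone function. This is an illustrative re-implementation based only on the formula shown here; the name `lambda_sketch` is hypothetical, and the actual package source may differ in its details.

```r
# Hypothetical re-implementation for illustration; not the QCEWAS source.
lambda_sketch <- function(p) {
  p <- p[!is.na(p)]                                   # drop missing p-values
  chisq_obs <- qchisq(p, df = 1, lower.tail = FALSE)  # p-values -> 1-df chi-square statistics
  median(chisq_obs) / qchisq(0.5, df = 1)             # observed median over expected median
}

# Evenly spaced p-values stand in for a null (uniform) distribution
lambda_null <- lambda_sketch(ppoints(10000))
```

Converting each p-value to a chi-square statistic with 1 degree of freedom and dividing the observed median by the median expected under the null gives a ratio that should be close to 1 when the p-values are uniform.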

The lambda value represents the inflation of the p-values compared to the distribution expected under the null hypothesis. In a genome-wide study, one would expect the results for the vast majority of CpG sites to accord with the null hypothesis, i.e. the p-values are random and uniformly distributed between 0 and 1. Only sites that are significantly associated with the phenotype of interest should deviate from this expected distribution.

Ideally the lambda value should be 1. Because lambda is based on the median test statistic, it reflects the overall deviation from the expected distribution, so the presence of a few significant results (i.e. p-values that do not follow the expected distribution) does not bias it.
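The robustness to a handful of significant results can be illustrated with a short sketch. The helper `lambda_sketch` is a hypothetical stand-in that follows the formula in this section, and the choice of 10 sites out of 10000 is an arbitrary assumption made for the example.

```r
# Hypothetical helper following the documented formula; not the QCEWAS source.
lambda_sketch <- function(p) {
  p <- p[!is.na(p)]
  median(qchisq(p, df = 1, lower.tail = FALSE)) / qchisq(0.5, df = 1)
}

p_null <- ppoints(10000)        # evenly spaced stand-in for null p-values
p_hits <- p_null
p_hits[5001:5010] <- 1e-10      # turn 10 unremarkable sites into strong "hits"

lambda_before <- lambda_sketch(p_null)
lambda_after  <- lambda_sketch(p_hits)
```

Because only 10 of the 10000 values change, the median test statistic moves very little, and `lambda_after` stays close to `lambda_before`.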

However, if lambda is 2 or higher, it means that a substantial portion of your dataset is more significant than expected for a genome-wide study (i.e. oversignificance). This could mean that low-significance markers have been filtered out of your dataset. If this is not the case, you should consider applying a genomic control correction to the p-values.
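One common form of genomic control divides each chi-square statistic by lambda and converts the result back to a p-value. The sketch below assumes that form; `gc_correct_sketch` is a hypothetical helper and not a QCEWAS function, and the inflated input is constructed artificially for the example.

```r
# Hypothetical helper for illustration; not a QCEWAS function.
gc_correct_sketch <- function(p) {
  chisq_obs <- qchisq(p, df = 1, lower.tail = FALSE)               # p-values -> chi-square statistics
  lambda <- median(chisq_obs, na.rm = TRUE) / qchisq(0.5, df = 1)  # inflation factor
  pchisq(chisq_obs / lambda, df = 1, lower.tail = FALSE)           # rescale, then back to p-values
}

# Artificially inflated p-values: every null chi-square statistic is doubled
chisq_null  <- qchisq(ppoints(10000), df = 1, lower.tail = FALSE)
p_inflated  <- pchisq(2 * chisq_null, df = 1, lower.tail = FALSE)
p_corrected <- gc_correct_sketch(p_inflated)

# Lambda recomputed on the corrected p-values
lambda_corrected <- median(qchisq(p_corrected, df = 1, lower.tail = FALSE)) /
  qchisq(0.5, df = 1)
```

After the correction, the median statistic matches the null expectation, so the recomputed lambda is close to 1. This kind of rescaling is normally applied only when lambda is above 1, since with a lambda below 1 it would make the p-values more significant rather than less.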

Similarly, values of 0.8 or lower indicate that your results are less significant than would be expected from a null (uniform) distribution of p-values.

Value

A single numeric value, the lambda value.

Examples

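  library(QCEWAS)

  # Evenly spaced p-values between 0 and 1 (a stand-in for the null distribution)
  pvector <- ppoints(10000)
  P_lambda(pvector)
  # For uniformly distributed p-values, lambda is (approximately) 1
  
  # Set a slice of low-significance p-values to missing
  pvector[pvector > 0.9 & pvector < 0.91] <- NA
  P_lambda(pvector)
  # If low-significance results are removed (i.e. there are more
  # significant results than expected), lambda increases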

QCEWAS documentation built on Feb. 16, 2023, 10:30 p.m.