
CLfdr returns the local false discovery rate (FDR) conditional on auxiliary covariate information


`x` |
covariates, could be a vector of length *m* or a matrix with *m* rows. |

`y` |
a vector of *z*-values of length *m*. |

`pval` |
a vector of p-values of length *m*. |

`pi0.method` |
method to estimate the overall true null proportion (pi0). "RB" for the right-boundary procedure (Liang and Nettleton, 2012, JRSSB) or "JC" (Jin and Cai, 2007, JASA). |

`bw.init` |
initial values for bandwidth, optional. If not specified, normal-reference rule will be used. |

`bw` |
bandwidth values. |

`reltol` |
relative tolerance in optim function. |

`n.subsample` |
size of the subsample when estimating bandwidth. |

`check.gam` |
indicator to perform gam.check function on the nonparametric fit. |

`k.gam` |
tuning parameter for mgcv::gam. |

`info` |
indicator to print out fitting information. |

In many multiple testing applications, auxiliary information is widely available and can be useful. Examples include summary statistics from a similar experiment or disease, the lengths of gene coding regions, and minor allele frequencies of SNPs.

`y` is a vector of *m* *z*-values, one for each hypothesis
under test. The *z*-values follow N(0,1) if their corresponding
null hypotheses are true. Other types of test statistics, such
as *t*-statistics and *p*-values, can be transformed to
*z*-values. In practice, if the distribution of *z*-values is
far from N(0,1), recentering and rescaling of the *z*-values
may be necessary.
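
The transformations mentioned above can be sketched in base R. This is a minimal illustration on simulated data, not code from the package: the variable names (`direction`, `tstat`, `z_adj`) are made up for the example, and the median/MAD adjustment is only one possible way to recenter and rescale.

```r
set.seed(1)
m <- 1000

# Two-sided p-values plus the sign of the estimated effect
# (both simulated here purely for illustration).
pval <- runif(m)
direction <- sample(c(-1, 1), m, replace = TRUE)
z_from_p <- direction * qnorm(pval / 2, lower.tail = FALSE)

# t-statistics with `df` degrees of freedom: match tail probabilities,
# working on the log scale so that large |t| does not overflow to Inf.
df <- 10
tstat <- rt(m, df = df)
log_tail <- pt(-abs(tstat), df = df, log.p = TRUE)
z_from_t <- sign(tstat) * qnorm(log_tail, lower.tail = FALSE, log.p = TRUE)

# Optional robust recentering and rescaling (an assumption of this sketch)
# for cases where the bulk of the z-values is far from N(0,1).
z_adj <- (z_from_t - median(z_from_t)) / mad(z_from_t)
```

Either `z_from_p` or `z_from_t` could then play the role of `y`.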

`x` contains auxiliary covariate information. For a single covariate,
`x` should be a vector of length *m*. For multiple covariates,
`x` should be a matrix with *m* rows. The covariates can be either
continuous or ordered.

`pi0.method` specifies the method used to estimate the overall true
null proportion. If the *z*-values are generated from the normal
means model, the "JC" method from Jin and Cai (2007) JASA can be
a good candidate. Otherwise, the right-boundary procedure ("RB",
Liang and Nettleton, 2012, JRSSB) is used.

`bw` contains the bandwidth values for estimating the local alternative density.
If there are *p* covariates, `bw` should be a
vector of *p*+1 positive numerical values. By default, these
bandwidth values are chosen by cross-validation to minimize
a certain error measure. However, finding the optimal bandwidth
values by cross-validation can be computationally intensive,
especially when *p* is not small. If good estimates of bandwidth
values are available, for example, from the analysis of a similar
dataset, the bandwidth values can be specified explicitly to save time.

`reltol` specifies the relative convergence tolerance when choosing the
bandwidth values (`bw`). It will be passed on to
`stats::optim()`. For most analyses, the default value
of 1e-4 provides reasonably good results. A smaller value such as 1e-5
or 1e-6 could be used for further improvement at the cost of more
computation time.
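
As a rough sketch of how these arguments fit together, the snippet below builds a two-covariate `x` and shows two hypothetical calls. The argument names come from the list above, but the simulated data, the values in `bw_known`, and the assumption that `y` can be supplied without `pval` are illustrative only. The calls run only if `CLfdr` is already loaded in the session.

```r
set.seed(2)
m <- 2000

x1 <- runif(m)                         # continuous covariate
x2 <- sample(1:3, m, replace = TRUE)   # ordered covariate
x  <- cbind(x1, x2)                    # matrix with m rows, p = 2 columns

# Simulated z-values: signals are more likely when x1 is large.
is_signal <- rbinom(m, 1, 0.3 * x1)
y <- rnorm(m, mean = 2.5 * is_signal)

p <- ncol(x)
bw_known <- c(0.3, 0.2, 0.5)           # hypothetical bandwidths, length p + 1

if (exists("CLfdr", mode = "function")) {
  # Bandwidths chosen by cross-validation, with a tighter tolerance.
  fit_cv <- CLfdr(x = x, y = y, pi0.method = "RB", reltol = 1e-5)
  # Bandwidths supplied explicitly, skipping the cross-validation search.
  fit_bw <- CLfdr(x = x, y = y, pi0.method = "RB", bw = bw_known)
}
```

Supplying `bw` trades the cross-validation search for speed, so it is only worthwhile when the supplied values come from a comparable analysis.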

`fdr` |
a vector of local FDR estimates. fdr[i] is the posterior probability that the ith null hypothesis is true given all the data. 1-fdr[i] is the posterior probability of being a signal (the corresponding null hypothesis is false). |

`FDR` |
a vector of FDR values (q-values), which can be used to control FDR at a certain level by thresholding the FDR values. |

`pi0` |
a vector of true null probability estimates. This contains the prior probabilities of being null. |

`bw` |
a vector of bandwidths for conditional alternative density estimation |

`fit.gam` |
an object of mgcv::gam |
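
To make the relationship between the local `fdr` values and thresholding on `FDR` concrete, here is a small base R sketch. It uses a common construction, the running average of the sorted local fdr values, which may differ from how the package itself computes `FDR`. The toy `fdr` vector is invented for the example.

```r
# Toy local fdr values standing in for the `fdr` component of the result.
fdr <- c(0.01, 0.80, 0.03, 0.40, 0.10, 0.95)

# Sort from most to least significant, then average the local fdr
# over the k most significant hypotheses, for k = 1, ..., m.
ord <- order(fdr)
FDR_sorted <- cumsum(fdr[ord]) / seq_along(fdr)

# Put the values back in the original order of the hypotheses.
FDR <- numeric(length(fdr))
FDR[ord] <- FDR_sorted

# Control FDR at the 5% level by thresholding.
rejected <- FDR <= 0.05
which(rejected)   # hypotheses 1, 3 and 5
```

With a real fit, the same thresholding step would be applied directly to the returned `FDR` vector.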

Kun Liang, kun.liang@uwaterloo.ca

Liang (2019), Empirical Bayes analysis of RNA sequencing experiments with auxiliary information, to appear in Annals of Applied Statistics

