g2l.proc: Procedures for global and local inference.
In LPRelevance: Relevance-Integrated Statistical Inference Engine

g2l.proc

R Documentation

Procedures for global and local inference.

Description

This function performs customized fdr analyses tailored to each individual case.

Usage

g2l.proc(X, z, X.target = NULL, z.target = NULL, m = c(4, 6), alpha = 0.1,
	nbag = NULL, nsample = length(z), lp.reg.method = "lm",
	null.scale = "QQ", approx.method = "direct", ngrid = 2000,
	centering = TRUE, coef.smooth = "BIC", fdr.method = "locfdr",
	plot = TRUE, rel.null = "custom", locfdr.df = 10,
	fdr.th.fixed = NULL, parallel = FALSE, ...)

Arguments

`X`	An n-by-d matrix of covariate values.
`z`	A length n vector containing observations of z values.
`X.target`	A k-by-d matrix providing k sets of covariates for target cases to investigate. Set to NULL to investigate all cases and provide global inference results.
`z.target`	A vector of length k, providing the target z values to investigate.
`m`	An ordered pair. The first number indicates how many LP-nonparametric basis functions to construct for each X, the second how many to construct for z. Default: `m=c(4,6)`.
`alpha`	Confidence level for determining signals.
`nbag`	Number of bags of parametric bootstrapped samples to use for each target case. For each bag, a new set of relevance samples is generated for analysis, and the resulting fdr curves are aggregated by taking their mean values. Set to `NULL` to disable.
`nsample`	Number of relevance samples generated for each case. The default is the length of the input `z`.
`lp.reg.method`	Method for estimating the relevance function and its conditional LP-Fourier coefficients. We currently support three options: lm (inbuilt with subset selection), glmnet, and knn.
`null.scale`	Method of estimating the null standard deviation from the LASER samples. Available options: "IQR", "QQ" and "locfdr".
`approx.method`	Method used to approximate the customized fdr curve, default is "direct". When set to "indirect", the customized fdr is computed by modifying the pooled fdr using the relevant density function.
`ngrid`	Number of grid points to use for computing the customized fdr curve.
`centering`	Whether to perform regression-adjustment to center the data, default is TRUE.
`coef.smooth`	Specifies the method to use for LP coefficient smoothing (AIC or BIC). Uses BIC by default.
`fdr.method`	Method for controlling false discoveries (either "locfdr" or "BH"), default choice is "locfdr".
`plot`	Whether to include plots in the results, default is `TRUE`.
`rel.null`	How the relevant null changes with x: "custom" denotes we allow it to vary with x, and "th" denotes fixed.
`locfdr.df`	Degrees of freedom to use for `locfdr()`
`fdr.th.fixed`	Use fixed fdr threshold for finding signals. Default set to `NULL`, which finds different thresholds for different cases.
`parallel`	Use parallel computing for obtaining the relevance samples, mainly useful for very large `nsample`, default is FALSE.
`...`	Extra parameters to pass to other functions. Currently only supports the arguments for `knn()`.
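
The `null.scale` options above name ways of estimating a null standard deviation. The sketch below illustrates the general idea behind IQR-based and QQ-based scale estimates on simulated z values. It is not the package's internal code: the function names `null_sd_iqr` and `null_sd_qq`, the simulated data, and the choice of the central 50% of quantiles are all assumptions made for illustration, and the package's "QQ" option may be implemented differently.

```r
# Illustrative only: robust estimates of a null standard deviation.
# Function names and data are hypothetical, not part of LPRelevance.
set.seed(1)

# Mostly-null z values (sd = 1) plus a small group of shifted signals.
z <- c(rnorm(950, mean = 0, sd = 1), rnorm(50, mean = 4, sd = 1))

# IQR-based scale: for a normal distribution, IQR = 2 * qnorm(0.75) * sd.
null_sd_iqr <- function(z) {
  IQR(z) / (2 * qnorm(0.75))
}

# QQ-based scale: slope of the central sample quantiles regressed on
# the matching standard normal quantiles.
null_sd_qq <- function(z, central = c(0.25, 0.75)) {
  p <- seq(central[1], central[2], length.out = 101)
  fit <- lm(quantile(z, probs = p) ~ qnorm(p))
  unname(coef(fit)[2])
}

sd_iqr <- null_sd_iqr(z)
sd_qq  <- null_sd_qq(z)
sd_raw <- sd(z)  # plain standard deviation, inflated by the signals

print(c(iqr = sd_iqr, qq = sd_qq, raw = sd_raw))
```

Both robust estimates rely only on the central part of the data, so they are less affected by the signals in the tail than the plain standard deviation is.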

Value

A list containing the following items:

`macro`	Available when `X.target` is set to `NULL`, contains the following items:
`$result`	A list of global inference results:
`$X`	Matrix of covariates, same as input `X`.
`$z`	Vector of observations, same as input `z`.
`$probnull`	A vector of length n, indicating how likely each observed z is to belong to the local null.
`$signal`	A binary vector of length n, discoveries are indicated by 1.
`$plots`	A list of plots for global inference:
`$signal_x`	A plot of signals discovered, marked in red.
`$dps_xz`	A scatterplot of z on x, colored based on the discovery propensity scores, only available when `fdr.method = "locfdr"`.
`$dps_x`	A scatterplot of discovery propensity scores on x, only available when `fdr.method = "locfdr"`.
`micro`	Available when `X.target` is provided, contains the following items:
`$result`	Customized estimates for null probabilities for target X and z
`$result$signal`	A binary vector of length k, discoveries in the target cases are indicated by 1.
`$global`	Pooled global estimates for null probabilities for target X and z
`$plots`	Customized fdr plots for the target cases.
`m.lp`	Same as input `m`
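
As a rough guide to working with the returned list, the sketch below builds a mock object with the layout described above and extracts the discoveries from it. The values are made up, and it assumes that `$X`, `$z`, `$probnull` and `$signal` sit inside `$macro$result`, as the listing suggests; a real object would come from a call such as `g2l.proc(X, z)`.

```r
# Mock object mirroring the documented layout of a global (macro) result.
# All values are invented for illustration only.
res <- list(
  macro = list(
    result = list(
      X = matrix(c(1, 2, 3, 4), ncol = 1),
      z = c(0.3, 3.9, -0.5, 4.2),
      probnull = c(0.95, 0.04, 0.90, 0.02),
      signal = c(0, 1, 0, 1)
    )
  ),
  m.lp = c(4, 6)
)

out <- res$macro$result

# Indices of the discoveries (signal == 1).
hits <- which(out$signal == 1)

# Collect the discoveries and their null probabilities in one table.
discoveries <- data.frame(
  x = out$X[hits, 1],
  z = out$z[hits],
  probnull = out$probnull[hits]
)
print(discoveries)
```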

Author(s)

Subhadeep Mukhopadhyay, Kaijun Wang

Maintainer: Kaijun Wang <kaijunwang.19@gmail.com>

References

Mukhopadhyay, S., and Wang, K. (2021) "On The Problem of Relevance in Statistical Inference". <arXiv:2004.09588>

Examples


library(LPRelevance)

## Macro-inference using locfdr and LASER
data(funnel)
X <- funnel$x
z <- funnel$z
g2l_macro <- g2l.proc(X, z)
g2l_macro$macro$plots

## Micro-inference for the DTI data: case A with x = (18, 55) and z = 3.95
data(data.dti)
X <- cbind(data.dti$coordx, data.dti$coordy)
z <- data.dti$z
g2l_x <- g2l.proc(X, z, X.target = c(18, 55), z.target = 3.95, nsample = 3000)
## Restrict the customized fdr plot for the target case to 0 <= z <= 4
g2l_x$micro$plots$fdr.1 + ggplot2::coord_cartesian(xlim = c(0, 4))
## Fourth element of the customized result for the target case
g2l_x$micro$result[4]

LPRelevance documentation built on May 18, 2022, 9:05 a.m.