dhatL2: CD-plot and adjusted deviance test


View source: R/LPBkg.R

Description

Construction of CD-plot and adjusted deviance test. The confidence bands are also adjusted for post-selection inference.

Usage

dhatL2(data, g, M = 6, Mmax = NULL, smooth = TRUE,
  criterion = "AIC", hist.u = TRUE, breaks = 20, ylim = c(0, 2.5),
  range = c(min(data),max(data)), sigma = 2)

Arguments

data

A vector of data. See details.

g

The postulated model from which we want to assess if deviations occur.

M

The desired size of the polynomial basis to be used.

Mmax

The maximum size of the polynomial basis from which M was selected (the default is 20). See details.

smooth

A logical argument indicating whether a denoised solution should be implemented. If FALSE, the full solution is implemented. The default shown in the usage above is TRUE. See details.

criterion

If smooth=TRUE, the criterion with respect to which the denoising process should be implemented. The two possibilities are "AIC" or "BIC". See details.

hist.u

A logical argument indicating if the CD-plot should be displayed or not. The default is TRUE.

breaks

If hist.u=TRUE, the number of breaks of the CD-plot. The default is 20.

ylim

If hist.u=TRUE, the range of the y-axis of the CD-plot.

range

Range of the data/search region considered.

sigma

The significance level (in sigmas) with respect to which the confidence bands should be constructed. See details.

Details

The argument data collects the data whose distribution we want to test for deviations from the postulated model specified in the argument g. In Algeri, 2019, the sample specified under data corresponds to the source-free sample in the background calibration phase and to the physics sample in the signal search phase.

The value of M determines the smoothness of the estimated comparison density, with smaller values of M leading to smoother estimates. The deviance test is used to select the value of M that leads to the most significant deviation from the postulated model. The default value for Mmax is set to 20. Notice that numerical issues may arise for larger values of Mmax.

If smooth=TRUE, the largest coefficient estimates are selected according to either the AIC or the BIC criterion, as described in Algeri, 2019 and Mukhopadhyay, 2017. If Mmax>1 and/or smooth=TRUE, a post-selection Bonferroni correction is automatically applied to both the deviance test p-value and the confidence bands.

The desired level of significance can be expressed as one minus the cdf of a standard normal evaluated at sigma (see Algeri, 2019).
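The relation between sigma and the significance level, and the general form of a Bonferroni adjustment, can be sketched in base R. This is only an illustration: the helper bonferroni_adjust, the p-value 0.001 and the count k = 20 are hypothetical, and the number of comparisons that dhatL2 uses internally is not stated here.

```r
# Significance level implied by a threshold expressed in sigmas:
# one minus the standard normal cdf evaluated at sigma.
sigma <- 2
alpha <- 1 - pnorm(sigma)

# Generic Bonferroni adjustment: multiply the p-value by the number of
# comparisons k and cap the result at 1.
# (Hypothetical helper, not part of LPBkg.)
bonferroni_adjust <- function(p, k) pmin(1, k * p)

p_unadjusted <- 0.001                                   # hypothetical p-value
p_adjusted <- bonferroni_adjust(p_unadjusted, k = 20)   # hypothetical k
```

With sigma = 2, alpha is roughly 0.0228, so an adjusted p-value below that threshold would be read as a deviation at the 2 sigma level.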

Value

Deviance

Value of the deviance test statistic.

Dev_pvalue

Unadjusted p-value of the deviance test.

Dev_adj_pvalue

Post-selection Bonferroni adjusted p-value of the deviance test.

kstar

Number of coefficients selected by the denoising process. If smooth=FALSE, kstar=M.

dhat

Function corresponding to the estimated comparison density in the u domain.

dhat.x

Function corresponding to the estimated comparison density in the x domain.

SE

Function corresponding to the estimated standard errors of the comparison density in the u domain.

LBf1

Function corresponding to the lower bound of the confidence bands in the u domain.

UBf1

Function corresponding to the upper bound of the confidence bands in the u domain.

f

Function corresponding to the estimated density of the data.

u

Vector of values corresponding to the cdf of the model specified in g evaluated at the vector data.

LP

Estimates of the coefficients.

G

Cumulative distribution function of the postulated model specified in the argument g.
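To make the values u and G concrete, the following base R sketch builds the cdf of a postulated density by numerical integration over the search region and evaluates it at the data. It reuses the truncated normal model from the Examples section. The names lower, upper and norm_const are introduced here for illustration, and this is not necessarily how dhatL2 computes these quantities internally.

```r
# Postulated density: N(5, 15) truncated to the search region [10, 20].
lower <- 10
upper <- 20
norm_const <- pnorm(upper, 5, 15) - pnorm(lower, 5, 15)
g <- function(x) dnorm(x, 5, 15) / norm_const

# Postulated cdf, obtained by integrating g from the lower bound of the range.
G <- Vectorize(function(x) integrate(g, lower = lower, upper = x)$value)

# Simulated data restricted to the search region.
set.seed(1)
x <- rnorm(1000, 10, 7)
xx <- x[x >= lower & x <= upper]

# u: the postulated cdf evaluated at the data; values lie in [0, 1].
u <- G(xx)
```

If the postulated model described the data well, a histogram of u would look approximately uniform on [0, 1].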

Author(s)

Sara Algeri

References

S. Algeri, 2019. Detecting new signals under background mismodelling. <arXiv:1906.06615>.

S. Mukhopadhyay, 2017. Large-scale mode identification and data-driven sciences. Electronic Journal of Statistics 11 (2017), no. 1, 215–240.

See Also

Legj, BestM, denoise.

Examples

# load the package that provides BestM and dhatL2
library(LPBkg)

# generate data and keep the observations in the search region [10, 20]
x<-rnorm(1000,10,7)
xx<-x[x>=10 & x<=20]

# create a suitable postulated density for the data:
# a N(5, 15) density truncated to [10, 20], with normalizing constant G
G<-pnorm(20,5,15)-pnorm(10,5,15)
g<-function(x){dnorm(x,5,15)/G}

# choose the best M
Mmax<-20
range<-c(10,20)
m<-BestM(data=xx,g, Mmax,range)

# vectorize the postulated density and evaluate it at the data
g<-Vectorize(g)
u<-g(xx)

# M has to be sufficiently large, otherwise the dhatL2 function will crash.
# So, here we set m equal to 6 as an example
m<-6
comp.density<-dhatL2(data=xx,g, M=m, Mmax=Mmax,smooth=FALSE,criterion="AIC",hist.u=TRUE,breaks=20,
         ylim=c(0,2.5),range=range,sigma=2)

LPBkg documentation built on Oct. 5, 2019, 1:05 a.m.
