This function returns point estimates of the probability density function and the hazard rate ratio function.
HRR_pt_est(pt_int, cdf_sample, kernel = "gaussian",
           hazard_bandwidth = NULL, knn = NULL)
pt_int |
a vector of points at which the estimates are computed. |
cdf_sample |
a sorted vector of sample values whose density and hazard rate ratio are to be estimated. |
kernel |
a character string giving the smoothing kernel to be used.
This must partially match one of " |
hazard_bandwidth |
the smoothing bandwidth to be used. |
knn |
number of neighbor points to be considered in smoothing for the
" |
The hazard rate ratio function is defined as:
HRR = HR(unif)/HR(est)
Here, HR(unif) is the hazard rate function of the uniform distribution, while HR(est) is the hazard rate function of the estimated density function:
HR(est) = f(est)/(1-F(est))
where f(est) is the estimated probability density function and F(est) is the estimated cumulative distribution function.
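The definition can be checked on a distribution whose density and CDF are known exactly. The sketch below is only an illustration: it assumes the uniform reference is Uniform(0, 1), whose hazard rate is 1/(1 - x), and the helper name `hrr_from_density` is hypothetical, not part of the package.

```r
# Hypothetical helper: HRR at points x, given density values and CDF values.
# Assumes the reference distribution is Uniform(0, 1), with hazard 1 / (1 - x).
hrr_from_density <- function(x, dens, cdf) {
  hr_unif <- 1 / (1 - x)        # hazard rate of Uniform(0, 1)
  hr_est  <- dens / (1 - cdf)   # hazard rate implied by the density and CDF
  hr_unif / hr_est
}

x <- seq(0.1, 0.9, by = 0.1)

# Uniform(0, 1) against itself: the two hazard rates coincide.
hrr_uniform <- hrr_from_density(x, dens = dunif(x), cdf = punif(x))

# Beta(2, 5): the true density and CDF stand in for f(est) and F(est).
hrr_beta <- hrr_from_density(x, dens = dbeta(x, 2, 5), cdf = pbeta(x, 2, 5))
```

When the sample really is uniform the ratio equals 1 at every point, so departures from 1 indicate where the estimated hazard differs from the uniform hazard.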
f(est) and F(est) come from the local quadratic polynomial estimation of the cumulative distribution function: f(est) is the coefficient of the linear term, while F(est) is the constant term.
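A minimal sketch of this idea, under stated assumptions: the empirical CDF is regressed on a quadratic in (x - x0) with kernel weights, and the constant and linear coefficients are read off as F(est) and f(est). The function `local_quad_fit` is hypothetical and uses `lm()` with Gaussian weights; it does not include the boundary reflection or rank handling described in this section and is not the package's implementation.

```r
# Hypothetical sketch of a local quadratic fit to the empirical CDF at x0.
# Not the package's code: no boundary reflection, no rank fallback.
local_quad_fit <- function(x0, x, h) {
  x <- sort(x)
  ecdf_vals <- seq_along(x) / length(x)   # empirical CDF at the sorted sample
  d <- x - x0                             # offsets from the estimation point
  w <- dnorm(d / h)                       # Gaussian kernel weights
  fit <- lm(ecdf_vals ~ d + I(d^2), weights = w)
  coefs <- unname(coef(fit))
  c(F_hat = coefs[1], f_hat = coefs[2])   # constant term, linear coefficient
}

set.seed(1)
x <- rbeta(5000, 2, 5)
est <- local_quad_fit(x0 = 0.3, x = x, h = 0.1)

# Reference values from the true Beta(2, 5) distribution.
truth <- c(F_hat = pbeta(0.3, 2, 5), f_hat = dbeta(0.3, 2, 5))
```

Comparing `est` with `truth` shows how the constant term tracks the CDF and the linear coefficient tracks the density at the chosen point.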
When kernel is "rectangular" or "triangular" and hazard_bandwidth is too small, the number of observations used at an estimation point may not be enough to solve the quadratic polynomial equations. In this case, if the design matrix has rank 2, the function fits a linear polynomial instead. If the design matrix has rank 1, f(est) is the percentage of points that fall in the corresponding bin and F(est) is the mean of the points in the corresponding bin. If the design matrix has rank 0, f(est) = 0 and F(est) is missing.
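To see how the rank depends on the bandwidth, the sketch below builds the quadratic design matrix from only the observations inside the window and reports its rank with `qr()`. The helper `local_design_rank` and the simple window rule `abs(x - x0) < h` are assumptions made for illustration; the package may define the kernel support differently.

```r
# Hypothetical helper: rank of the local quadratic design matrix at x0 when
# only observations within distance h of x0 receive non-zero weight.
local_design_rank <- function(x0, x, h) {
  d <- x[abs(x - x0) < h] - x0       # offsets of observations inside the window
  if (length(d) == 0) return(0L)     # empty window: rank 0
  X <- cbind(1, d, d^2)              # constant, linear and quadratic columns
  qr(X)$rank
}

x <- c(0.10, 0.42, 0.47, 0.55, 0.90)

rank_wide   <- local_design_rank(0.50, x, h = 0.10)  # three observations in window
rank_medium <- local_design_rank(0.50, x, h = 0.06)  # two observations in window
rank_narrow <- local_design_rank(0.50, x, h = 0.04)  # one observation in window
rank_empty  <- local_design_rank(0.50, x, h = 0.01)  # no observations in window
```

Each drop in rank corresponds to one of the fallback cases above: quadratic fit, linear fit, bin-based values, and no estimate.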
The domain of cdf_sample is (0,1), which is a bounded interval. To reduce the bias close to the boundary points, reflection is used: all observations are reflected about the points 0 and 1, and the local quadratic polynomial density estimation is done on the extended cdf_sample.
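The reflection step can be sketched as follows. The helper `reflect_unit_interval` is hypothetical; it only shows how a sample on (0, 1) is extended to (-1, 2), and it does not show how the package rescales the estimates afterwards, which the documentation does not state.

```r
# Hypothetical helper: reflect a sample on (0, 1) about the boundaries 0 and 1.
reflect_unit_interval <- function(x) {
  c(-x, x, 2 - x)   # mirror image about 0, original sample, mirror image about 1
}

set.seed(1)
cdf_sample <- sort(rbeta(1000, 2, 5))
extended   <- sort(reflect_unit_interval(cdf_sample))

# The extended sample is three times as long and lies in (-1, 2).
range(extended)
```

Points just outside 0 and 1 now have neighbours on both sides, so a local fit near a boundary no longer sees a one-sided window.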
HRR_pt_est works by solving systems of linear equations. With the "gaussian" kernel, the design matrix always uses all the observations, even those that are far from the estimation point and make a negligible contribution. Solving such large linear systems is computationally expensive, so "gaussian" is not recommended for efficiency reasons when length(cdf_sample) is huge. The function HRR_sbsp_est is designed for such cases.
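The difference in workload can be illustrated by counting how many observations receive a non-zero weight at a single estimation point. The weight formulas below are a simplified assumption (bandwidth used directly as the scale of each kernel) and may not match the package's kernel scaling.

```r
set.seed(1)
x  <- sort(runif(10000))
x0 <- 0.5                  # estimation point
h  <- 0.05                 # bandwidth, used here directly as the kernel scale
u  <- (x - x0) / h

w_gauss <- dnorm(u)              # Gaussian: positive for every observation
w_tri   <- pmax(1 - abs(u), 0)   # triangular: zero outside (x0 - h, x0 + h)

n_gauss <- sum(w_gauss > 0)      # observations entering the Gaussian fit
n_tri   <- sum(w_tri > 0)        # observations entering the triangular fit
```

With a compactly supported kernel only the observations inside the window enter each local system, whereas the Gaussian kernel keeps every observation, most of them with weights close to zero.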
This function returns a list with components:
fhat |
A function performing linear interpolation of the smoothed probability density estimates at the given points. |
HRR |
A function performing linear interpolation of the smoothed hazard rate ratio point estimates. |
If package Matrix is installed, function solve is used for solving the linear equations. If not, function qr.solve is applied.
Zhicong Zhao
HRR_sbsp_est for using this function via subsampling.
temp <- HRR_pt_est(pt_int = seq(0,1,0.1),
                   cdf_sample = sort(rbeta(10000,2,5)),
                   kernel = "triangular",
                   hazard_bandwidth = 0.1)
## plot ##
plot(temp$fhat,col = "blue",xlab = NA,ylab = NA)
points(seq(0,1,0.1),dbeta(seq(0,1,0.1),2,5),type = "l",col = "red")
legend("top",legend = c("estimated density","population density"),
       lty = 1, col = c("blue","red"))