View source: R/Main_Functions.R
This function implements the marginal generalized gamma method for outlier detection among replicated data. It first fits each replicate (X_1 and X_2) to a generalized gamma distribution by maximum likelihood estimation (MLE), using the parameterization in the R package flexsurv, given by Kotz and Johnson (1970). It also fits the absolute difference Delta (D = X_1 - X_2) between the two replicates to an Asymmetric Laplace Distribution by MLE. It then determines whether Delta's Laplace Distribution is asymmetric or symmetric and whether it has a significant displacement parameter. For the points outside a central band (defined using the Laplace parameters fitted to Delta), we use the generalized gamma parameters fitted to the entire X_1 and X_2 vectors (see the paper in the citation) to determine the marginal probability that the relative difference Z = sqrt(2) * abs(X_1 - X_2) / (X_1 + X_2) takes a value greater than its observed value. We obtain this probability by numerically integrating the marginal PDF for Z with the function adaptIntegrate in the package cubature. Points inside the central band are assigned the probability 1.
q_gg_marg_DZ(X_1, X_2, p_theta = 0.05, p_kappa = 0.05, k = 1,
  n_cores = detectCores() - 1)
X_1
The first (independent) replicate of the data. A vector of positive real numbers.
X_2
The second (independent) replicate of the data. A vector of positive real numbers.
p_theta
We use the (1-p_theta)*100% two-sided confidence interval for theta in Delta = X_1 - X_2 + theta to determine whether there is a significant translation of the absolute difference Delta. If this interval contains 0, then we set theta = 0. We set p_theta = 0.05 by default.
p_kappa
We use the (1-p_kappa)*100% two-sided confidence interval for log(kappa), where kappa is the asymmetry parameter of the Asymmetric Laplace Distribution to which we fit Delta. If this interval contains 0, then we set kappa = 1 (that is, log(kappa) = 0) and use a Symmetric Laplace Distribution for Delta. We set p_kappa = 0.05 by default.
k
The number of standard deviations about the center (mean) of the Asymmetric Laplace Distribution for Delta that we use to define the "central band." We set k = 1 by default.
n_cores
This function works by numerically integrating the joint PDF for each data point. To speed this up, the integrations are run in parallel (using the package parallel), which requires specifying the number of cores (n_cores) to use. By default, we use all but one core on the machine, leaving the remaining one free for other tasks.
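The default n_cores = detectCores() - 1 can be made slightly more defensive. The following is a minimal sketch, not part of the package: parallel::detectCores() may return NA when the core count is unknown, and on a single-core machine the default expression evaluates to 0. The helper name safe_n_cores and its reserve argument are hypothetical.

```r
library(parallel)

# Hypothetical helper (not part of the package): pick a core count that is
# always at least 1, even if detectCores() returns NA or 1.
safe_n_cores <- function(reserve = 1L) {
  n <- detectCores()
  if (is.na(n)) {
    return(1L)
  }
  max(1L, as.integer(n) - as.integer(reserve))
}

n_cores <- safe_n_cores()
# The result could then be passed on explicitly, for example:
# q <- q_gg_marg_DZ(X_1, X_2, n_cores = n_cores)
```

Passing the value explicitly also makes it easy to cap the core count on shared machines.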
A numeric vector of the same length as the input vectors X_1 and X_2. Using D = X_1 - X_2, Z = sqrt(2) * abs(X_1 - X_2) / (X_1 + X_2), and (d,z) for each (X_1,X_2) data point, we get the marginal probability q = P(z <= Z <= sqrt(2)) if (d,z) is not in the middle band, and the assigned value 1 if it is in the middle band.
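As a small illustration of the D and Z definitions above, the sketch below computes both in base R for a few made-up replicate values (the numbers are for illustration only and are not from the package data). It also checks why the probability is taken up to sqrt(2): for positive inputs, abs(X_1 - X_2) / (X_1 + X_2) is always below 1, so Z stays below sqrt(2).

```r
# Toy positive replicates (made-up values, for illustration only)
X_1 <- c(1.0, 2.5, 4.0, 10.0)
X_2 <- c(1.2, 2.5, 3.0, 2.0)

# Absolute (signed) difference and scaled relative difference
D <- X_1 - X_2
Z <- sqrt(2) * abs(X_1 - X_2) / (X_1 + X_2)

# For positive inputs, Z lies in [0, sqrt(2))
stopifnot(all(Z >= 0), all(Z < sqrt(2)))

print(data.frame(X_1 = X_1, X_2 = X_2, D = D, Z = Z))
```

The function itself computes D and Z internally, so this step is only needed for inspecting the data beforehand.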
# Assume X_1 and X_2 are positive data vectors of the same length. These are the replicates
data(Sim_GG)
df <- data.frame(X_1=Sim_GG$X_1, X_2=Sim_GG$X_2)
# The function q_gg_marg_DZ calculates D and Z for us
# df$q_gg_m <- q_gg_marg_DZ(df$X_1, df$X_2) #Only run this on a cluster!