fitFunc: A function to fit a parametric distribution to binned data.

Description

This function fits a parametric distribution to binned data. The data are split into groups by ID, and the distribution is fitted to each group.

Usage

fitFunc(ID, hb, bin_min, bin_max, obs_mean, ID_name,
  distribution = "LOGNO", distName = "LNO",
  links = c(muLink = "identity", sigmaLink = "log", nuLink = NULL, tauLink = NULL),
  qFunc = qLOGNO, quantiles = seq(0.006, 0.996, length.out = 1000),
  linksq = c(identity, exp, NULL, NULL),
  con = gamlss.control(c.crit = 0.1, n.cyc = 200, trace = FALSE),
  saveQuants = FALSE, muStart = NULL, sigmaStart = NULL,
  nuStart = NULL, tauStart = NULL, muFix = FALSE,
  sigmaFix = FALSE, nuFix = FALSE, tauFix = FALSE,
  freeParams = c(TRUE, TRUE, FALSE, FALSE),
  smartStart = FALSE, tstamp = as.numeric(Sys.time()))

Arguments

ID

a (non-empty) object containing the group ID for each row. Importantly, ID, hb, bin_min, bin_max, and obs_mean MUST be the same length and be in the SAME order.

hb

a (non-empty) object containing the number of observations in each bin. Importantly, ID, hb, bin_min, bin_max, and obs_mean MUST be the same length and be in the SAME order.

bin_min

a (non-empty) object containing the lower bound of each bin. Currently, this package cannot handle data with open lower bounds. Importantly, ID, hb, bin_min, bin_max, and obs_mean MUST be the same length and be in the SAME order.

bin_max

a (non-empty) object containing the upper bound of each bin. Currently, only the upper-most bin may be open ended. Importantly, ID, hb, bin_min, bin_max, and obs_mean MUST be the same length and be in the SAME order.

obs_mean

a (non-empty) object containing the mean for each group. Importantly, ID, hb, bin_min, bin_max, and obs_mean MUST be the same length and be in the SAME order.

ID_name

a (non-empty) object containing the column name for the ID column.

distribution

a (non-empty) character string naming a gamlss family.

distName

a (non-empty) character object with the name of the distribution.

links

a (non-empty) named character vector giving the link function for each parameter, with the following items: muLink, sigmaLink, nuLink, and tauLink.

qFunc

a (non-empty) gamlss function for calculating quantiles. This should match the distribution named in distribution.

quantiles

a (non-empty) numeric vector of the desired quantiles. These are used in calculating the metrics.

linksq

a (non-empty) vector of functions that undo the link functions. For example, if muLink = log, then the first entry in linksq should be exp. If you are using an identity link function in links, then the corresponding entry in linksq should be identity.

con

an optional list modifying gamlss.control.

saveQuants

an optional logical value indicating whether to save the quantiles.

muStart

an optional numerical value for the starting value of mu.

sigmaStart

an optional numerical value for the starting value of sigma.

nuStart

an optional numerical value for the starting value of nu.

tauStart

an optional numerical value for the starting value of tau.

muFix

a logical value indicating whether mu is fixed or is free to vary during the fitting process.

sigmaFix

a logical value indicating whether sigma is fixed or is free to vary during the fitting process.

nuFix

a logical value indicating whether nu is fixed or is free to vary during the fitting process.

tauFix

a logical value indicating whether tau is fixed or is free to vary during the fitting process.

freeParams

a vector of logical values indicating whether each of the four parameters (mu, sigma, nu, tau) is free (TRUE) or fixed (FALSE).

smartStart

a logical indicating whether a smart starting place should be chosen. This applies only when fitting the GB2 distribution.

tstamp

a time stamp.
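
The requirement that ID, hb, bin_min, bin_max, and obs_mean share one length and one row order can be checked before calling fitFunc. The helper below is a minimal sketch in base R; check_bin_inputs is a hypothetical name, not part of the package, and the toy values are invented for illustration.

```r
# Hypothetical helper (not part of binequality): verify that the five
# parallel inputs are non-empty, equally long, and describe valid bins.
check_bin_inputs <- function(ID, hb, bin_min, bin_max, obs_mean) {
  lens <- lengths(list(ID, hb, bin_min, bin_max, obs_mean))
  if (length(unique(lens)) != 1L) {
    stop("ID, hb, bin_min, bin_max, and obs_mean must have the same length")
  }
  if (lens[1] == 0L) stop("inputs must be non-empty")
  # Only compare bounds where both are present.
  both <- !is.na(bin_min) & !is.na(bin_max)
  if (any(bin_min[both] >= bin_max[both])) {
    stop("each bin needs bin_min < bin_max")
  }
  invisible(TRUE)
}

# Invented toy data: two groups with three closed bins each.
ID       <- c("A", "A", "A", "B", "B", "B")
hb       <- c(10, 25, 5, 8, 30, 12)
bin_min  <- c(0, 10000, 50000, 0, 10000, 50000)
bin_max  <- c(10000, 50000, 100000, 10000, 50000, 100000)
obs_mean <- rep(NA_real_, length(ID))

ok <- check_bin_inputs(ID, hb, bin_min, bin_max, obs_mean)
```

Keeping the five vectors as columns of one data.frame and subsetting rows of that data.frame is one simple way to guarantee that they stay in the same order.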

Details

Fits a GAMLSS to each group and estimates a number of metrics; see Value.
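
The relationship between links, linksq, qFunc, and quantiles described under Arguments can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: qlnorm from base R stands in for qLOGNO (assuming both use the mean and standard deviation of the log as mu and sigma), and the parameter values are copied from the California row of the example output on this page.

```r
# Default quantile grid from the Usage section.
quantiles <- seq(0.006, 0.996, length.out = 1000)

# Link functions and their inverses, held in named lists.
links  <- list(mu = identity, sigma = log)
linksq <- list(mu = identity, sigma = exp)

# Natural-scale parameters (California row of the example output).
params <- c(mu = 10.80110, sigma = 0.9471489)

# Move to the link scale, then undo the links with linksq.
eta  <- mapply(function(f, p) f(p), links, params)
back <- mapply(function(f, e) f(e), linksq, eta)

# Assumption: qlnorm() is used here in place of gamlss.dist's qLOGNO.
q <- qlnorm(quantiles, meanlog = back[["mu"]], sdlog = back[["sigma"]])
```

Each entry of linksq must be the inverse of the matching entry of links; otherwise the values passed to the quantile function are on the wrong scale.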

Value

returns a list with 'datOut', a data.frame with the IDs, observed mean, distribution, estimated mean, variance, coefficient of variation, cv squared, gini, theil, MLD, aic, bic, the results of a convergence test, log likelihood, number of parameters, median, and std. deviation; 'timeStamp', a time stamp; 'parameters', the estimated parameters; and 'quantiles', the quantile estimates (if saveQuants == TRUE).
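
For readers unfamiliar with the inequality metrics listed in datOut, the sketch below shows textbook formulas for the coefficient of variation, Gini, Theil, and MLD applied to a vector of equally weighted values such as a quantile grid. The function name inequality_metrics and the lognormal parameters are hypothetical, and the package may compute these quantities differently.

```r
# Hypothetical helper: standard inequality measures for equally weighted values.
inequality_metrics <- function(x) {
  x <- sort(x)
  n <- length(x)
  m <- mean(x)
  list(
    mean  = m,
    cv    = sd(x) / m,                                         # coefficient of variation
    gini  = sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x)),  # Gini coefficient
    theil = mean((x / m) * log(x / m)),                        # Theil index
    mld   = mean(log(m / x))                                   # mean log deviation
  )
}

# Illustrative input: lognormal quantiles on the default grid
# (parameter values are invented for this sketch).
x <- qlnorm(seq(0.006, 0.996, length.out = 1000), meanlog = 10.8, sdlog = 0.95)
metrics <- inequality_metrics(x)
```

All four measures are zero when every value is identical and grow as the values become more dispersed.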

References

FIXME - references

See Also

gamlss

Examples

library(binequality)
data(state_bins)

# Keep only the rows for Texas and California.
use_states <- which(state_bins[, 'State'] %in% c('Texas', 'California'))

# Extract the parallel input vectors; all share the same row order.
ID <- state_bins[use_states, 'State']
hb <- state_bins[use_states, 'hb']
bmin <- state_bins[use_states, 'bin_min']
bmax <- state_bins[use_states, 'bin_max']
# No observed means are supplied in this example.
omu <- rep(NA, length(use_states))
fitFunc(ID = ID, hb = hb, bin_min = bmin, bin_max = bmax, obs_mean = omu, ID_name = 'State')

Example output

Loading required package: gamlss
Loading required package: splines
Loading required package: gamlss.data

Attaching package: 'gamlss.data'

The following object is masked from 'package:datasets':

    sleep

Loading required package: gamlss.dist
Loading required package: MASS
Loading required package: nlme
Loading required package: parallel
 **********   GAMLSS Version 5.2-0  ********** 
For more on GAMLSS look at https://www.gamlss.com/
Type gamlssNews() to see new features/changes/bug fixes.

Loading required package: gamlss.cens
Loading required package: survival
Time difference of 0.2354329 secs
for LNO fit across 2 distributions 
 
$datOut
       State obsMean distribution  estMean        var       cv   cv_sqr
1 California      NA          LNO 74418.87 5993925739 1.040334 1.082294
2      Texas      NA          LNO 58612.01 3633652195 1.028454 1.057719
       gini     theil       MLD       SDL     aic     bic didConverge
1 0.4832827 0.3962717 0.4113666 0.9138386 1654408 1654409        TRUE
2 0.4794992 0.3893031 0.4038240 0.9052671 1085474 1085476        TRUE
  logLikelihood nparams   median       sd
1     -827201.9       2 49191.55 77420.45
2     -542735.2       2 39037.36 60279.78

$timeStamp
[1] 1616546582

$parameters
                 mu     sigma nu tau
California 10.80110 0.9471489 NA  NA
Texas      10.56992 0.9382649 NA  NA

$quantiles
NULL
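
As a quick sanity check on the printed output, the aic column appears consistent with the usual definition AIC = -2 * logLikelihood + 2 * nparams. The sketch below re-enters the printed (rounded) values by hand in a stand-in data.frame; it does not call fitFunc and says nothing about how the package computes the column internally.

```r
# Stand-in for the relevant datOut columns, copied from the printed output.
out <- data.frame(
  State         = c("California", "Texas"),
  logLikelihood = c(-827201.9, -542735.2),
  nparams       = c(2, 2),
  aic           = c(1654408, 1085474)
)

# Recompute AIC from the log likelihood and the number of parameters.
aic_check <- -2 * out$logLikelihood + 2 * out$nparams

# Compare with the printed column after rounding to whole numbers.
matches <- round(aic_check) == out$aic
```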

binequality documentation built on May 2, 2019, 9:58 a.m.