fled: Fiber length determination

fled    R Documentation

Fiber length determination

Description

This function estimates the length distributions of fibers (tracheids) and fines (e.g. ray parenchyma cells and other small particles) in standing trees from increment cores (cylindrical wood samples). Data from an increment core contain uncut fibers, fibers cut once or twice by the borer, and non-fiber cells, the so-called 'fines'. A censored version of a mixture of the fine and fiber length distributions is therefore fitted to the data. Two choices are offered for the underlying density of the true, unobserved uncut lengths of the fines and fibers in the increment core: generalized gamma and log normal. The parameters of the mixture model are estimated by maximizing the log likelihood. The routine calls optim() or nlm() for the optimization, optionally with a supplied gradient function. Some parameters of the generalized gamma mixture model can be fixed at given values rather than estimated.

Usage

 fled(data=stop("No data supplied"), data.type="ofa", r=2.5,
      model="ggamma", method="ML", parStart=NULL, fixed=NULL,
      optimizer=c("optim","L-BFGS-B","grad"), lower=-Inf, upper=Inf,
      cluster=1, ...)

Arguments

data

A numeric vector of cell lengths from increment cores.

data.type

type of data supplied: "ofa" (default) for data measured by an optical fiber analyzer, or "microscopy" for data measured by microscopy (only the lengths of uncut fibers in the core).

r

radius of the increment core (default 2.5).

model

if model="ggamma" then the distributions of the true lengths of the fibers (fines) that at least partially appear in the increment core are assumed to follow generalized gamma distributions; if model="lognorm" then log normal distributions are assumed on those fiber (fine) lengths.

method

either "ML" (default) for the maximum likelihood method or "SEM" for a stochastic version of the EM algorithm. Note "SEM" works only with the log normal model and increment core data measured by an optical fiber analyzer ("ofa").

parStart

numerical vector of starting values of parameters (or fixed values for ggamma model when !is.null(fixed)). The parameter values of the generalized gamma model should be given in the following order,

(ε, b_{fines},d_{fines},k_{fines},b_{fibers},d_{fibers},k_{fibers}).

The parameter values of the log normal model are in the order

(ε, μ_{fines}, σ_{fines}, μ_{fibers}, σ_{fibers}) (see Details below).

fixed

TRUE/FALSE vector of seven components indicating which parameters of the ggamma model to fix. Fixed parameters are held at the values given in the argument parStart. For non-fixed parameters, positive values in parStart are used as starting values for the optimizer, while negative or zero values indicate that no starting value is supplied. Note that fixing parameter values currently works only with 'optim'.
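As an illustration of how fixed and parStart line up, the following sketch builds the two vectors for the seven ggamma parameters. The choice of which parameters to fix and all numeric values are hypothetical, and the label vector par_names is only a reading aid, not something fled requires.

```r
# Order of the seven ggamma parameters, as listed under parStart.
par_names <- c("eps", "b.fines", "d.fines", "k.fines",
               "b.fibers", "d.fibers", "k.fibers")

# Hypothetical choice: hold d.fines and d.fibers fixed, estimate the rest.
fixed <- par_names %in% c("d.fines", "d.fibers")

# Hypothetical values. Where fixed is TRUE the entry is the fixed value;
# elsewhere a positive entry is a starting value and 0 means
# "no starting value supplied".
parStart <- c(0.3, 0, 1, 0, 2.5, 1, 0)

# Side-by-side view to check the alignment before fitting.
print(data.frame(parameter = par_names, fixed = fixed, parStart = parStart))

# The vectors would then be passed on (not run here; requires fiberLD):
# fled(data = cell.length, model = "ggamma", r = 6,
#      parStart = parStart, fixed = fixed)
```

Fixing both d parameters at 1, as in this sketch, reduces each generalized gamma component to an ordinary gamma distribution.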

optimizer

numerical optimization method used to minimize 'minus' the log likelihood function of the observed data: 'optim', 'nlm' or 'nlm.fd' (nlm with a finite-difference approximation of the derivatives). If optimizer[1]=="optim", the second element specifies the numerical method to be used in 'optim' ("Nelder-Mead", "BFGS", "CG", "L-BFGS-B" or "SANN"). The third element of optimizer indicates whether a finite-difference approximation ('fd') or the analytical gradient ('grad') should be used for the 'BFGS', 'CG' and 'L-BFGS-B' methods. The default is optimizer=c("optim", "L-BFGS-B","grad").
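To make the optimizer settings concrete, here is a minimal base R sketch of the general technique: minimizing 'minus' a log likelihood with optim, the "L-BFGS-B" method, an analytical gradient and box bounds. It uses a plain, uncensored log normal sample rather than the censored mixture fitted by fled, works on the original parameter scale, and all names (negloglik, neggrad, the simulated data) are assumptions made for illustration only.

```r
set.seed(1)
# Simulated lengths, purely illustrative (not increment core data).
y <- rlnorm(500, meanlog = 0.8, sdlog = 0.3)

# Minus log likelihood of an uncensored log normal sample; par = c(mu, sigma).
negloglik <- function(par, y) {
  -sum(dlnorm(y, meanlog = par[1], sdlog = par[2], log = TRUE))
}

# Analytical gradient of negloglik with respect to (mu, sigma).
neggrad <- function(par, y) {
  mu <- par[1]
  sigma <- par[2]
  z <- log(y) - mu
  c(-sum(z) / sigma^2,
    length(y) / sigma - sum(z^2) / sigma^3)
}

# Box-constrained quasi-Newton fit; the lower bound keeps sigma positive.
fit <- optim(par = c(0, 1), fn = negloglik, gr = neggrad, y = y,
             method = "L-BFGS-B",
             lower = c(-Inf, 1e-3), upper = c(Inf, Inf))

# Closed-form maximum likelihood estimates for comparison.
mle <- c(mean(log(y)), sqrt(mean((log(y) - mean(log(y)))^2)))
print(rbind(optim = fit$par, closed.form = mle))
```

Omitting gr makes optim fall back on a finite-difference approximation of the gradient, which is the kind of trade-off the 'fd' and 'grad' options refer to.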

lower, upper

Bounds on the parameters for the "L-BFGS-B" method. The order of the bounds values has to be the same as the order of the parStart. Note that these bounds are on the original rather than transformed scale of the parameters used for optimization.

cluster

either '0' for no parallel computing to be used; or '1' (default) for one less than the number of cores; or user-supplied cluster on which to do estimation. cluster can only be used with OFA analyzed data (a cluster here can be some cores of a single machine).

...

Further arguments to be passed to optim.

Details

The probability density function of the three-parameter generalized gamma distribution proposed by Stacy (1962) can be written as

f(y;b,d,k) = d * b^(-d*k) * y^(d*k-1) * exp(-(y/b)^d)/gamma(k),

where b > 0, d > 0, k > 0, and y > 0.
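The density above can be transcribed directly into base R. This is only a sketch for checking the formula: dggamma_sketch is a hypothetical helper name, not a function exported by fiberLD, and the parameter values are arbitrary.

```r
# Generalized gamma density (Stacy, 1962), written from the formula above.
# Hypothetical helper, not part of fiberLD.
dggamma_sketch <- function(y, b, d, k) {
  stopifnot(b > 0, d > 0, k > 0)
  dens <- numeric(length(y))          # density is 0 for y <= 0
  pos <- y > 0
  dens[pos] <- d * b^(-d * k) * y[pos]^(d * k - 1) *
    exp(-(y[pos] / b)^d) / gamma(k)
  dens
}

# Check 1: the density integrates to one (arbitrary parameter values).
total <- integrate(dggamma_sketch, 0, Inf, b = 2, d = 1.5, k = 2)$value

# Check 2: with d = 1 it reduces to a gamma density with shape k and scale b.
y <- c(0.5, 1, 3)
same_as_gamma <- all.equal(dggamma_sketch(y, b = 2, d = 1, k = 3),
                           dgamma(y, shape = 3, scale = 2))
print(total)
print(same_as_gamma)
```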

The probability density function of the log normal distribution can be written as

f(y;mu, sigma) = exp(-(log(y)-mu)^2/(2*sigma^2))/(y * sigma * sqrt(2*pi)),

where σ > 0 and y > 0.
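The log normal formula can likewise be written out and compared with R's built-in dlnorm, where mu corresponds to meanlog and sigma to sdlog. The helper name dlognorm_sketch and the test values are assumptions for illustration only.

```r
# Log normal density, written from the formula above.
# Hypothetical helper, not part of fiberLD.
dlognorm_sketch <- function(y, mu, sigma) {
  stopifnot(sigma > 0)
  dens <- numeric(length(y))          # density is 0 for y <= 0
  pos <- y > 0
  dens[pos] <- exp(-(log(y[pos]) - mu)^2 / (2 * sigma^2)) /
    (y[pos] * sigma * sqrt(2 * pi))
  dens
}

# Agreement with the built-in density at a few arbitrary points.
y <- c(0.1, 0.5, 1, 2.5)
same_as_dlnorm <- all.equal(dlognorm_sketch(y, mu = 0.3, sigma = 0.4),
                            dlnorm(y, meanlog = 0.3, sdlog = 0.4))
print(same_as_dlnorm)
```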

Value

cov.par

approximate covariance matrix of the estimated parameters.

cov.logpar

approximate covariance matrix of the transformed estimated parameters.

loglik

the log likelihood value corresponding to the estimated parameters.

model

model used

mu.fibers

estimated mean value of the fiber lengths in the standing tree.

mu.fines

estimated mean value of the fine lengths in the standing tree.

mu.cell

estimated mean value of the cell lengths in the standing tree.

prop.fines

estimated proportion of fines in the standing tree.

par

the estimated parameters on the original scale.

logpar

the estimated values of the transformed parameters.

termcode

an integer indicating why the optimization process terminated (see optim).

conv

indicates why the optimization algorithm terminated.

iterations

number of iterations of the optimization method taken to get convergence.

fixed

TRUE/FALSE vector denoting if a parameter of ggamma model is fixed or not.

n

number of observations

Warning

Fixing parameters of the generalized gamma model may make the results of the optim method unstable.

Note

The idea and some of the code for fixing parameters with optim() is due to Barry Rowlingson, October 2011.

Author(s)

Sara Sjöstedt de Luna, Konrad Abramowicz, Natalya Pya Arnqvist

References

Svensson, I., Sjöstedt de Luna, S., Bondesson, L. (2006). Estimation of wood fibre length distributions from censored data through an EM algorithm. Scandinavian Journal of Statistics, 33(3), 503–522.

Chen, Z. Q., Abramowicz, K., Raczkowski, R., Ganea, S., Wu, H. X., Lundqvist, S. O., Mörling, T., Sjöstedt de Luna, S., Gil, M. R. G., Mellerowicz, E. J. (2016). Method for accurate fiber length determination from increment cores for large-scale population analyses in Norway spruce. Holzforschung, 70(9), 829–838.

Stacy, E. W. (1962). A generalization of the gamma distribution. Annals of Mathematical Statistics, 33(3), 1187–1192.

Examples


library(fiberLD)
## using microscopy data (uncut fiber lengths in the increment core)
data(microscopy)
dat <- microscopy[1:200]
m1 <- fled(data=dat,data.type="microscopy",model="ggamma",r=2.5) 
summary(m1)
plot(m1)

## and with log normal model...
m2 <- fled(data=dat,data.type="microscopy",model="lognorm",r=2.5)
summary(m2)
plot(m2)

## Not run:  
## using data measured by an optical fiber analyser
data(cell.length) 
d1 <- fled(data=cell.length,model="lognorm",r=6)
summary(d1)
plot(d1)
x11()
plot(d1,select=2,density.scale="uncut.core")

## change the model to generalized gamma
## and set lower and upper bounds on the parameters for 
## the "L-BFGS-B" method ... 
d2 <- fled(data=cell.length,model="ggamma",r=6,
           lower=c(.12,1e-3,.05,rep(.3,4)),
           upper=c(.5,2,rep(7,5)),cluster=1)
d2
summary(d2)
plot(d2,select=1)

## change "ML" default method to a stochastic version of the EM algorithm...
d3 <- fled(data=cell.length,model="lognorm",r=6,method="SEM",cluster=0)
summary(d3)


## End(Not run)


fiberLD documentation built on July 21, 2022, 5:08 p.m.