Auxiliary parameters for controlling local principal curves.

Description

This function bundles parameters that mainly control the starting, convergence, boundary, and stopping behaviour of the local principal curve. It is meant to be used only as the control argument of the lpc() function.

Usage

lpc.control(iter = 100, cross = TRUE,
            boundary = 0.005, convergence.at = 0.00001,
            mult = NULL, ms.h = NULL, ms.sub = 30,
            pruning.thresh = 0.0, rho0 = 0.4)

Arguments

iter

Maximum number of iterations on either side of the starting point within each branch.

cross

Logical parameter. If FALSE, a curve is stopped when it comes too close to another part of itself. Note: Even when cross=FALSE, different branches of the curve (for higher depth or multiple starting points) are still allowed to cross. This option only prevents each particular branch from crossing itself. Used in the self-coverage functions to avoid overfitting.

boundary

This boundary correction [2] reduces the bandwidth adaptively once the relative difference of parameter values between two centers of mass falls below the given threshold. This measure delays convergence and enables the curve to proceed further into the end points. If set to 0, this boundary correction is switched off.

convergence.at

This forces the curve to stop if the relative difference of parameter values between two centers of mass falls below the given threshold. If set to 0, then the curve will always stop after exactly iter iterations.
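
The interplay of boundary and convergence.at can be illustrated with a small base-R sketch. It assumes that the relative difference between two consecutive parameter values a and b is computed as |a - b| / |a + b|. This form, the helper names rel_diff and first_below, and the artificial sequence lambda are illustrative assumptions and are not taken from the lpc() code. The sketch also ignores that the real boundary correction changes the bandwidth and therefore the later steps.

```r
# Assumed form of the relative difference between two parameter values
rel_diff <- function(new, old) abs(new - old) / abs(new + old)

# Artificial parameter values: each step is half as long as the previous one
steps  <- 0.5^(0:29)
lambda <- cumsum(steps)

# Relative differences between consecutive parameter values
rd <- rel_diff(lambda[-1], lambda[-length(lambda)])

# Index of the first iteration at which the relative difference
# falls below a threshold (NA if this never happens)
first_below <- function(thresh) which(rd < thresh)[1]

first_below(0.005)    # default boundary: bandwidth reduction would start here
first_below(0.00001)  # default convergence.at: the curve would stop here
first_below(0)        # a threshold of 0 is never undercut
```

With the default values the boundary threshold is reached well before the convergence threshold, which matches the description of the boundary correction as a measure that acts before convergence.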

mult

numerical value which enforces a fixed number of starting points. If the number given here is larger than the number of starting points provided at x0, then the missing points will be set at random. For example, if d=2, mult=3, and x0=c(58.5, 17.8, 80, 20), then one gets the starting points (58.5, 17.8), (80, 20), and a randomly chosen third one. Another example of such a situation is x0=NULL with mult=1, in which case one random starting point is chosen. If the number given here is smaller than the number of starting points provided at x0, then only the first mult starting points will be used.
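
The padding and truncation rules described for mult can be sketched in base R as follows. The helper pick_starts is hypothetical and not part of the package. In particular, drawing the missing starting points as random rows of the data is an assumption made for illustration only; the way lpc() actually chooses its random starting points may differ.

```r
# Hypothetical helper: combine user-supplied starting points x0 with mult
pick_starts <- function(X, x0 = NULL, mult = NULL) {
  X <- as.matrix(X)
  d <- ncol(X)
  # x0 is read as consecutive d-dimensional points, one per row
  starts <- if (is.null(x0)) {
    matrix(numeric(0), ncol = d)
  } else {
    matrix(x0, ncol = d, byrow = TRUE)
  }
  if (is.null(mult)) return(starts)
  n0 <- nrow(starts)
  if (mult < n0) {
    # fewer points requested than supplied: keep only the first mult
    starts <- starts[seq_len(mult), , drop = FALSE]
  } else if (mult > n0) {
    # more points requested than supplied: fill up at random
    # (assumption: random rows of the data)
    extra  <- X[sample(nrow(X), mult - n0), , drop = FALSE]
    starts <- rbind(starts, extra)
  }
  unname(starts)
}

set.seed(1)
X <- cbind(runif(100, 0, 100), runif(100, 0, 100))  # artificial 2-d data

pick_starts(X, x0 = c(58.5, 17.8, 80, 20), mult = 3)  # two given, one random
pick_starts(X, x0 = c(58.5, 17.8, 80, 20), mult = 1)  # only the first point
pick_starts(X, x0 = NULL, mult = 1)                   # one random point
```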

ms.h

sets the bandwidth (vector) for the initial mean shift procedure which finds the local density modes, and, hence, the starting points for the LPC. If unspecified, the bandwidth h used in function lpc is used here too.

ms.sub

percentage of data points (default=30) which are used to initialize mean shift trajectories for the mode finding. More precisely, with N denoting the number of data points, we use

min(max(ms.sub, floor(ms.sub*N/100)), 10*ms.sub)

trajectories.
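
As a worked example, the formula above can be evaluated for a few sample sizes. The function name n_trajectories is a hypothetical helper introduced here for illustration, and N is taken to be the number of data points.

```r
# Number of mean shift trajectories according to the formula above
n_trajectories <- function(N, ms.sub = 30) {
  min(max(ms.sub, floor(ms.sub * N / 100)), 10 * ms.sub)
}

# Small, medium and large data sets with the default ms.sub = 30
sapply(c(50, 444, 2000), n_trajectories)
# 30 133 300
```

For small data sets the lower bound ms.sub applies, for medium-sized data sets ms.sub percent of the points are used, and for large data sets the number is capped at 10*ms.sub.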

pruning.thresh

Prunes branches corresponding to higher-depth starting points if their density estimate falls below this threshold. Typically, a value between 0.0 and 1.0. The setting 0.0 means no pruning.

rho0

A numerical value which steers the birth process of higher-depth starting points. Usually, between 0.3 and 0.4 (see reference [1]).

Value

A list of the nine specified input parameters, which can be read by the control argument of the lpc function.

Author(s)

JE

References

[1] Einbeck, J., Tutz, G. and Evers, L. (2005). Exploring Multivariate Data Structures with Local Principal Curves. In: Weihs, C. and Gaul, W. (Eds.), Classification - The Ubiquitous Challenge. Springer, Heidelberg, pages 256-263.

[2] Einbeck, J. and Zayed, M. (2011). Some asymptotics for localized principal components and curves. Working paper, Durham University. Unpublished.

Examples

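library(LPCM)  # package providing lpc(), lpc.control() and calspeedflow
data(calspeedflow)
# Fit a local principal curve to columns 3 and 4, with at most 20 iterations
# on either side of the starting point and the boundary correction switched off
fit1 <- lpc(calspeedflow[, c(3, 4)], x0 = c(50, 60), scaled = TRUE,
            control = lpc.control(iter = 20, boundary = 0))
# Plot the fitted curve together with the starting point and the centers of mass
plot(fit1, type = c("curve", "start", "mass"))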