# oregMclust: Orthogonal Regression Clustering (edci: Edge Detection and Clustering in Images)

## Description

Computation of center points for regression data by means of orthogonal regression. A cluster method based on redescending M-estimators is used.

## Usage

```
oregMclust(datax, datay, bw, method = "const",
           xrange = range(datax), yrange = range(datay),
           prec = 4, na = 1, sa = NULL, nl = 10, nc = NULL,
           brmaxit = 1000)

regparm(reg)

## S3 method for class 'oregMclust'
plot(x, datax, datay, prec = 3, rcol = "black", rlty = 1, rlwd = 3, ...)

## S3 method for class 'oregMclust'
print(x, ...)
```

## Arguments

- `datax, datay`: numerical vectors of coordinates of the observations. Alternatively, a matrix with two or three columns can be given. In that case, the first two columns are interpreted as the coordinates of the observations and the third, if present, is passed to parameter `sa`.
- `bw`: positive number. Bandwidth for the cluster method.
- `method`: optional string. Method of choosing starting values for the maximization. Possible values are:
  - `"const"`: a constant number of angles is used for every observation. By default, one horizontal line through each observation is used as starting value. If a value for `na` is passed, `na` lines through each observation are used. Alternatively, a starting angle for every observation can be specified with `sa`. In this case `na` is ignored, and the length of `sa` must equal the number of observations.
  - `"all"`: every line through any two observations is used.
  - `"prob"`: clusters are searched iteratively with randomly chosen starting values until either no new clusters are found (default) or `nc` clusters are found. The precision used to distinguish clusters can be tuned with `prec`. In each iteration, `nl` lines, each through two randomly chosen observations, are used as starting values.
- `xrange, yrange`: optional numerical intervals describing the domains of the observations. They are used only for normalization of the data. Both intervals should have approximately the same length; otherwise the data should be transformed first. This is not done automatically, because the transformation affects the choice of the bandwidth.
- `prec`: optional positive integer. Tuning parameter for distinguishing different clusters, which is passed to `deldupMclust`.
- `na`: optional positive integer. Number of angles per observation used as starting values for `method = "const"` (default).
- `sa`: optional numerical vector. Angles (within `[0,2pi)`) used as starting values for `method = "const"` (default).
- `nl`: optional positive integer. Number of starting lines in each iteration for `method = "prob"`.
- `nc`: optional positive integer. Number of clusters to search for if `method = "prob"` is chosen. Note that if `nc` is too large, i.e., `nc` clusters cannot be found, the function does not terminate. Warning: on Windows, the routine cannot be interrupted manually in this case.
- `brmaxit`: optional positive integer. Because the maximization can be very slow for some starting values, it is stopped after `brmaxit` iterations.
- `reg, x`: object returned from `oregMclust`.
- `rcol, rlty, rlwd`: optional graphic parameters used for plotting regression lines.
- `...`: additional parameters passed to `plot`.
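The advice on `xrange` and `yrange` can be checked before calling the function. The sketch below uses base R only and does not call `edci`. The helper name `rescale_to_x` and the choice to stretch `y` linearly onto the length of the x range are assumptions made for illustration; the package does not do this for you. Because rescaling changes distances, the bandwidth `bw` has to be chosen on the rescaled data.

```r
# Hypothetical helper (not part of edci): rescale y so that its range has
# the same length as the range of x. The scale factor is returned so that
# results can be mapped back to the original y units.
rescale_to_x <- function(x, y) {
  len_x <- diff(range(x))
  len_y <- diff(range(y))
  stopifnot(len_x > 0, len_y > 0)
  f <- len_x / len_y
  list(y = min(y) + (y - min(y)) * f, factor = f)
}

x <- c(0, 2, 4, 6, 8, 10)
y <- c(100, 250, 300, 450, 500, 600)   # range is 50 times longer than that of x

scaled <- rescale_to_x(x, y)
diff(range(x))          # length of the x range
diff(range(scaled$y))   # length of the rescaled y range
```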

## Details

`oregMclust` implements a cluster method based on redescending M-estimators for the case of orthogonal regression. The method was introduced by Mueller and Garlipp (2005; see References).
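To make the idea concrete, the following base R sketch evaluates a kernel-type objective for a single line and climbs to a local maximum from one starting value. It is an illustration under stated assumptions, not the package's implementation: the Gaussian kernel, the helper name `oreg_objective`, and the use of `optim` are all choices made here, and `edci` may use a different score function and optimizer.

```r
# Assumed objective: average Gaussian kernel weight of the orthogonal
# distances cos(alpha) * x + sin(alpha) * y - beta, scaled by bandwidth bw.
oreg_objective <- function(par, x, y, bw) {
  d <- cos(par[1]) * x + sin(par[1]) * y - par[2]
  mean(dnorm(d / bw)) / bw
}

set.seed(1)
x <- runif(200, -5, 5)
y <- -x - 2.5 + rnorm(200, 0, 0.3)   # noisy points around y = -x - 2.5

# Start from a horizontal line through the first observation
# (alpha = pi / 2, beta = y[1]) and maximize locally.
start <- c(pi / 2, y[1])
fit <- optim(start, oreg_objective, x = x, y = y, bw = 1,
             control = list(fnscale = -1))

fit$par     # local maximizer (alpha, beta)
fit$value   # objective value at that maximizer
```

Running such a local search from many starting lines and collecting the distinct maximizers is what yields several center lines; which maxima are found depends on the starting values and on `bw`.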

`regparm` transforms the columns "alpha" and "beta" to "intersept" and "slope".

See also `bestMclust`, `projMclust`, and `envMclust` for choosing the 'best' clusters out of all found clusters.

## Value

A numerical matrix containing one row for every found regression center line. The columns "alpha" and "beta" are their parameters in the representation (cos(alpha), sin(alpha)) * (x,y)' = beta, where alpha is within `[0,2pi)`. For the alternative representation y = mx + b, the return value can be passed to `regparm`.
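For lines that are not vertical (sin(alpha) != 0), solving cos(alpha) x + sin(alpha) y = beta for y gives the slope m = -cos(alpha) / sin(alpha) and the intercept b = beta / sin(alpha). The base R sketch below applies this algebra. The helper name `to_slope_intercept` is hypothetical and the code is not taken from `regparm`, whose handling of edge cases may differ.

```r
# Hypothetical helper: convert (alpha, beta) to intercept and slope.
# Vertical lines (sin(alpha) == 0) have no finite slope and yield NA here.
to_slope_intercept <- function(alpha, beta) {
  s <- sin(alpha)
  vertical <- abs(s) < 1e-12
  slope <- ifelse(vertical, NA_real_, -cos(alpha) / s)
  intercept <- ifelse(vertical, NA_real_, beta / s)
  cbind(intercept = intercept, slope = slope)
}

# Two (alpha, beta) pairs of the kind found in the "alpha" and "beta" columns
conv <- to_slope_intercept(alpha = c(0.7832, 1.8193),
                           beta  = c(-1.7508, 14.5220))
conv
```

The two pairs are rows of the example output at the end of this page. They come out close to y = -x - 2.5 and y = 0.25x + 15, the two lines used to simulate the data in the Examples section.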

The columns "value" and "count" give, for each line, the value of the objective function and the number of times that line was found.

## Author(s)

Tim Garlipp, TimGarlipp@gmx.de

## References

Mueller, C. H., & Garlipp, T. (2005). Simple consistent cluster methods based on redescending M-estimators with an application to edge identification in images. Journal of Multivariate Analysis, 92(2), 359–385.

## See Also

`bestMclust`, `projMclust`, `envMclust`, `deldupMclust`

## Examples

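```
library(edci)

# Simulate 200 noisy points scattered around two straight lines
x = c(rnorm(100, 0, 3), rnorm(100, 5, 3))
y = c(-2 * x[1:100] - 5, 0.5 * x[101:200] + 30)/2
x = x + rnorm(200, 0, 0.5)
y = y + rnorm(200, 0, 0.5)

# Search for center lines with random starting values, bandwidth 1
reg = oregMclust(x, y, 1, method = "prob")
reg = projMclust(reg, x, y)
reg

# Plot the two best lines according to the "proj" criterion
plot(bestMclust(reg, 2, crit = "proj"), x, y)
```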

### Example output

```
Break with <CTRL>-C (linux) or <ESC> (windows)
Found clusters:  3
Found clusters:  5
Found clusters:  5
      alpha    beta    value count proj
[1,] 2.9083 -0.2363 23.92903    17   20
[2,] 0.7832 -1.7508 35.41184     6   35
[3,] 2.6643  3.6276 22.21987     4   14
[4,] 1.8193 14.5220 35.19585     2   85
[5,] 0.7831 -1.7508 35.41184     1   46
```

edci documentation built on May 1, 2019, 7:44 p.m.