In klapaukh/hangler: Fourier shape analysis as per Hangle

This vignette describes the implementation of the algorithm as implemented in this system. This is intended as a reimplementation of the Hangle tool of Haines and Crampton (2000). The algorithm is taken entirely from their work and implemented in the R language. Any errors in this document or implementation come from our incomplete understanding.

Input To The Algorithm

The algorithm takes in a set of points that represent the outline of a closed shape. Hangler will check if an outline is closed (the first and last points are the same), and if it is remove one of the repeated points.

Algorithm Overview

The algorith is a pipeline as follows:

Mirror curves where necessary
Ensure that all curves go in the same direction
Remove duplicated xy point that closes the curve if present
Smooth the input coordinates
Estimate tangents to each coordinate
Normalise the tangents starting position to 0
Compute a spline to model the tangents of the closed curve
Resample the spline to get 1024 (usually) points
Subtract the tangents of the unit circle from the spline
Normalise the starting position and orientation by either:
- Normalisation by harmonics
- Normalisation by matching against all speciments
Perform an FFT on the points

There are also helper functions to:

Reconstruct the curve from fourier space harmonic coeeficients
Draw a curve from its tangents

Smoothing

Smoothing is done by means of a weighed moving averages filter. This filter can be applied multiple times.

The formula to get a point's smooth location is:

$$ (x,y)i^{new} = \frac{1}{4}(x,y){i-1}^{old} + \frac{1}{2}(x,y)i^{old} + \frac{1}{4}(x,y){i+1}^{old} $$

Note that since this is a closed curve, given points numbered $0..N,\ (x,y){-1} = (x,y){N}$.

There is also a minumum number of iterations of the filter that need to be run. Given:

$N_{Samp}$: the number of sampled xy points in the input
$N_{FFT}$: the number of re-sampled points used in the FFT

The minimum number of iterations required is: $\frac{2N_{Samp}}{N_{FFT}}$.

Estimating tangents for sampled points

The tangents are estimated assuming that each triplet of points lies on the surface of a circle. So assuming we have sampled points $p_1\ldots p_n$, to find the tangent $\theta_i$ for $p_i$ we take several steps:

Find the center of the circle described by the three points. Given three points $(x_1,y_1), (x_2,y_2), (x_3,y_3)$ the center can be found using the following equations (http://www.ambrsoft.com/TrigoCalc/Circle3D.htm):

$$ x_{center} = \frac{(x_1^2 + y_1^2)(y_2 - y_3) + (x_2^2 + y_2^2)(y_3 - y_1) + (x_3^2 + y_3^2)(y_1 - y_2)} {2(x_1(y_2-y_3) - y1(x_2-x_3) + x_2y_3 - x_3y_2)} $$

$$ y_{center} = \frac{(x_1^2 + y_1^2)(x_3 - x_2) + (x_2^2 + y_2^2)(x_1 - x_3) + (x_3^2 + y_3^2)(x_2 - x_1)} {2(x_1(y_2-y_3) - y1(x_2-x_3) + x_2y_3 - x_3y_2)} $$
Find the angle between the center and $p_i$ and subtract $\frac{\pi}{2}$ from the angle to get the tangent ($\theta_i$):

$$\theta_i = \mathrm{atan2}\left(y_i - y_{center}, x_i - x_{center}\right) - \frac{\pi}{2}$$

Computing the spline coefficients

Hangle uses a spline to interpolate points in between the samples. This spline function requires three extra paramaters for each point in order to compute:

$s_i$: The arc-length along the curve until this point.
$s_{i+1} - s_i$: The arc-length between this point and the next point on the curve.
$\Delta_i$: How uncircular this bit of curve is.

There is no simple analytical method that I am aware of to compute $\Delta_i$. Instead, we will do all the computation by numerical approximation. Note that we do not have bounds on the values of $\Delta_i$ and $s_k$ (ruling out methods such as bisection), nor can we compute a derivative (ruling out methods such as Newton's). In this implementation we use the secant method for numerical root finding, and Simpson's rule for numerical integration.

Computing $\Delta_i$

Given that we now have a set of tangents ($\theta_1\ldots\theta_n$), we can compute $\Delta_i$ by using the secant method and Simpson's rule to numerically estimate the zero value for:

$$
f(\Delta_i) = (y_{i+1} - y_i) * \int_0^1 \cos\left(\theta_i(1-t) + \theta_{i+1}t + \frac{1}{2}\Delta_i(1-t)t\right) dt - (x_{i+1} - x_i) * \int_0^1 \sin\left(\theta_i(1-t) + \theta_{i+1}t + \frac{1}{2}\Delta_i(1-t)t\right) dt $$

Note that $\Delta_i$ is the only unknown value. Also note that this does linear interpolation between $\theta_i$ and $\theta_{i+1}$ which means that when two $\theta$ values are used in the equation there must not be any wrapping around which can come from angle measurements. E.g. the pair of $\theta$s, 0 and 6.1, would cause problems. They should instead be either (0 and -0.183) or (6.283 and 6.1).

We can also estimate a sensible set of initial bounds for our guess of $\Delta_i$. We can do this by making an assumption about what values we expect $\theta(t)$ to take. Essentially we assume that the outputs of the spline are bounded by the boundary values. This gives us a lower bound of $-2|\theta_i-\theta_j|$ and an upper bound of $2|\theta_i-\theta_j|$. Note that we are not sure of these assumptions, so we do not use a bounded search algorithm.

Computing $s_{i+1} - s_i$

Once we have $\Delta_i$ finding the distance along the curve to the next point ($s_{i+1} - s_i$) involves only a numerical integration.

$$ s_{i+1} - s_i = \frac{x_{i+1} - x_i} {\int_0^1 \cos\left(\theta_i(1-t) + \theta_{i+1}t + \frac{1}{2}\Delta_i(1-t)t\right) dt} $$

Computing $s_i$

Computing the $s_i$ involves simply summing the $s_{i+1} - s_i$ values which we computed previously to get the total distance for each point. Note that $s_0 = 0$, and the following formula works for all $i>0$:

$$ s_i = \displaystyle\sum_{k=0}^{i-1} s_{k+1} - s_k $$

klapaukh/hangler documentation built on May 20, 2019, 11:04 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com