velout: Identify outliers with abnormal velocity in growth curves

View source: R/sitarlib.R

velout R Documentation

Identify outliers with abnormal velocity in growth curves

Description

Quickly identifies putative outliers in a large number of growth curves.

Usage

velout(x, y, id, data, lag = 1, velpower = 0.5, limit = 5, linearise = FALSE)

Arguments

x

age vector.

y

outcome vector, typically weight or height.

id

factor identifying each subject.

data

data frame containing x, y and id.

lag

lag between measurements for defining growth velocity.

velpower

a value, typically between 0 and 1, defining the power of delta x to use when calculating velocity as delta(y)/delta(x)^velpower. The default of 0.5 is midway between velocity and increment.

limit

the number of standard deviations beyond which a velocity is deemed to be an outlier.

linearise

if TRUE, y is converted to a residual about the median curve of y versus x.
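As a rough illustration of velpower (a base R sketch, not the package code; the ages and heights are made-up values and vel() is a hypothetical helper), the same measurements give plain increments when velpower = 0, conventional velocities when velpower = 1, and something in between at the default of 0.5:

```r
# Hypothetical measurements for one child: ages in years, heights in cm
x <- c(10.0, 10.25, 11.25)
y <- c(140.0, 141.5, 147.5)

# delta(y) / delta(x)^velpower, following the velpower description
# (vel() is an illustrative helper, not part of sitar)
vel <- function(x, y, velpower, lag = 1) {
  diff(y, lag = lag) / diff(x, lag = lag)^velpower
}

increment <- vel(x, y, velpower = 0)    # plain differences in y
midway    <- vel(x, y, velpower = 0.5)  # the default
velocity  <- vel(x, y, velpower = 1)    # cm per year
print(rbind(increment, midway, velocity))
```

Here the two intervals have the same velocity (6 cm/year) but very different increments (1.5 cm and 6 cm) because the gaps between visits differ; the default power falls between the two.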

Details

The algorithm works by viewing serial measurements in each growth curve as triplets (A-B-C) and comparing the velocities between them. Velocity is calculated as

diff(y, lag = lag) / diff(x, lag = lag) ^ velpower

Missing values for x or y are ignored. If any of the AB, BC or AC velocities is abnormal (more than limit SDs from the dataset median, in absolute value), the code for B is non-zero.
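The triplet comparison can be sketched in base R for a single made-up curve. This is an illustration only, not the package implementation: the data are hypothetical, the helper abnormal() is invented here, the SD is approximated by mad() separately for each velocity type, and the AC velocity is taken at twice the lag. All of these are assumptions.

```r
# Hypothetical curve: steady growth of about 6 cm/year, with the height at
# age 9 (the 5th point) recorded roughly 16 cm too high
x <- 5:12
y <- c(110, 116, 122.5, 128, 150, 140.5, 146, 152.5)
velpower <- 0.5
limit <- 5

# Velocities between neighbours (AB, BC) and across each triplet (AC)
v1 <- diff(y, lag = 1) / diff(x, lag = 1)^velpower
v2 <- diff(y, lag = 2) / diff(x, lag = 2)^velpower

# Assumption: robust SD estimated with mad(); TRUE when a velocity lies
# more than limit SDs from the median
abnormal <- function(v) {
  abs(v - median(v, na.rm = TRUE)) > limit * mad(v, na.rm = TRUE)
}

# Align each velocity with the middle point B of its triplet; edges get NA
ab <- c(NA, abnormal(v1))
bc <- c(abnormal(v1), NA)
ac <- c(NA, abnormal(v2), NA)

flags <- data.frame(
  x = x, y = y,
  ab_bc = rowSums(cbind(ab, bc), na.rm = TRUE),  # abnormal AB/BC velocities
  ac = as.integer(ac)                            # 1 if AC abnormal, NA at edges
)
print(flags)
```

In this made-up curve the 5th point has both AB and BC flagged while AC is normal, and each of its two neighbours has one of AB/BC flagged together with AC.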

Value

Returns a data frame with columns: id, x, y (from the call), code (as described below), vel1, vel2 and vel3 (corresponding to the velocities AB, BC and AC above). The 'data' attribute of the returned data frame holds the name of the data argument.

The code column is a factor taking values between 0 and 8, with 0 meaning normal (see the table below). Values 1-6 depend on the pattern of abnormal velocities, while 7 and 8 indicate a duplicate age (7 for the first occurrence within an individual and 8 for later ones). Edge points, i.e. the first or last for an individual, have just one velocity. Code 4 indicates a conventional outlier, with both AB and BC abnormal and AC normal. Code 6 is an edge outlier. Other codes are not necessarily outliers, e.g. codes 1 or 3 may be adjacent to a code 4. Use codeplot to look at individual curves, and zapvelout to delete outliers.

code  AB+BC  AC  interpretation
0     0      0   no outlier
0     0      NA  no outlier
1     0      1   rare pattern
2     1      0   complicated - look at curve
3     1      1   adjacent to simple outlier
4     2      0   single outlier
5     2      1   double outlier
6     1      NA  edge outlier
7     -      -   first duplicate age
8     -      -   later duplicate age
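
The table can be read as a lookup from the number of abnormal AB/BC velocities and the AC flag to a code. The base R sketch below mirrors that reading; velcode() and dupcode() are hypothetical helpers written for illustration, not functions exported by sitar, and the package may derive the codes differently.

```r
# Codes 0-6: look up the pattern of abnormal velocities
velcode <- function(ab_bc, ac) {
  # ab_bc: number of abnormal AB/BC velocities (0, 1 or 2)
  # ac:    1 if AC is abnormal, 0 if normal, NA at the edge of a curve
  key <- paste(ab_bc, ac)
  lookup <- c("0 0" = 0, "0 NA" = 0, "0 1" = 1, "1 0" = 2, "1 1" = 3,
              "2 0" = 4, "2 1" = 5, "1 NA" = 6)
  unname(lookup[key])   # NA for any pattern not listed in the table
}

# Codes 7-8: duplicate ages within an individual (NA when the age is unique)
dupcode <- function(x, id) {
  d <- data.frame(id, x)
  later <- duplicated(d)                            # second and later copies
  any_dup <- later | duplicated(d, fromLast = TRUE) # every copy of a repeated age
  ifelse(later, 8, ifelse(any_dup, 7, NA))
}

pattern_codes <- velcode(ab_bc = c(0, 1, 2, 1, 0), ac = c(0, 1, 0, NA, NA))
dup_codes <- dupcode(x = c(1, 2, 2, 2, 3, 2),
                     id = c("a", "a", "a", "a", "a", "b"))
print(pattern_codes)
print(dup_codes)
```

How duplicate-age codes combine with the velocity-based codes is not specified in the table, so the two helpers are kept separate here.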

Author(s)

Tim Cole tim.cole@ucl.ac.uk

See Also

codeplot, zapvelout

Examples


library(sitar)
outliers <- velout(age, height, id, heights, limit = 3)


statist7/sitar documentation built on Feb. 7, 2024, 2:08 a.m.