noisysensors: noisysensors

View source: R/noisysensors.R

noisysensors R Documentation

noisysensors

Description

Detect noisy or outlier sensors

Usage

noisysensors(
  x,
  fs = 1,
  sensors = setdiff(colnames(x), "t"),
  devthr = 5,
  HFnoisethr = 5,
  corsecs = 1,
  corthr = 0.4,
  timethr = 0.01,
  ransac = TRUE,
  slocs = eegr::getLocationsfromLabels(colnames(x)[sensors]),
  sfrac = 0.25,
  ubtime = 0.4,
  ransacthr = 0.75,
  rwin = 5,
  rss = 50
)

Arguments

x

input data, specified as a numeric matrix or vector. In case of a vector it represents a single signal; in case of a matrix each column is a signal. Alternatively, an object of class ctd. The data is assumed to be high-pass filtered.

fs

sampling frequency of x in Hz. Default: 1. Overruled if x is a ctd object, in which case the sampling frequency is fs(x).

sensors

sensors to detect, specified as positive integers indicating sensor numbers, or as sensor names (colnames of x). Default: all sensor names in x excluding a variable t (if present).

devthr

robust deviation threshold, specified as a numeric value. Sensors with robust z-score greater than this value are marked as bad. Default: 5

HFnoisethr

high-frequency noise threshold, specified as a numeric value. Sensors with a robust z-score, calculated from the ratio between high- and low-frequency activity, greater than this value are marked as bad. Default: 5

corsecs

segment length in seconds for computing correlations between sensors. The data is segmented in non-overlapping segments of this duration before calculating correlations. Default: 1 second

corthr

correlation threshold, specified as a numeric value. Sensors which have a correlation with other channels lower than this value, for a proportion of the data specified by timethr, are flagged as bad. Default: 0.4

timethr

time threshold, specified as a numeric value. Proportion of segments during which the correlation is allowed to be lower than corthr before the sensor is marked as bad. Default: 0.01

ransac

logical indicating whether to perform RANSAC (random sample consensus). Default: TRUE

slocs

sensor locations, needed for RANSAC, specified as a data frame according to sensorlocs format. Default: obtained from the sensor names given by the column names of x

sfrac

fraction of sensors for robust reconstruction. Default: 0.25

ubtime

RANSAC unbroken time - cutoff fraction of time a sensor can have poor RANSAC predictability. Default: 0.4

ransacthr

RANSAC threshold. Sensors which have a correlation less than this value with their RANSAC-predicted time courses on more than ubtime proportion of the windows are flagged as bad. Default: 0.75

rwin

RANSAC window in seconds. Default: 5

rss

RANSAC sample size. Default: 50

Details

The algorithm uses four primary measures: extreme amplitudes (deviation criterion), unusual high frequency noise (noisiness criterion), lack of correlation with any other sensor (correlation criterion), and lack of predictability by other sensors (predictability criterion), in this order. Several of the criteria use a robust z-score, replacing the mean by the median and the standard deviation by the robust standard deviation (0.7413 times the interquartile range). The algorithm also detects sensors that contain any NAs, or that have significant periods with constant values or very small values.
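As a minimal sketch of the robust z-score described above, the following base R snippet centres values on the median and scales them by 0.7413 times the interquartile range. The helper name `robust_z` and the toy data are assumptions made for illustration; they are not part of the package.

```r
# Robust z-score: median instead of mean, 0.7413 * IQR instead of SD.
# `robust_z` is a hypothetical helper, not a function exported by eegr.
robust_z <- function(v) {
  med <- stats::median(v, na.rm = TRUE)
  rsd <- 0.7413 * stats::IQR(v, na.rm = TRUE)
  (v - med) / rsd
}

v <- c(1:20, 100)        # twenty ordinary values plus one extreme value
z <- robust_z(v)
flagged <- which(z > 5)  # positions whose robust z-score exceeds 5
```

Because the median and the interquartile range are barely affected by the single extreme value, that value stands out clearly, which would not be the case with an ordinary mean and standard deviation.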

The **deviation criterion** calculates the robust z-score of the robust standard deviation for each sensor. Sensors designated as bad have a robust z-score greater than devthr (default 5).
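The deviation criterion can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the simulated data, the sensor names S1 to S8, and the helper `robust_sd` are made up for the example, while the names `dev`, `devMed` and `devSD` mirror the fields listed under Value.

```r
# Hypothetical helper: robust standard deviation (0.7413 * IQR).
robust_sd <- function(v) 0.7413 * stats::IQR(v, na.rm = TRUE)

# Simulated samples-by-sensors matrix; sensor S3 has an extreme amplitude.
set.seed(42)
amps <- c(1.0, 1.1, 20, 0.9, 1.05, 0.95, 1.2, 0.8)
x <- sapply(amps, function(a) a * rnorm(5000))
colnames(x) <- paste0("S", seq_along(amps))

devthr <- 5
dev    <- apply(x, 2, robust_sd)      # robust SD of each sensor
devMed <- stats::median(dev)          # median of the deviation scores
devSD  <- robust_sd(dev)              # robust SD of the deviation scores
z      <- (dev - devMed) / devSD      # robust z-score per sensor
bad    <- names(z)[z > devthr]        # sensors exceeding the threshold
```

Only the upper tail is tested here, following the description that sensors with a robust z-score greater than devthr are marked as bad.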

The **noisiness criterion** of signal quality uses a robust estimate of the ratio of the power of the high frequency components to the power in the low frequency components. A 50 Hz low pass FIR filter is used to separate the low and high frequency components. Noisiness is defined as the ratio of the median absolute deviation of the high frequency component over the low frequency component for each sensor. A z-score relative to all of the sensors is computed and sensors with a z-score greater than a threshold (HFnoisethr; default: 5) are marked as bad.
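The noisiness criterion might be sketched as below. The filter design is an assumption: a hand-rolled, Hamming-windowed sinc FIR low-pass filter of order 100 stands in for whatever 50 Hz FIR design the package actually uses. The sampling frequency, the simulated sine-wave data and the helper names `lowpass` and `robust_sd` are likewise hypothetical.

```r
fs   <- 250                              # assumed sampling frequency (Hz)
secs <- (seq_len(5 * fs) - 1) / fs       # 5 seconds of time stamps

# Simulated sensors: a 10 Hz signal plus an 80 Hz component; S6 is noisy.
hf_amp <- c(0.10, 0.12, 0.08, 0.11, 0.09, 3.00, 0.13, 0.07)
x <- sapply(seq_along(hf_amp), function(j)
  sin(2 * pi * 10 * secs + j * pi / 8) + hf_amp[j] * sin(2 * pi * 80 * secs))
colnames(x) <- paste0("S", seq_along(hf_amp))

# Illustrative 50 Hz low-pass FIR: windowed sinc with a Hamming window.
ord <- 100                               # filter order, giving 101 taps
k   <- seq(-ord / 2, ord / 2)
fc  <- 50 / fs                           # cutoff in cycles per sample
h   <- ifelse(k == 0, 2 * fc, sin(2 * pi * fc * k) / (pi * k))
h   <- h * (0.54 + 0.46 * cos(2 * pi * k / ord))
h   <- h / sum(h)                        # unit gain at 0 Hz

# Zero-phase convolution; the first and last 50 samples become NA.
lowpass <- function(v) as.numeric(stats::filter(v, h, sides = 2))
low  <- apply(x, 2, lowpass)             # low frequency component
high <- x - low                          # high frequency component

# Noisiness: MAD of the high over MAD of the low frequency component.
HFnoise <- apply(high, 2, stats::mad, na.rm = TRUE) /
           apply(low,  2, stats::mad, na.rm = TRUE)

robust_sd <- function(v) 0.7413 * stats::IQR(v, na.rm = TRUE)
zHFnoise  <- (HFnoise - stats::median(HFnoise)) / robust_sd(HFnoise)
bad       <- names(zHFnoise)[zHFnoise > 5]   # HFnoisethr = 5
```

The ratio is unaffected by the scaling constant that `stats::mad` applies, since the constant appears in both numerator and denominator.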

The **correlation criterion** is based on the observation that the low frequency portion of EEG signals is somewhat correlated (but not too correlated) among channels. Using signals low-pass filtered at 50 Hz, the algorithm calculates the correlation of each sensor with the other channels in small, non-overlapping time windows (corsecs parameter; 1 s by default). The maximum absolute correlation is calculated as the 98th percentile of the absolute values of the correlations with the other sensors in each window. If this maximum correlation is less than a threshold (corthr; 0.4 by default) for a certain percentage of the windows (timethr; 1% by default), the sensor is flagged as bad.
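The windowed correlation measure can be illustrated with a short base R sketch. It assumes the input has already been low-pass filtered and uses simulated data in which five sensors share a common source while sensor S4 is independent noise; the data, the sampling frequency and the variable `frac_low` are assumptions for the example, while `cors` mirrors the field of the same name under Value.

```r
set.seed(7)
fs <- 100                                   # assumed sampling frequency (Hz)
n  <- 20 * fs                               # 20 seconds of data
common <- rnorm(n)                          # activity shared across sensors
x <- sapply(1:6, function(j) common + 0.5 * rnorm(n))
x[, 4] <- rnorm(n)                          # S4 is unrelated to the others
colnames(x) <- paste0("S", 1:6)

corsecs <- 1; corthr <- 0.4; timethr <- 0.01
win  <- corsecs * fs                        # samples per window
nwin <- floor(nrow(x) / win)                # number of non-overlapping windows

# For each window: 98th percentile of |r| with the other sensors.
cors <- t(sapply(seq_len(nwin), function(w) {
  seg <- x[((w - 1) * win + 1):(w * win), , drop = FALSE]
  r <- abs(stats::cor(seg))
  diag(r) <- NA                             # ignore self-correlations
  apply(r, 2, stats::quantile, probs = 0.98, na.rm = TRUE, names = FALSE)
}))

frac_low <- colMeans(cors < corthr)         # proportion of poorly correlated windows
bad <- names(frac_low)[frac_low > timethr]  # sensors exceeding timethr
```

The resulting `cors` matrix has one row per window and one column per sensor, matching the layout described for the return value.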

The **predictability criterion** also relies on correlations of the low frequency portion of the signals. The RANSAC (random sample consensus) method of Fischler and Bolles (1981) is used to select a random subset of (so far) good sensors to predict the behavior of each sensor in small non-overlapping time windows (rwin, 5 seconds by default). A random subset of predictor sensors is chosen for each sensor (sfrac; 25% by default). The RANSAC algorithm uses a method of spherical splines for estimating scalp potentials. Sensors which have a correlation less than a threshold (ransacthr; 0.75 by default) with their RANSAC-predicted time courses on more than a certain fraction (ubtime; 0.4 by default) of the windows are flagged as bad.
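The final decision rule of the predictability criterion can be sketched on its own, leaving out the spherical spline prediction. The correlation values below are invented for illustration; in practice they would be the correlations between each sensor and its RANSAC-predicted time course, laid out as sensors by windows like the `rcorr` field under Value.

```r
ransacthr <- 0.75   # correlation threshold
ubtime    <- 0.4    # tolerated fraction of poorly predicted windows

# Hypothetical RANSAC correlations: 3 sensors (rows) by 5 windows (columns).
rcorr <- rbind(
  S1 = c(0.90, 0.95, 0.80, 0.85, 0.90),
  S2 = c(0.90, 0.60, 0.80, 0.85, 0.95),
  S3 = c(0.30, 0.50, 0.80, 0.60, 0.20)
)

# Fraction of windows in which each sensor is poorly predicted.
ransac_frac <- rowMeans(rcorr < ransacthr)

# Sensors poorly predicted in more than ubtime of the windows.
bad <- names(ransac_frac)[ransac_frac > ubtime]
```

Here S2 has one poorly predicted window out of five, which stays below ubtime, whereas S3 is poorly predicted in four of five windows and is flagged.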

Value

A list with the following fields:

BadFromNA

**Unusable**: sensors containing NA values

BadFromConstant

**Unusable**: sensors containing constant values (usually 0)

devthr

**Deviation criterion**: sensors having a robust z-score of the robust standard deviation greater than this value are classified as bad

dev

**Deviation criterion**: robust standard deviation for each sensor

devSD

**Deviation criterion**: robust SD of deviation scores

devMed

**Deviation criterion**: median of deviation scores

Deviation

**Deviation criterion**: robust z-scores of deviation scores

BadFromDeviation

**Deviation criterion**: sensors marked as bad by the deviation criterion

HFnoisethr

**Noisiness criterion**: sensors having a robust noisiness z-score greater than this value are classified as bad

HFnoise

**Noisiness criterion**: ratio of high over low frequency noise (per median absolute deviation) for each sensor

medHFnoise

**Noisiness criterion**: median of HFnoise values

sdHFnoise

**Noisiness criterion**: robust SD of HFnoise values

zHFnoise

**Noisiness criterion**: robust z-scores of HFnoise values

BadFromHFnoise

**Noisiness criterion**: sensors marked as bad by the noisiness criterion

corsecs

**Correlation criterion**: for the correlation criterion, each sensor is segmented into non-overlapping periods of this length (seconds)

corthr

**Correlation criterion**: sensors with a correlation less than this value are marked as bad

timethr

**Correlation criterion**: proportion of segments of a sensor for which the correlation is allowed to be less than the threshold

cors

**Correlation criterion**: matrix of size [number of segments, number of sensors] containing, for each segment, the maximum absolute correlation (98th percentile of the absolute correlations with the other sensors)

cor_noise

**Correlation criterion**: matrix of size [number of segments, number of sensors] containing, for each segment and sensor, a value of the noise level, defined as the ratio of the difference between the original and the 50 Hz low-pass filtered data

cor_dev

**Correlation criterion**: robust deviation scores for cor_noise

corMed

**Correlation criterion**: median of correlations per sensor

BadFromCorrelation

**Correlation criterion**: sensors marked as bad by the correlation criterion

BadFromDropout

**Correlation criterion**: sensors marked as bad by the cor_noise noisiness criterion (dropouts)

ransacPerformed

**Predictability criterion**: logical indicating whether RANSAC was performed

ransacSensors

**Predictability criterion**: sensors on which RANSAC was performed

sfrac

**Predictability criterion**: fraction of sensors on which RANSAC prediction was based

ubtime

**Predictability criterion**: RANSAC unbroken time - cutoff fraction of time a sensor can have poor RANSAC predictability

ransacthr

**Predictability criterion**: RANSAC threshold. Sensors which have a correlation less than this value with their RANSAC-predicted time courses on more than ubtime proportion of the windows are flagged as bad

rwin

**Predictability criterion**: RANSAC window length in seconds

rss

**Predictability criterion**: RANSAC sample size

rcorr

**Predictability criterion**: matrix of size [number of sensors, number of windows] containing, for each segment (window) and sensor, the RANSAC correlations

ransac_frac

**Predictability criterion**: fraction of bad RANSAC windows

BadFromRansac

**Predictability criterion**: sensors marked as bad from the predictability criterion

TotalBad

**Summary**: indices of all bad sensors

BadNames

**Summary**: names of bad sensors

Author(s)

Methods 1 and 4 are adapted from code by Christian Kothe and Methods 2 and 3 are adapted from code by Nima Bigdely-Shamlo; ported to R and adapted by Geert van Boxtel, G.J.M.vanBoxtel@gmail.com.

References

Bigdely-Shamlo, N., Mullen, T., Kothe, C., Su, K.-M., and Robbins, K. A. (2015). The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Frontiers in Neuroinformatics, 9, Article 16, https://www.frontiersin.org/article/10.3389/fninf.2015.00016.

Fischler, M.A., and Bolles, R.C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381-395.

Examples

## Not run:  
noisy <- noisysensors(EEGdata)

## End(Not run)
  

gjmvanboxtel/eegr documentation built on May 20, 2023, 4:26 a.m.