outliers: Remove extreme values with thresholds and standard deviation...

View source: R/outliers.R

Description

Removes outliers from a given data frame. Extreme values can be excluded based on minimal and/or maximal accepted values (thresholds) and/or on a given number of standard deviations. Returns a data frame of filtered data as well as the number and relative percentage of records filtered out.

Usage

outliers(data, fpass = NULL, target, sdv = NULL, tokeep = NULL)

Arguments

data

A data frame in the long format (one row per record).

fpass

A vector of length two with the minimal and maximal accepted values, used for a first filtering pass on the reaction times. To filter only the lowest or only the highest values, use NA for the other bound. For instance, c(100, NA) will only remove RTs below 100 ms. Defaults to NULL.

target

A string giving the name of the column that holds the data to filter. Defaults to 'RT'.

sdv

A number indicating how many standard deviations should be used to filter the data. Defaults to 3.

tokeep

A vector of column names identifying the conditions used in the filtering, typically subjects and/or experimental conditions. Defaults to NULL.
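The interplay of fpass and sdv described above can be illustrated with a minimal base R sketch. This is not the statxp implementation: the helper name sketch_filter is hypothetical, and it assumes that the standard deviation pass is computed on the values that survive the threshold pass and that no grouping is applied.

```r
# Hypothetical helper, NOT part of statxp: illustrates a threshold pass
# followed by a standard deviation pass on a numeric vector.
sketch_filter <- function(x, fpass = NULL, sdv = NULL) {
  keep <- rep(TRUE, length(x))

  # First pass: absolute thresholds; a bound set to NA is skipped
  if (!is.null(fpass)) {
    if (!is.na(fpass[1])) keep <- keep & x >= fpass[1]
    if (!is.na(fpass[2])) keep <- keep & x <= fpass[2]
  }

  # Second pass (assumption): mean and SD are computed on the values
  # that survived the first pass
  if (!is.null(sdv)) {
    m <- mean(x[keep])
    s <- sd(x[keep])
    keep <- keep & abs(x - m) <= sdv * s
  }

  keep
}

rt   <- c(520, 555, 610, 480, 55, 2340, 530, 575)
kept <- sketch_filter(rt, fpass = c(100, NA), sdv = 2)

# 55 fails the low threshold; 2340 lies more than 2 SD from the mean
rt[kept]
```

Returning a logical vector rather than the filtered values keeps the original row positions available, which makes it straightforward to count what was removed afterwards.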

Value

Returns a list with a data frame of filtered data and a data frame giving the number of excluded records and their relative percentage per condition.
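To make the shape of such a result more concrete, here is a hedged base R sketch that builds a comparable two-element list from toy data. The element names filtered and report, the column names removed and percent, and the precomputed keep flags are all assumptions made for illustration; the actual names returned by outliers may differ.

```r
# Toy data with made-up keep flags, as they might come out of a filtering step
df <- data.frame(
  Subj = c(1, 1, 1, 1, 2, 2, 2, 2),
  RT   = c(520, 55, 610, 480, 2340, 530, 575, 540),
  keep = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE)
)

# Data frame of retained records
filtered <- df[df$keep, c("Subj", "RT")]

# Number and percentage of excluded records per subject
n_total   <- tapply(df$keep, df$Subj, length)
n_removed <- tapply(!df$keep, df$Subj, sum)
report <- data.frame(
  Subj    = names(n_total),
  removed = as.vector(n_removed),
  percent = as.vector(100 * n_removed / n_total)
)

# Hypothetical element names, chosen for this sketch only
result <- list(filtered = filtered, report = report)
result$report
```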

Author(s)

Guillaume T. Vallet gtvallet@gmail.com, University of Montreal (Canada);

Benoit A. Riou riouba@gmail.com, Lyon2 University (France)

See Also

filtRT

Examples

# Generate fake data with a subject number in the first column,
#   a fake experimental condition in the second column and fake
#   reaction times in the third column
df = data.frame(Subj=1, Cond="Test", RT=rnorm(25, mean=550, sd=50))

# Adding extreme values
df[5,3]  <- df[5,3]+300
df[25,3] <- df[25,3]+500
df[19,3] <- df[19,3]-350
df[12,3] <- 55
df[15,3] <- 2340

# Filter with low and high thresholds and with 3 standard deviations
outliers(df, target='RT', fpass=c(100,1000), sdv=3)

# Filter with only a low threshold and with 2 standard deviations
outliers(df, target='RT', fpass=c(100, NA), sdv=2)

Cogitos/statxp documentation built on March 22, 2021, 6:38 a.m.