Installation

To install the stable version of the sensorQC package with its dependencies:

install.packages("sensorQC", 
    repos = c("http://owi.usgs.gov/R","http://cran.rstudio.com/"),
    dependencies = TRUE)

Or to install the current development version of the package (using the devtools package):

devtools::install_github("USGS-R/sensorQC")

This package is still very much in development, so the API may change at any time.

High-frequency aquatic sensor QAQC procedures. sensorQC imports data and runs the statistical outlier detection techniques specified by the user.

sensorQC Functions (as of the version reported by packageVersion('sensorQC'))

| Function | Title |
| ------------- |:-------------|
| read | read in a file for sensor data or a config (.yml) file |
| window | window sensor data for processing in chunks |
| plot | plot sensor data |
| flag | create data flags for a sensor |
| clean | remove or replace flagged data points |

example usage

library(sensorQC)
file <- system.file('extdata', 'test_data.txt', package = 'sensorQC') 
sensor <- read(file, format="wide_burst", date.format="%m/%d/%Y %H:%M")
flag(sensor, 'x == 999999', 'persist(x) > 3', 'is.na(x)')
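
The rule 'persist(x) > 3' targets runs of repeated values, which often point to a stuck sensor. As a rough base-R illustration of that idea (an assumption about the concept, not a copy of the package internals), a per-point run length can be built with rle(). The helper name run_length is hypothetical:

```r
# Hypothetical helper: for each element, the length of the run of
# identical consecutive values that it belongs to.
run_length <- function(x) {
  runs <- rle(x)
  rep(runs$lengths, times = runs$lengths)
}

x <- c(1.2, 5.0, 5.0, 5.0, 5.0, 1.3)
run_length(x)
# TRUE for points that sit in a run of more than 3 identical values
stuck <- run_length(x) > 3
which(stuck)
```

In this toy series the four repeated 5.0 readings (positions 2 to 5) exceed the threshold, while the isolated readings do not.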

Use the MAD (median absolute deviation) test, and add w to the function call to apply it within "windows". Note that sensor must be windowed with window() before w can be used:

sensor = window(sensor, type='auto')
flag(sensor, 'x == 999999', 'persist(x) > 3', 'MAD(x,w) > 3', 'MAD(x) > 3')
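
To see why the windowed form can matter, the sketch below compares a MAD-based score computed over a whole series with one computed separately per window. It uses only base R and assumes the score is each point's absolute deviation from the median divided by the MAD; the helper mad_score and the window labels w are hypothetical and are not taken from the package:

```r
# Hypothetical helper: absolute deviation from the median, scaled by the MAD.
# stats::mad() applies the usual 1.4826 consistency constant by default.
# Note: a window whose MAD is 0 would give Inf or NaN here.
mad_score <- function(x) {
  abs(x - median(x)) / mad(x)
}

# Two chunks ("windows") with very different baselines
x <- c(3, 2, 4, 3, 9, 3, 103, 102, 104, 103, 120, 103)
w <- rep(1:2, each = 6)

global   <- mad_score(x)                # one median and MAD for the whole series
windowed <- ave(x, w, FUN = mad_score)  # separate median and MAD per window

which(global > 3)
which(windowed > 3)
```

With this toy data the global score flags nothing, because the jump between the two baselines inflates the overall spread, while the per-window score picks out positions 5 and 11.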

Use sensorQC with a simple vector of numbers:

flag(c(3,2,4,3,3,4,2,4),'MAD(x) > 3')

plotting data

Plot the dataset with outliers:

plot(sensor)

Plot the dataset without outliers:

flagged = flag(sensor, 'x == 999999', 'persist(x) > 3', 'MAD(x,w) > 3', 'MAD(x) > 3')
plot(flagged)

cleaning data

The clean function can be used to strip flagged data points from the record or to replace them with other values (such as NA or -9999):

data = c(999999, 1,2,3,4,2,3,4)
sensor = flag(data, 'x > 9999')
clean(sensor)
clean(sensor, replace=NA)
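
The difference between stripping and replacing can be pictured with plain base-R indexing. This is only a conceptual sketch on a bare numeric vector, not how clean() is implemented; the variable names are illustrative:

```r
x       <- c(999999, 1, 2, 3, 4, 2, 3, 4)
flagged <- x > 9999                   # logical mask, TRUE where the rule fires

removed  <- x[!flagged]               # drop flagged points (series gets shorter)
replaced <- replace(x, flagged, NA)   # keep the length, mark flagged points as NA

length(removed)
length(replaced)
```

Removing shortens the record, which can matter if later steps expect evenly spaced observations; replacing keeps every position in place.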

If you have multiple flag rules, you can choose which ones to apply by their index:

data = c(999999, 1,2,3,4,2,3,4)
sensor = flag(data, 'x > 9999', 'x == 3')
clean(sensor, which=1)
clean(sensor, which=2)
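
One way to picture rule indices is as a list holding one logical mask per rule string. The base-R sketch below evaluates each rule against x and then applies a single mask or all of them. It is an illustration under that assumption, not the package's internal representation:

```r
x     <- c(999999, 1, 2, 3, 4, 2, 3, 4)
rules <- c("x > 9999", "x == 3")

# Evaluate each rule string with x bound to the data: one mask per rule
masks <- lapply(rules, function(rule) eval(parse(text = rule), list(x = x)))

only_first  <- x[!masks[[1]]]           # apply only rule 1
only_second <- x[!masks[[2]]]           # apply only rule 2
all_rules   <- x[!Reduce(`|`, masks)]   # apply every rule
```

Keeping the masks separate is what makes it possible to apply, inspect, or skip an individual rule after flagging.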

Or flag and clean the data in one step:

clean(data, 'x > 9999', 'persist(x) > 10', 'MAD(x) > 3', replace=NA)

flagging data with a moving window

The MAD(x,w) function can use a rolling window by leveraging the RcppRoll R package.

sensor <- read(file, format="wide_burst", date.format="%m/%d/%Y %H:%M")
sensor = window(sensor, n=300, type='rolling')
flag(sensor, 'x == 999999', 'persist(x) > 3', 'MAD(x,w) > 3', 'MAD(x) > 3')
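
For intuition about what a rolling window does, here is a small base-R sketch that scores each point against the median and MAD of its neighbors. It does not use RcppRoll and is not the package's code; the helper rolling_mad_score, the centered window, and the shortened windows at the edges are all assumptions made for illustration:

```r
# Hypothetical helper: MAD-based score of each point relative to a centered
# window of about n points (windows are truncated at the series edges).
rolling_mad_score <- function(x, n) {
  half <- n %/% 2
  vapply(seq_along(x), function(i) {
    win <- x[max(1, i - half):min(length(x), i + half)]
    abs(x[i] - median(win)) / mad(win)
  }, numeric(1))
}

# Steadily rising signal with one spike
x <- c(1, 2, 3, 4, 5, 30, 7, 8, 9, 10, 11)
scores <- rolling_mad_score(x, n = 5)
which(scores > 3)
```

Only the spike at position 6 exceeds the threshold here; its neighbors are judged against their own local medians, so the steady rise is not flagged.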


USGS-R/sensorQC documentation built on June 3, 2017, 7:44 a.m.