roahd

The roahd package (Robust Analysis of High-dimensional Data) provides a set of statistical tools for the exploration and robustification of univariate and multivariate functional datasets through depth-based statistical methods.

The functions were implemented with particular attention to efficiency, so that they can also be used profitably on high-dimensional datasets.

(For a full description of the package, please refer to the vignette.)

fData and mfData objects

A simple S3 representation of a functional data object, fData, encapsulates the important features of a univariate functional dataset (such as the grid of the independent variable, the pointwise observations, etc.):

# Grid representing the independent variable
grid = seq( 0, 1, length.out = 100 )

# Pointwise measurements of the functional dataset (one curve per row)
Data = matrix( c( sin( 2 * pi * grid ),
                  cos( 2 * pi * grid ),
                  sin( 2 * pi * grid + pi / 4 ) ), ncol = 100, byrow = TRUE )

# S3 object encapsulating the univariate functional dataset
fD = fData( grid, Data )

# S3 representation of a multivariate functional dataset
mfD = mfData( grid, list( 'comp1' = Data, 'comp2' = Data ) )

This representation also enables simple calls to customised functions that simplify exploratory analysis:

# Algebra of fData objects
fD + 1 : 100
fD * 4

# fD_1 and fD_2 stand for any two fData objects defined on the same grid
fD_1 + fD_2

# Subsetting fData objects (returns other fData objects)
fD[ 1, ]
fD[ 1, 2 : 4]

# Sample mean and (depth-based) median(s)
mean( fD )
mean( fD[ 1, 10 : 20 ] )
median_fData( fD, type = 'MBD' )

# Plotting functions
plot( fD )
plot( mean( fD ), add = TRUE )

plot( fD[ 2 : 3, ] )

Robust methods for functional data analysis

Part of the package is specifically devoted to the computation of depths and other statistical indices for functional data.
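
To make the notion of depth concrete, the following base-R sketch computes a modified band depth (with bands spanned by pairs of curves) in the most naive way: for each curve, it averages the fraction of the grid the curve spends inside the band delimited by every pair of curves in the dataset. The helper mbd_naive is hypothetical and is not part of roahd; it only illustrates the idea, and the package's own optimised depth routines should be preferred in practice.

```r
# Grid and curves as in the fData example above (one curve per row)
grid <- seq(0, 1, length.out = 100)
Data <- matrix(c(sin(2 * pi * grid),
                 cos(2 * pi * grid),
                 sin(2 * pi * grid + pi / 4)),
               ncol = 100, byrow = TRUE)

# Hypothetical helper, NOT part of roahd: naive modified band depth
# using bands spanned by pairs of curves
mbd_naive <- function(Data) {
  N <- nrow(Data)
  pairs <- combn(N, 2)              # all pairs of curves, one pair per column
  depths <- numeric(N)
  for (k in seq_len(ncol(pairs))) {
    # Pointwise envelope of the band spanned by the k-th pair
    lower <- pmin(Data[pairs[1, k], ], Data[pairs[2, k], ])
    upper <- pmax(Data[pairs[1, k], ], Data[pairs[2, k], ])
    # TRUE where a curve lies inside the band at a given grid point
    inside <- sweep(Data, 2, lower, ">=") & sweep(Data, 2, upper, "<=")
    # Fraction of the grid each curve spends inside this band
    depths <- depths + rowMeans(inside)
  }
  depths / ncol(pairs)              # average over all bands
}

depths <- mbd_naive(Data)
```

On these three curves the phase-shifted sine should come out as the deepest, since it lies between the other two over most of the grid. The cost of this naive version grows quadratically with the number of curves, which is why efficient implementations matter for large datasets.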

Depths and indices are also the core of visualization/robustification tools such as the functional boxplot (fbplot) and the outliergram (outliergram), which support the visualization and identification of amplitude and shape outliers.

Thanks to the functions for simulating synthetic functional datasets, both the fbplot and outliergram procedures can be auto-tuned to the dataset at hand in order to control the true positive outlier rate.



roahd documentation built on May 30, 2017, 6:09 a.m.