ddply_getFeatures: Compute the quadratic, linear, and/or discrete features of...


Description

This is a wrapper that implements getFeatures for each group in a data frame using plyr::ddply.

Usage

ddply_getFeatures(y, .variables, cont = NULL, disc = NULL,
  centerScale = TRUE, stats = c("min", "q1", "mean", "med", "q3", "max",
  "sd", "count"), fitQargs = NULL, nJobs = 1)

Arguments

y

Data frame, each row containing a vector of measurements for a particular point in time, with columns indicating the discrete and/or continuous measured variables (and possibly other descriptive variables). The data are processed presuming the rows are ordered chronologically.

.variables

Character vector of variable names in y that will be used to split the data. The unique combinations of these variables identify the groups for which the features will be separately extracted. This is passed directly to the argument of the same name in plyr::ddply.
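The grouping idea can be illustrated with base R alone. The sketch below is not how plyr::ddply is implemented; it only shows how the unique combinations of two splitting variables partition a data frame. The toy data frame and its column names are made up for illustration.

```r
# Toy data frame standing in for y; the columns are hypothetical
y <- data.frame(subject = c(1, 1, 1, 2, 2, 2),
                phase   = c("a", "a", "b", "a", "b", "b"),
                value   = c(0.5, 0.7, 1.1, 0.2, 0.9, 1.3))

# Split on the unique combinations of the two grouping variables,
# dropping combinations that do not occur in the data
groups <- split(y, y[c("subject", "phase")], drop = TRUE)

# One element per group; every row of y lands in exactly one group
names(groups)
n_rows <- vapply(groups, nrow, integer(1))
n_rows
```

Each element of groups corresponds to one group for which features would be extracted separately.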

cont

Vector of integers or a character vector indicating the columns of y that correspond to continuous variables. These are the variables from which features will be extracted by fitting the moving regression model using fitQ.

disc

Vector of integers or a character vector indicating the columns of y that correspond to variables that will be treated as discrete. These are the variables from which features will be extracted using discFeatures.

centerScale

Logical indicating whether the continuous variables (indicated by cont) should be centered and scaled by the global mean and standard deviation of that variable. By 'global', we mean that all the values of a continuous variable, say x, in y are used to compute the mean and standard deviation, regardless of group. The resulting value for the continuous variable, x, is equivalent to y$x <- (y$x - mean(y$x)) / sd(y$x).
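A minimal base R sketch of what 'global' centering and scaling means, using the formula given above. The toy data frame and its columns group and x are assumptions made for illustration; the package itself is not called here.

```r
# Toy data frame standing in for y; x is a hypothetical continuous column
y <- data.frame(group = rep(c("a", "b"), each = 4),
                x = c(1, 2, 3, 4, 10, 20, 30, 40))

# Global centering and scaling: the mean and sd use every row of y,
# not the rows of each group separately
y$x_scaled <- (y$x - mean(y$x)) / sd(y$x)

# Over all rows, the scaled column has mean 0 and standard deviation 1
mean(y$x_scaled)
sd(y$x_scaled)

# Within a group, the mean of the scaled values is generally not 0
group_means <- tapply(y$x_scaled, y$group, mean)
group_means
```

Because the scaling is global, differences in level between groups are preserved in the scaled values.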

stats

This argument defines the summary statistics that will be calculated for each of the regression parameters. It can be a character vector of summary statistics, which is passed to summaryStats. Alternatively, the function object returned by summaryStats may be supplied.

fitQargs

Named list of arguments for fitQ. If NULL, the default arguments of fitQ are used. Any argument for fitQ may be included except y.
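The general pattern of forwarding a named list of arguments can be sketched in base R with do.call. This is only an illustration of the pattern, not the package's implementation: fit_stub is a hypothetical stand-in for fitQ, its default for x1 is an assumption, and call_with_args is a made-up helper.

```r
# Hypothetical stand-in for a fitting function; NOT the real fitQ.
# The default value of x1 is an assumption for this sketch.
fit_stub <- function(y, x1 = -3:3) {
  list(n = length(y), window = length(x1))
}

# Hypothetical helper: forward a named list of extra arguments,
# supplying y separately so it cannot be overridden by the list
call_with_args <- function(y, args = NULL) {
  if ("y" %in% names(args)) {
    stop("'y' must not appear in the argument list")
  }
  do.call(fit_stub, c(list(y = y), args))
}

# args = NULL falls back to the defaults of the fitting function
default_fit <- call_with_args(rnorm(20))

# A named list overrides selected defaults
custom_fit <- call_with_args(rnorm(20), list(x1 = -5:5))
```

Here default_fit uses the stub's default window, while custom_fit uses the window supplied in the list.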

nJobs

The number of parallel jobs to run when extracting the features.

Details

At least one of cont or disc must be specified.

Instead of a data frame, the y argument can be a valid_getFeatures_args object (returned by check_getFeatures_args), in which case all the subsequent arguments to getFeatures are ignored (because the valid_getFeatures_args object contains all those arguments).

Parallel processing, if requested via nJobs > 1, is facilitated via Smisc::pddply, a wrapper for parallelized calls to plyr::ddply.

Value

A data frame with one row for each group defined by .variables. The features computed by getFeatures are presented across the columns.

Author(s)

Landon Sego

References

Amidan BG, Ferryman TA. 2005. "Atypical Event and Typical Pattern Detection within Complex Systems." IEEE Aerospace Conference Proceedings, March 2005.

Examples

# Load the data
data(demoData)
str(demoData)

# Calculate features for each subset defined by the unique combinations of
# "subject" and "phase", using the mean and standard deviation as the summary
# statistics that summarize the coefficients of the quadratic model fits
f <- ddply_getFeatures(demoData, c("subject", "phase"),
                      cont = 3:4, disc = 8:9, stats = c("mean", "sd"),
                      fitQargs = list(x1 = -5:5), nJobs = 2)

str(f)
head(f)

pnnl/qFeature documentation built on May 25, 2019, 10:22 a.m.