extend: Method extend

Description

Wrapper to GMQL EXTEND operator

For each sample in an input dataset, it generates new metadata attributes by applying aggregate functions to the sample's region attributes, and adds them to the sample's existing metadata attributes. Aggregate functions are applied sample by sample.
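The per-sample behaviour described above can be illustrated with a minimal base R sketch. It does not use RGMQL or the GMQL engine; the `regions` and `metadata` data frames, the `cell` attribute and all values are hypothetical toy data, chosen only to show how an aggregate over region attributes becomes a new metadata attribute of each sample.

```r
# Toy stand-in for a GMQL dataset (hypothetical data, not RGMQL objects):
# one row per region, tagged with the sample it belongs to.
regions <- data.frame(
  sample = c("S1", "S1", "S1", "S2", "S2"),
  pvalue = c(0.05, 0.01, 0.20, 0.30, 0.02)
)

# Existing metadata, one row per sample ("cell" is a made-up attribute).
metadata <- data.frame(sample = c("S1", "S2"), cell = c("HeLa", "K562"))

# Apply each aggregate sample by sample, as EXTEND does conceptually.
per_sample <- split(regions$pvalue, regions$sample)
new_meta <- data.frame(
  sample      = names(per_sample),
  RegionCount = vapply(per_sample, length, integer(1), USE.NAMES = FALSE),
  MinP        = vapply(per_sample, min, numeric(1), USE.NAMES = FALSE)
)

# Add the new attributes to the existing metadata; regions are left untouched.
extended <- merge(metadata, new_meta, by = "sample")
print(extended)
```

In this sketch `RegionCount` and `MinP` play the role of the keys passed to `extend`, and `length` and `min` stand in for the COUNT and MIN aggregates.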

Usage

extend(.data, ...)

## S4 method for signature 'GMQLDataset'
extend(.data, ...)

Arguments

.data

GMQLDataset class object

...

a series of comma-separated expressions of the form key = aggregate, where aggregate is an object of class AGGREGATES. The available aggregate functions are: SUM, COUNT, MIN, MAX, AVG, MEDIAN, STD, BAG, BAGD, Q1, Q2, Q3. Every aggregate takes a string naming a region attribute, except COUNT, which takes no argument. The attribute passed to an aggregate function must exist in the schema, i.e. among the region attributes. Two styles are allowed:

  • list of key-value pairs: e.g. sum = SUM("pvalue")

  • list of values: e.g. SUM("pvalue")

The two styles cannot be mixed in the same call.

Value

GMQLDataset object. It contains the value to be used as input for the subsequent GMQLDataset method.

Examples

## These statements initialize and run the GMQL server for local execution 
## and creation of results on disk. Then, with system.file(), they define 
## the path to the folder "DATASET" in the subdirectory "example" 
## of the package "RGMQL" and open that folder as a GMQL dataset 
## named "data"

init_gmql()
test_path <- system.file("example", "DATASET", package = "RGMQL")
data <- read_gmql(test_path)

## This statement counts the regions in each sample and stores their number 
## as value of the new metadata attribute RegionCount of the sample.

e <- extend(data, RegionCount = COUNT())

## This statement copies all samples of the 'data' dataset into the 'res' 
## dataset, and then calculates two new metadata attributes for each of them:
##  1. RegionCount is the number of sample regions;
##  2. MinP is the minimum pvalue of the sample regions.
## The sample regions of 'res' are the same as the ones in 'data'.

res <- extend(data, RegionCount = COUNT(), MinP = MIN("pvalue"))
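The two argument styles described under Arguments can be sketched on the same example dataset as follows. This is an illustrative sketch rather than part of the package examples: the variable names `by_key` and `by_value` and the keys `SumP` and `MaxP` are chosen here for illustration, and the sketch assumes, as the examples above do, that `pvalue` is a region attribute in the schema of the example dataset.

```r
library(RGMQL)

init_gmql()
test_path <- system.file("example", "DATASET", package = "RGMQL")
data <- read_gmql(test_path)

# Key-value style: each aggregate is given an explicit name for the
# new metadata attribute (SumP and MaxP are illustrative names).
by_key <- extend(data, SumP = SUM("pvalue"), MaxP = MAX("pvalue"))

# Value-only style: aggregates are passed without keys.
by_value <- extend(data, SUM("pvalue"), MAX("pvalue"))
```

A call that mixes the two styles, with some aggregates named and others not, is the case that the Arguments section rules out.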

RGMQL documentation built on Nov. 8, 2020, 5:59 p.m.