map: Method map


Description

Wrapper for the GMQL MAP operator.

For each pair of a left dataset sample and a right dataset sample, and for each region of the left sample, it computes aggregates over the values of the right sample regions that intersect that left region. The number of output samples equals the size of the Cartesian product of the samples in the two input datasets; each output sample has the same regions as the related left input sample, with their attributes and values, plus the attributes computed as aggregates over right region values. Output sample metadata are the union of the related input sample metadata, with attribute names prefixed with 'left' or 'right', respectively.
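The per-region semantics described above can be illustrated with a small base R sketch for a single left sample and a single right sample. This is not how RGMQL or the GMQL engine implements MAP; the toy data frames, the helper name map_one, the half-open overlap test, and the use of NA for regions without overlaps are all assumptions made for illustration.

```r
# Toy region tables (hypothetical data): one left sample and one right sample.
left <- data.frame(chr   = c("chr1", "chr1", "chr2"),
                   start = c(100, 500, 100),
                   stop  = c(200, 600, 300))
right <- data.frame(chr   = c("chr1", "chr1", "chr1", "chr2"),
                    start = c(150, 180, 700, 250),
                    stop  = c(160, 250, 800, 400),
                    score = c(5, 2, 9, 7))

# For the i-th left region, select the right regions on the same chromosome
# that overlap it (assumed half-open intervals), then aggregate over them.
map_one <- function(i) {
  hit <- right$chr == left$chr[i] &
    right$start < left$stop[i] &
    right$stop > left$start[i]
  c(count    = sum(hit),
    minScore = if (any(hit)) min(right$score[hit]) else NA_real_)
}

# The output keeps every left region and appends the aggregate columns.
mapped <- cbind(left, t(sapply(seq_len(nrow(left)), map_one)))
print(mapped)
```

Every left region appears in the result, including the one with no overlapping right region, which mirrors the statement that each output sample has the same regions as the related left sample.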

Usage

map(x, y, ...)

## S4 method for signature 'GMQLDataset'
map(x, y, ..., joinBy = conds(),
  count_name = "")

Arguments

x

GMQLDataset class object

y

GMQLDataset class object

...

a series of comma-separated expressions in the form key = aggregate, where aggregate is an object of class AGGREGATES. The available aggregate functions are: SUM, COUNT, MIN, MAX, AVG, MEDIAN, STD, BAG, BAGD, Q1, Q2, Q3. Every aggregate takes a string value naming a region attribute, except COUNT, which takes no value. The attribute passed to an aggregate function must exist in the schema, i.e. among the region attributes. Two styles are allowed:

  • list of key-value pairs: e.g. sum = SUM("pvalue")

  • list of values: e.g. SUM("pvalue")

Mixing the two styles in the same call is not allowed.

joinBy

a call to the conds function, which supports methods with a groupBy or joinBy input parameter; it lists the metadata attributes on which samples are joined

count_name

string defining the metadata count name; if not specified, the name is "count_left_right"

Details

When the joinBy clause is present, only pairs of samples from the x dataset and the y dataset, with metadata M1 and M2 respectively, that satisfy the joinBy condition are considered.

The clause consists of a list of metadata attribute names that must be present with equal values in both M1 and M2.
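The effect of the joinBy clause on sample pairing can be sketched in base R as follows. This is an illustration only, not RGMQL code: the metadata tables, the sample identifiers, and the simplification that each sample carries a single value per metadata attribute are assumptions.

```r
# Hypothetical metadata tables: one row per sample, one value per attribute.
x_meta <- data.frame(x_sample    = c("r1", "r2"),
                     cell_tissue = c("liver", "brain"))
y_meta <- data.frame(y_sample    = c("e1", "e2", "e3"),
                     cell_tissue = c("liver", "liver", NA))

# Without joinBy: every (x, y) sample pair is considered (Cartesian product).
all_pairs <- expand.grid(x_sample = x_meta$x_sample,
                         y_sample = y_meta$y_sample,
                         stringsAsFactors = FALSE)

# With joinBy on cell_tissue: keep only pairs where the attribute is present
# in both samples and has equal values.
y_present <- y_meta[!is.na(y_meta$cell_tissue), ]
joined <- merge(x_meta, y_present, by = "cell_tissue")

print(nrow(all_pairs))
print(joined)
```

Here sample e3 has no cell_tissue value and sample r2 has no partner with the same value, so only the pairs (r1, e1) and (r1, e2) remain.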

Value

GMQLDataset object. It contains the value to use as input for the subsequent GMQLDataset method.

Examples

## These statements initialize and run the GMQL server for local execution 
## and creation of results on disk. Then, with system.file(), they define 
## the paths to the folders "DATASET" and "DATASET_GDM" in the subdirectory 
## "example" of the package "RGMQL", and open these folders as GMQL 
## datasets named "exp" and "ref", respectively, using CustomParser

init_gmql()
test_path <- system.file("example", "DATASET", package = "RGMQL")
test_path2 <- system.file("example", "DATASET_GDM", package = "RGMQL")
exp = read_gmql(test_path)
ref = read_gmql(test_path2)

## This statement counts the number of regions in each sample of the exp 
## dataset that overlap with a ref dataset region, and for each ref region 
## it computes the minimum score of all the regions in each exp sample that 
## overlap with it. The MAP joinBy option ensures that only the exp samples 
## referring to the same 'cell_tissue' as a ref sample are mapped on that 
## ref sample; exp samples with no cell_tissue metadata attribute, or with 
## that metadata attribute but a value different from the one(s) 
## of the ref sample(s), are disregarded.

out = map(ref, exp, minScore = MIN("score"), joinBy = conds("cell_tissue"))

RGMQL documentation built on Nov. 8, 2020, 5:59 p.m.