cover | R Documentation |
Wrapper to GMQL COVER operator
It takes as input a dataset containing one or more samples and returns another dataset (with a single sample, if no groupBy option is specified) by “collapsing” the input samples and their regions according to rules specified by the input parameters.

The attributes of the output genomic regions are only the region coordinates and two Jaccard indexes (JaccardIntersect and JaccardResult). Jaccard indexes are standard measures of similarity of the contributing regions and are added as default region attributes. JaccardIntersect is the ratio between the length of the intersection and the length of the union of the contributing regions; JaccardResult is the ratio between the length of the result region and the length of the union of the contributing regions. If aggregate functions are specified, a new region attribute is added for each of them. Output metadata are the union of the input ones.

If a groupBy clause is specified, the input samples are partitioned into groups, each with distinct values of the grouping metadata attributes, and the cover operation is applied separately to each group, yielding one sample in the result for each group. Input samples that do not satisfy the groupBy condition are disregarded.
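The two Jaccard ratios described above can be illustrated with a small base R sketch that does not need a GMQL server. The helper names `union_length` and `jaccard_indexes` are hypothetical and not part of RGMQL, and the sketch assumes half-open `[start, stop)` coordinates, so a region's length is `stop - start`. It shows only the arithmetic of the definitions, not how the COVER operator computes them internally.

```r
# Hypothetical helper (not an RGMQL function): total length covered by a set
# of [start, stop) intervals, merging overlapping intervals before measuring.
union_length <- function(start, stop) {
  ord <- order(start)
  start <- start[ord]
  stop <- stop[ord]
  total <- 0
  cur_start <- start[1]
  cur_stop <- stop[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_stop) {
      # Overlapping or touching interval: extend the current merged block
      cur_stop <- max(cur_stop, stop[i])
    } else {
      # Gap: close the current block and open a new one
      total <- total + (cur_stop - cur_start)
      cur_start <- start[i]
      cur_stop <- stop[i]
    }
  }
  total + (cur_stop - cur_start)
}

# Hypothetical helper: the two ratios as defined in the description, for one
# result region [res_start, res_stop) and its contributing regions.
jaccard_indexes <- function(start, stop, res_start, res_stop) {
  u <- union_length(start, stop)
  inter <- max(0, min(stop) - max(start))  # region common to all contributors
  c(JaccardIntersect = inter / u,
    JaccardResult = (res_stop - res_start) / u)
}

# Three contributing regions; with at least two overlapping regions required,
# the result region is [20, 50).
ji <- jaccard_indexes(start = c(10, 20, 30), stop = c(50, 60, 40),
                      res_start = 20, res_stop = 50)
print(ji)
```

In this toy case the union is `[10, 60)` (length 50), the intersection is `[30, 40)` (length 10) and the result is `[20, 50)` (length 30), so the two ratios are 10/50 and 30/50.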
cover(.data, ...)
## S4 method for signature 'GMQLDataset'
cover(.data, min_acc, max_acc, groupBy = conds(), variation = "cover", ...)
.data |
GMQLDataset class object |
... |
a series of expressions separated by commas, in the form
key = aggregate, where aggregate is an object of
class AGGREGATES. The aggregate functions available are:
"mixed style" is not allowed |
min_acc |
minimum number of overlapping regions to be considered during execution. It is an integer, which can also be given as a string. min_acc also accepts:
|
max_acc |
maximum number of overlapping regions to be considered during execution. It is an integer, which can also be given as a string. max_acc also accepts:
|
groupBy |
|
variation |
string identifying the cover GMQL operator variation. The admissible strings are:
It can be all caps or lowercase |
GMQLDataset object. It contains the value to use as input for the subsequent GMQLDataset method.
## These statements initialize and run the GMQL server for local execution
## and creation of results on disk. Then, system.file() defines
## the path to the folder "DATASET" in the subdirectory "example"
## of the package "RGMQL", and read_gmql() opens that folder as a GMQL
## dataset named "exp" using CustomParser
init_gmql()
test_path <- system.file("example", "DATASET", package = "RGMQL")
exp = read_gmql(test_path)
## The following statement produces an output dataset with a single output
## sample. The COVER operation considers all areas defined by a minimum
## of two overlapping regions in the input samples, up to any amount of
## overlapping regions.
res = cover(exp, 2, ANY())
## The following GMQL statement computes the result grouping the input
## exp samples by the values of their cell metadata attribute,
## thus one output res sample is generated for each cell value;
## output regions are produced where at least 2 and at most 3 regions
## of grouped exp samples overlap, setting as attributes of the resulting
## regions the minimum pvalue of the overlapping regions (min_pValue)
## and their Jaccard indexes (JaccardIntersect and JaccardResult).
res = cover(exp, 2, 3, groupBy = conds("cell"), min_pValue = MIN("pvalue"))
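
The accumulation bounds used in these examples (`min_acc`, `max_acc`) can also be explored on plain numeric intervals. The following base R sketch is only a simplified reading of the description: it keeps the stretches where the number of overlapping input regions lies between the two bounds and merges adjacent stretches. `cover_sketch` is a hypothetical name, not an RGMQL function; coordinates are assumed half-open, `max_acc = Inf` stands in for `ANY()`, and the real COVER operator may treat edge cases differently.

```r
# Hypothetical illustration (not the RGMQL implementation): find the stretches
# where between min_acc and max_acc input regions overlap.
cover_sketch <- function(start, stop, min_acc, max_acc = Inf) {
  # Breakpoints split the axis into segments with constant accumulation
  pts <- sort(unique(c(start, stop)))
  if (length(pts) < 2) {
    return(data.frame(start = numeric(0), stop = numeric(0)))
  }
  seg_start <- pts[-length(pts)]
  seg_stop <- pts[-1]
  # Accumulation: number of input regions that fully contain each segment
  acc <- vapply(seq_along(seg_start), function(i) {
    sum(start <= seg_start[i] & stop >= seg_stop[i])
  }, integer(1))
  keep <- acc >= min_acc & acc <= max_acc
  # Merge runs of consecutive kept segments into single output regions
  runs <- rle(keep)
  ends <- cumsum(runs$lengths)
  begins <- ends - runs$lengths + 1
  data.frame(start = seg_start[begins[runs$values]],
             stop = seg_stop[ends[runs$values]])
}

# Toy input: three regions on the same chromosome and strand
starts <- c(10, 20, 30)
stops <- c(50, 60, 40)

at_least_two <- cover_sketch(starts, stops, min_acc = 2)      # like 2, ANY()
exactly_three <- cover_sketch(starts, stops, min_acc = 3, max_acc = 3)
print(at_least_two)
print(exactly_three)
```

On this toy input the accumulation is 1, 2, 3, 2, 1 over the five segments between 10 and 60, so the first call keeps `[20, 50)` and the second keeps `[30, 40)`.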