filter_and_extract: Filter and extract function


View source: R/filter-extract-function.R

Description

This function lets the user create a new GRangesList with a fixed part (seqnames, ranges and strand) and a variable part made up of the region attributes given as input. The metadata and metadata_prefix arguments are used to filter the data, keeping only the samples that match at least one metadata attribute with its prefix. The requested region attributes are reported for each sample that passes the filter.

Usage

filter_and_extract(data, metadata = NULL, metadata_prefix = NULL,
  region_attributes = NULL, suffix = "antibody_target")

Arguments

data

string with the GMQL dataset folder path, or a GRangesList object

metadata

vector of strings containing the names of the metadata attributes to be searched for in the metadata files. A sample is extracted if at least one condition is satisfied; this condition is logically ANDed with the prefix filtering (see below). If NULL, no filtering occurs (i.e., every sample is taken for region filtering)

metadata_prefix

vector of strings that supports the metadata filtering. If defined, each element of 'metadata' is concatenated with the corresponding prefix.

region_attributes

vector of strings naming the region attributes to extract; if NULL, no region attribute is taken and the output is a GRanges made up only of the region coordinate attributes (seqnames, start, end, strand)

suffix

name for each metadata column of the GRanges. By default it is the value of the metadata attribute named "antibody_target". This string is taken from the sample metadata file or from the associated metadata(). If not present, the column name is the name of the selected regions specified by the 'region_attributes' input parameter
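The interplay between 'metadata' and 'metadata_prefix' can be pictured with a small base R sketch. It is not the package's implementation: the attribute names, the prefixes and the toy samples are hypothetical, and it assumes that each prefix is placed in front of its attribute name.

```r
# Hypothetical attribute names and prefixes, for illustration only.
metadata <- c("antibody_target", "cell")
metadata_prefix <- c("", "biosample_")

# Assumption: each prefix is placed in front of the corresponding name.
searched <- paste0(metadata_prefix, metadata)

# Toy stand-in for the attribute names found in two samples' metadata.
samples <- list(
  S1 = c("antibody_target", "organism"),
  S2 = c("organism", "treatment")
)

# A sample is kept when at least one searched name is present.
keep <- vapply(samples, function(attrs) any(searched %in% attrs), logical(1))
names(samples)[keep]
```

Under these assumptions only the samples whose metadata contain at least one of the prefixed names would go on to region extraction.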

Details

This function works only with a dataset or GRangesList in which all samples or GRanges have the same region coordinates (chr, ranges, strand), ordered in the same way for each sample.

When the input data is a GRangesList, the function searches for metadata in the metadata() associated with the GRangesList.
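The coordinate requirement can be checked before calling the function. The sketch below is one possible way to do it with GenomicRanges; same_coordinates is a hypothetical helper, not part of RGMQL, and the two toy samples are made up for illustration.

```r
library(GenomicRanges)

# Hypothetical helper (not part of RGMQL): TRUE when every sample has the
# same region coordinates as the first one, in the same order.
same_coordinates <- function(grl) {
  ref <- granges(grl[[1]])  # granges() drops the metadata columns
  all(vapply(seq_along(grl), function(i) {
    cur <- granges(grl[[i]])
    length(cur) == length(ref) && all(cur == ref)
  }, logical(1)))
}

# Two toy samples holding the same regions in a different order.
gr1 <- GRanges(seqnames = c("chr1", "chr2"),
               ranges = IRanges(start = c(1, 100), end = c(50, 200)),
               strand = c("+", "-"), pvalue = c(0.01, 0.2))
gr2 <- GRanges(seqnames = c("chr2", "chr1"),
               ranges = IRanges(start = c(100, 1), end = c(200, 50)),
               strand = c("-", "+"), pvalue = c(0.5, 0.03))
grl <- GRangesList(s1 = gr1, s2 = gr2)

same_coordinates(grl)        # samples as built, regions in different order
same_coordinates(sort(grl))  # after sorting each sample
```

Sorting each sample first, as the second example in the Examples section does, is a simple way to bring all samples to a common order when they contain the same regions.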

Value

GRanges with selected regions

Examples

## This statement defines the path to the folder "DATASET" in the 
## subdirectory "example" of the package "RGMQL" and filters that dataset 
## folder, including in the output only the "pvalue" and "peak" region 
## attributes

test_path <- system.file("example", "DATASET", package = "RGMQL")
filter_and_extract(test_path, region_attributes = c("pvalue", "peak"))

## This statement imports a GMQL dataset as a GRangesList and filters it, 
## including in the output only the "pvalue" and "peak" region attributes; 
## the sort function makes sure that the region coordinates (chr, ranges, 
## strand) of all samples are ordered in the same way

grl <- import_gmql(test_path, TRUE)
sorted_grl <- sort(grl)
filter_and_extract(sorted_grl, region_attributes = c("pvalue", "peak"))

RGMQL documentation built on Nov. 8, 2020, 5:59 p.m.