queryEncodeGeneric: Produce a subset of data following predefined criteria.


View source: R/query.R

Description

After running the prepare_ENCODEDb function, this function will allow you to extract a subset of the files it describes. Search terms are passed in as named parameters, where the parameter's name indicates the field, and its value the terms to be searched for. Each term may be a vector of values, which are processed using the OR logical operation (the function will return all results matching at least one of the terms). In contrast, separate search fields are subjected to the AND logical operation.
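
The AND/OR semantics described above can be illustrated with a minimal base R sketch. The helper `toy_query` and the data frame `toy_df` are hypothetical and are not part of ENCODExplorer; they only mimic exact matching on a small table and say nothing about how the package implements the search.

```r
# Hypothetical toy metadata; the column names mirror fields used on this page.
toy_df <- data.frame(
  biosample_name = c("A549", "A549", "HeLa-S3", "MCF-7"),
  file_format    = c("bam", "fastq", "bam", "bam"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: each named argument names a field, and its value is
# a vector of terms to search for in that field.
toy_query <- function(df, ...) {
  terms <- list(...)
  keep <- rep(TRUE, nrow(df))
  for (field in names(terms)) {
    # OR within a field: a row passes if it matches any of the terms.
    # AND across fields: a row must pass every field to be kept.
    keep <- keep & (df[[field]] %in% terms[[field]])
  }
  df[keep, , drop = FALSE]
}

res <- toy_query(toy_df,
                 biosample_name = c("A549", "HeLa-S3"),
                 file_format = "bam")
```

Here `res` keeps only the rows whose biosample is either "A549" or "HeLa-S3" and whose file format is "bam".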

Usage

queryEncodeGeneric(
  df = get_encode_df(),
  fixed = TRUE,
  quiet = FALSE,
  fuzzy = FALSE,
  ...
)

Arguments

df

data.frame containing ENCODE experiment and dataset metadata

fixed

logical. If TRUE, each search term is matched exactly as written. If FALSE, case-insensitive Perl regular expression matching is used.

quiet

logical. If TRUE, the result summary information is not printed.

fuzzy

logical. If TRUE while fixed is also TRUE, allows searching by substrings and by alternate space or hyphenation spellings. For example, "MCF7" will match "MCF-7", and "RNA-Seq" will match "polyA mRNA RNA-Seq".

...

All other named parameters are used as terms to be searched for, with the parameter name naming the field (biosample_name, assay, etc.) and the value being the terms that are searched for.

Details

Possible search fields include the following: accession, assay name, biosample, dataset accession, file accession, file format, laboratory, donor organism, target and treatment.

By default, the query is made using exact matches. Set fixed to FALSE to use regular expression matching, or set fuzzy to TRUE (leaving fixed at TRUE) to search for substrings or alternate hyphenations. Regular expression matching and fuzzy matching cannot be combined.
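
The difference between the three matching modes can be sketched in base R. This is only an approximation under stated assumptions: the helpers `normalize` and `fuzzy_match` are hypothetical, and stripping spaces and hyphens before a substring search is an assumed way to reproduce the behaviour described for fuzzy, not the package's actual code.

```r
# Toy assay values, not real ENCODE metadata.
assays <- c("RNA-Seq", "polyA mRNA RNA-Seq", "small RNA-Seq", "ChIP-Seq")

# Exact matching (fixed = TRUE, fuzzy = FALSE): the term must equal the value.
exact_hits <- assays == "RNA-Seq"

# Regular expression matching (fixed = FALSE): case-insensitive Perl regex.
regex_hits <- grepl("rna-seq", assays, ignore.case = TRUE, perl = TRUE)

# Assumed fuzzy matching: drop spaces and hyphens on both sides, then look
# for the term as a literal substring.
normalize <- function(x) gsub("[ -]", "", x)
fuzzy_match <- function(term, values) {
  grepl(normalize(term), normalize(values), fixed = TRUE)
}

fuzzy_hits <- fuzzy_match("RNA-Seq", assays)
cell_hits  <- fuzzy_match("MCF7", c("MCF-7", "A549"))
```

In this sketch, exact matching selects only the first assay, while the regular expression and fuzzy searches also select "polyA mRNA RNA-Seq" and "small RNA-Seq"; `cell_hits` shows "MCF7" matching "MCF-7" but not "A549".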

Value

A data.frame containing data about ENCODE experiments and datasets.

Examples

    # Will return all bam files from biosample A549.
    res = queryEncodeGeneric(biosample_name = "A549", file_format = "bam")

    # Will return all bam files from biosamples A549 and HeLa-S3.
    res = queryEncodeGeneric(biosample_name = c("A549", "HeLa-S3"), file_format = "bam")

    # Will return all files where the assay contains "RNA-Seq" as a substring,
    # such as "polyA mRNA RNA-Seq" or "small RNA-Seq".
    res = queryEncodeGeneric(assay="RNA-Seq", fuzzy=TRUE)

CharlesJB/ENCODExplorer documentation built on Dec. 9, 2019, 10:11 a.m.