get_dataset: Extract data from RegulonDB

View source: R/get_dataset.R

Description

This function retrieves data from RegulonDB. Attributes from datasets can be selected and filtered.

Usage

get_dataset(
  regulondb,
  dataset = NULL,
  attributes = NULL,
  filters = NULL,
  and = TRUE,
  interval = NULL,
  partialmatch = NULL,
  output_format = "regulondb_result"
)

Arguments

regulondb

A regulondb() object.

dataset

Dataset of interest. Use the function list_datasets for an overview of valid datasets.

attributes

Vector of attributes to be retrieved.

filters

List of filters to be used. The names should correspond to the attribute and the values correspond to the condition for selection.

and

Logical argument. If TRUE (the default), all filters must match ("AND"). If FALSE, filters are combined with the "OR" operator.

interval

Names of the filters whose values should be treated as an interval (a lower and an upper bound).

partialmatch

Names of the filters whose values are string patterns to be matched fully or partially in the query.

output_format

A string specifying the output format. Possible options are "regulondb_result", "GRanges", "DNAStringSet" or "BStringSet".
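
The way filters, and, interval and partialmatch combine can be sketched on a toy data frame in base R. This is only an illustration of the semantics described above, not the code get_dataset runs: the helpers match_filter and toy_filter are hypothetical, the gene table is made up, and the assumption that an interval includes both bounds is ours.

```r
# Toy stand-in for a dataset; the values are invented for illustration.
genes <- data.frame(
    name = c("araA", "araC", "lacZ", "crp"),
    strand = c("reverse", "forward", "reverse", "forward"),
    posright = c(70048, 71265, 366305, 3486120),
    stringsAsFactors = FALSE
)

# Hypothetical helper: one logical vector (one entry per row) for one filter.
match_filter <- function(data, attribute, values,
                         interval = FALSE, partial = FALSE) {
    column <- data[[attribute]]
    if (interval) {
        # Treat the values as lower and upper bounds (assumed inclusive).
        bounds <- range(as.numeric(values))
        return(column >= bounds[1] & column <= bounds[2])
    }
    if (partial) {
        # TRUE when any of the patterns occurs anywhere in the value.
        hits <- lapply(values, function(p) grepl(p, column, fixed = TRUE))
        return(Reduce(`|`, hits))
    }
    # Default: exact match against any of the values.
    column %in% values
}

# Hypothetical helper: combine the per-filter results with AND or OR.
toy_filter <- function(data, filters, and = TRUE,
                       interval = NULL, partialmatch = NULL) {
    masks <- lapply(names(filters), function(attribute) {
        match_filter(
            data, attribute, filters[[attribute]],
            interval = attribute %in% interval,
            partial = attribute %in% partialmatch
        )
    })
    combine <- if (and) `&` else `|`
    data[Reduce(combine, masks), , drop = FALSE]
}

filters <- list(
    name = "ara",
    strand = "forward",
    posright = c("2000", "100000")
)

# and = TRUE: a row must satisfy every filter.
both <- toy_filter(genes, filters,
    and = TRUE, interval = "posright", partialmatch = "name"
)

# and = FALSE: a row needs to satisfy only one filter.
either <- toy_filter(genes, filters,
    and = FALSE, interval = "posright", partialmatch = "name"
)
```

With and = TRUE only the row matching all three conditions is kept, while and = FALSE keeps every row that matches at least one of them.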

Value

By default, a regulondb_result object. If requested through the output_format parameter, the function returns a GRanges object or a Biostrings object (DNAStringSet or BStringSet) instead.

Author(s)

Carmina Barberena Jonas, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chávez

Examples

## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()

## Build the regulon db object
e_coli_regulondb <-
    regulondb(
        database_conn = regulondb_conn,
        organism = "E.coli",
        database_version = "1",
        genome_version = "1"
    )

## Obtain all the information from the "GENE" dataset
get_dataset(e_coli_regulondb, dataset = "GENE")

## Get the attributes posright and name from the "GENE" dataset
get_dataset(e_coli_regulondb,
    dataset = "GENE",
    attributes = c("posright", "name")
)

## From the "GENE" dataset, get the name, strand, posright, product name
## and id of all genes whose name contains "ara", that lie on the forward
## strand, and whose right position is between 2000 and 40000
get_dataset(
    e_coli_regulondb,
    dataset = "GENE",
    attributes = c("name", "strand", "posright", "product_name", "id"),
    filters = list(
        name = c("ara"),
        strand = c("forward"),
        posright = c("2000", "40000")
    ),
    and = TRUE,
    partialmatch = "name",
    interval = "posright"
)

regutools documentation built on Dec. 20, 2020, 2 a.m.