View source: R/perform_search.R
perform_search | R Documentation |
This function allows users to perform highly customizable searches in the RCSB Protein Data Bank (PDB) by specifying detailed search criteria. It interfaces directly with the RCSB PDB's RESTful API, enabling complex queries to retrieve specific data, such as PDB entries, assemblies, polymer entities, non-polymer entities, and more.
perform_search(
  search_operator,
  return_type = "ENTRY",
  request_options = NULL,
  return_with_scores = FALSE,
  return_raw_json_dict = FALSE,
  verbosity = TRUE
)
search_operator |
An object that specifies the search criteria. This object can be constructed using various operator functions:
Please see the Details section. |
return_type |
A string specifying the type of data to return. The available options for
|
request_options |
A list of additional options to further customize the search request. These options can include:
|
return_with_scores |
Logical; if |
return_raw_json_dict |
Logical; if |
verbosity |
Logical; if |
The operators allow you to build complex search queries tailored to your specific needs. Detailed documentation for each search operator can be found in the RCSB PDB Search Operators. The searchable attributes include annotations from the mmCIF dictionary, external resources, and those added by RCSB PDB. Both internal additions to the mmCIF dictionary and external resource annotations are prefixed with 'rcsb_'. For a complete list of available attributes for text searches, refer to the Structure Attributes Search and Chemical Attributes Search pages.
The function returns search results based on the specified return_type:
ENTRY
A vector of PDB IDs that match the search criteria.
ASSEMBLY
A list of PDB IDs with appended assembly IDs, formatted as "PDB_ID-ASSEMBLY_ID".
POLYMER_ENTITY
A list of PDB IDs with appended entity IDs for polymeric chains.
NON_POLYMER_ENTITY
A list of PDB IDs with appended entity IDs for non-polymeric components.
POLYMER_INSTANCE
A list of PDB IDs with appended asym IDs for specific polymer instances.
CHEMICAL_COMPONENT
A list of chemical component identifiers.
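The composite identifiers described above can be split back into their parts with base R. The following is a minimal sketch that assumes the "PDB_ID-ASSEMBLY_ID" format stated for the ASSEMBLY return type. The identifiers are hypothetical sample values, not real search results, and the separators used by the other return types are not covered here.

```r
# Hypothetical ASSEMBLY-style identifiers, used here only for illustration
assembly_ids <- c("1ABC-1", "1ABC-2", "2XYZ-1")

# Split each identifier at the hyphen into PDB ID and assembly ID
parts <- strsplit(assembly_ids, "-", fixed = TRUE)
parsed <- data.frame(
  pdb_id = vapply(parts, `[`, character(1), 1),
  assembly_id = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)

# Unique entries that have at least one matching assembly
matched_entries <- unique(parsed$pdb_id)
matched_entries
```

Keeping the parsed pieces in a data frame makes it straightforward to count assemblies per entry or to join the result against other tables.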
# Example 1: Search for Polymer Entities from Mus musculus and Homo sapiens
search_operator <- InOperator(
  attribute = "rcsb_entity_source_organism.taxonomy_lineage.name",
  value = c("Mus musculus", "Homo sapiens")
)
results <- perform_search(
  search_operator = search_operator,
  return_type = "POLYMER_ENTITY"
)
results
# Example 2: Search for Entries Released After a Specific Date
operator_date <- ComparisonOperator(
  attribute = "rcsb_accession_info.initial_release_date",
  value = "2019-08-20",
  comparison_type = "GREATER"
)
request_options <- list(
  facets = list(
    list(
      name = "Methods",
      aggregation_type = "terms",
      attribute = "exptl.method"
    )
  )
)
results <- perform_search(
  search_operator = operator_date,
  return_type = "ENTRY",
  request_options = request_options
)
results
# Example 3: Search for Symmetric Dimers with DNA-Binding Domain
operator_symbol <- ExactMatchOperator(
  attribute = "rcsb_struct_symmetry.symbol",
  value = "C2"
)
operator_kind <- ExactMatchOperator(
  attribute = "rcsb_struct_symmetry.kind",
  value = "Global Symmetry"
)
operator_full_text <- DefaultOperator(
  value = "\"heat-shock transcription factor\""
)
operator_dna_count <- ComparisonOperator(
  attribute = "rcsb_entry_info.polymer_entity_count_DNA",
  value = 1,
  comparison_type = "GREATER_OR_EQUAL"
)
query_group <- list(
  type = "group",
  logical_operator = "and",
  nodes = list(
    list(
      type = "terminal",
      service = "text",
      parameters = operator_symbol
    ),
    list(
      type = "terminal",
      service = "text",
      parameters = operator_kind
    ),
    list(
      type = "terminal",
      service = "full_text",
      parameters = operator_full_text
    ),
    list(
      type = "terminal",
      service = "text",
      parameters = operator_dna_count
    )
  )
)
results <- perform_search(
  search_operator = query_group,
  return_type = "ASSEMBLY"
)
results
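The terminal nodes in Example 3 all share the same three fields, so a small helper can cut the repetition. This is only a sketch: `make_terminal()` is a hypothetical name and not part of the package, and the operator objects below are plain-list placeholders so that the code runs without the package installed. In practice they would come from `ExactMatchOperator()`, `DefaultOperator()`, and the other operator functions.

```r
# Hypothetical helper (not part of the package): wrap an operator in a
# terminal node with the same fields used in Example 3
make_terminal <- function(parameters, service = "text") {
  list(type = "terminal", service = service, parameters = parameters)
}

# Placeholder operator objects (assumption: stand-ins for real operator output)
operator_symbol <- list(attribute = "rcsb_struct_symmetry.symbol", value = "C2")
operator_full_text <- list(value = "\"heat-shock transcription factor\"")

# Build the group node from the helper output
query_group <- list(
  type = "group",
  logical_operator = "and",
  nodes = list(
    make_terminal(operator_symbol),
    make_terminal(operator_full_text, service = "full_text")
  )
)

# Inspect the nested structure before passing it to a search call
str(query_group, max.level = 2)
```

Defaulting `service` to "text" reflects that three of the four nodes in Example 3 use that service, so only the full-text node needs to override it.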