data_fetcher | R Documentation |
The 'data_fetcher' function provides a flexible way to access data from the RCSB Protein Data Bank (PDB). By specifying an identifier, data type, and a set of properties, users can tailor the data retrieval process to meet their specific research needs. The function integrates several steps, including validating IDs, generating a JSON query, fetching the data, and formatting the response.
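The query-generation step can be pictured with a small sketch. The helper below is hypothetical and is not the package's implementation: `build_query_sketch`, its `root` and `id_arg` arguments, and the exact query layout are assumptions chosen for illustration. It only shows how a named list of properties could be flattened into a GraphQL-style query string.

```r
# Hypothetical helper (not part of the documented API): flattens a named list
# of properties into a GraphQL-style query string. The root field name and
# the ID argument name are assumptions.
build_query_sketch <- function(id, properties,
                               root = "entries", id_arg = "entry_ids") {
  stopifnot(is.character(id), length(id) > 0,
            is.list(properties), !is.null(names(properties)))
  # Quote each ID and join the IDs into a JSON-style array.
  id_array <- paste0("[", paste0('"', id, '"', collapse = ", "), "]")
  # Each top-level property becomes "name {field1, field2}"; a property with
  # no sub-fields is emitted as a bare name.
  fields <- vapply(names(properties), function(name) {
    sub <- unlist(properties[[name]])
    if (length(sub) == 0) {
      name
    } else {
      paste0(name, " {", paste(sub, collapse = ", "), "}")
    }
  }, character(1))
  paste0("{", root, "(", id_arg, ": ", id_array, ") {",
         paste(fields, collapse = ", "), "}}")
}

query <- build_query_sketch(
  id = "4HHB",
  properties = list(cell = c("length_a", "length_b"), exptl = "method")
)
print(query)
```

The sketch accepts both `c(...)` and `list(...)` sub-field collections, mirroring the two styles used in the examples on this page.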
data_fetcher(
  id = NULL,
  data_type = "ENTRY",
  properties = NULL,
  return_as_dataframe = TRUE,
  verbosity = FALSE
)
id |
A single identifier or a list of identifiers for the data to be fetched. These IDs correspond to the entries, assemblies, polymer entities, or other entities within the RCSB PDB. The ID must match the data type you are querying (e.g., PDB ID for entries, assembly ID for assemblies). |
data_type |
A string specifying the type of data to fetch. The available options for
Each |
properties |
A list or dictionary of properties to be included in the data fetching process. The properties should match the data type you are querying. For example, if you are fetching |
return_as_dataframe |
A boolean indicating whether to return the response as a dataframe. If |
verbosity |
A boolean flag indicating whether to print status messages during the function execution. When set to |
The 'data_fetcher' function is particularly useful for researchers who need to access and analyze specific subsets of PDB data. By providing a list of IDs and the corresponding data type, users can retrieve only the information relevant to their study, reducing the need to manually filter or process large datasets. The function also supports fetching multiple properties simultaneously, allowing for a more comprehensive data retrieval process.
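Because each ID must match the chosen data type, it can help to check ID shapes before calling the function. The following is a minimal sketch under stated assumptions: `id_patterns` and `check_ids_sketch` are hypothetical names, and the regular expressions are inferred from the ID shapes used in the examples on this page (for instance "4HHB" for entries and "4HHB_1" for entities) rather than taken from the package.

```r
# Assumed ID shapes per data type, for illustration only.
id_patterns <- c(
  ENTRY             = "^[0-9][A-Za-z0-9]{3}$",
  POLYMER_ENTITY    = "^[0-9][A-Za-z0-9]{3}_[0-9]+$",
  NONPOLYMER_ENTITY = "^[0-9][A-Za-z0-9]{3}_[0-9]+$"
)

# Hypothetical helper: returns a named logical vector, TRUE where the ID has
# the shape assumed for the given data type.
check_ids_sketch <- function(id, data_type) {
  if (!data_type %in% names(id_patterns)) {
    stop("No pattern defined in this sketch for data type: ", data_type)
  }
  ok <- grepl(id_patterns[[data_type]], id)
  names(ok) <- id
  ok
}

print(check_ids_sketch(c("4HHB", "4HHB_1"), "ENTRY"))
print(check_ids_sketch(c("4HHB_1", "12CA_1"), "POLYMER_ENTITY"))
```

A check like this only catches malformed IDs; it cannot tell whether a well-formed ID actually exists in the database.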
Depending on the value of return_as_dataframe, this function returns either a dataframe or the raw data in its original format. The dataframe format is particularly useful for further data analysis and visualization within R, while the raw format may be preferred for more complex or custom data processing tasks.
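Downstream code that accepts either return shape can branch on the object's class. The sketch below uses mock objects in place of real responses; `describe_result_sketch`, `mock_df`, and `mock_raw` are hypothetical, and the structure of the mock raw list is an assumption rather than the actual response layout.

```r
# Hypothetical helper: summarises a result whether it is a data frame or a
# raw (list-like) response.
describe_result_sketch <- function(result) {
  if (is.data.frame(result)) {
    sprintf("data frame: %d row(s), %d column(s)", nrow(result), ncol(result))
  } else {
    sprintf("raw %s with %d top-level element(s)",
            class(result)[1], length(result))
  }
}

# Mock stand-ins for the two return shapes (not real API output).
mock_df  <- data.frame(ID = "4HHB", method = "X-RAY DIFFRACTION")
mock_raw <- list(data = list(entries = list()))

print(describe_result_sketch(mock_df))
print(describe_result_sketch(mock_raw))
```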
# Example 1: Fetching basic entry information
properties <- list(cell = c("length_a", "length_b", "length_c"), exptl = c("method"))
data_fetcher(
  id = c("4HHB"),
  data_type = "ENTRY",
  properties = properties,
  return_as_dataframe = TRUE
)
# Example 2: Fetching polymer entity data
properties <- list(
  rcsb_entity_source_organism = c("ncbi_taxonomy_id", "ncbi_scientific_name"),
  rcsb_cluster_membership = c("cluster_id", "identity")
)
data_fetcher(
  id = c("4HHB_1", "12CA_1"),
  data_type = "POLYMER_ENTITY",
  properties = properties,
  return_as_dataframe = TRUE
)
# Example 3: Fetching non-polymer entity data
properties <- list(
  rcsb_nonpolymer_entity = c("details", "formula_weight", "pdbx_description"),
  rcsb_nonpolymer_entity_container_identifiers = c("chem_ref_def_id")
)
data_fetcher(
  id = c("3PQR_5", "3PQR_6"),
  data_type = "NONPOLYMER_ENTITY",
  properties = properties,
  return_as_dataframe = TRUE
)
# Example 4: Fetching chemical component data
properties <- list(
  rcsb_id = list(),
  chem_comp = list("type", "formula_weight", "name", "formula"),
  rcsb_chem_comp_info = list("initial_release_date")
)
data_fetcher(
  id = c("NAG", "EBW"),
  data_type = "CHEMICAL_COMPONENT",
  properties = properties,
  return_as_dataframe = TRUE
)