View source: R/parallel_create_structure_contact_map.R
parallel_create_structure_contact_map | R Documentation |
This function is a wrapper around create_structure_contact_map() that allows the use of all
system cores for the creation of contact maps. Alternatively, it can be used for sequential
processing of large datasets. The benefit of this function over create_structure_contact_map()
is that it processes contact maps in batches, which is recommended for large datasets. If used
for parallel processing, it should only be used on systems that have enough memory available.
Workers can either be set up manually before running the function with
future::plan(multisession) or automatically by the function (the maximum number of workers
is 12 in this case). If workers are set up manually, the processing_type argument should
be set to "parallel manual". In this case, workers can be terminated after completion with
future::plan(sequential).
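The manual worker setup described above might look like the following sketch. It assumes that the protti and future packages are installed and that the structure files can be downloaded; the number of workers (2) is an arbitrary choice for illustration and should be adapted to the available memory.

```r
library(protti)
library(future)

# Example input: PDB IDs with optional chain and residue selections
structures <- data.frame(
  pdb_id = c("6NPF", "1C14", "3NIR"),
  chain = c("A", "A", NA),
  auth_seq_id = c("1;2;3;4;5;6;7", NA, NA)
)

# Set up the workers manually (2 is an assumption, not a recommendation)
plan(multisession, workers = 2)

# Tell the function that workers were already set up by the user
contact_maps <- parallel_create_structure_contact_map(
  data = structures,
  id = pdb_id,
  chain = chain,
  auth_seq_id = auth_seq_id,
  split_n = 1,
  processing_type = "parallel manual"
)

# Terminate the workers once processing has finished
plan(sequential)
```

Resetting the plan at the end keeps idle background R sessions from holding on to memory.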
parallel_create_structure_contact_map(
  data,
  data2 = NULL,
  id,
  chain = NULL,
  auth_seq_id = NULL,
  distance_cutoff = 10,
  pdb_model_number_selection = c(0, 1),
  return_min_residue_distance = TRUE,
  export = FALSE,
  export_location = NULL,
  split_n = 40,
  processing_type = "parallel"
)
data |
a data frame containing at least a column with PDB ID information of which the name
can be provided to the |
data2 |
optional, a data frame that contains a subset of regions for which distances to regions
provided in the |
id |
a character column in the |
chain |
optional, a character column in the |
auth_seq_id |
optional, a character (or numeric) column in the |
distance_cutoff |
a numeric value specifying the distance cutoff in Angstrom. All values for pairwise comparisons are calculated but only values smaller than this cutoff will be returned in the output. If a cutoff of e.g. 5 is selected then only residues with a distance of 5 Angstrom and less are returned. Using a small value can reduce the size of the contact map drastically and is therefore recommended. The default value is 10. |
pdb_model_number_selection |
a numeric vector specifying which models from the structure files should be considered for contact maps. For example, NMR structures often have many models in one file. The default for this argument is c(0, 1). This means the first model of each structure file is selected for contact map calculations. For AlphaFold predictions the model number is 0 (only .pdb files), therefore this case is also included here. |
return_min_residue_distance |
a logical value that specifies if the contact map should be returned for all atom distances or the minimum residue distances. Minimum residue distances are smaller in size. If atom distances are not strictly needed it is recommended to set this argument to TRUE. The default is TRUE. |
export |
a logical value that indicates if contact maps should be exported as ".csv". The
name of the file will be the structure ID. Default is FALSE. |
export_location |
optional, a character value that specifies the path to the location in
which the contact map should be saved if export is TRUE. |
split_n |
a numeric value that specifies the number of structures that should be included in each batch. Default is 40. |
processing_type |
a character value that is either "parallel" for parallel processing or
"sequential" for sequential processing. Alternatively, it can be "parallel manual"; in this
case you have to set up the number of cores on your own using future::plan(multisession). |
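To illustrate what the split_n argument controls, the sketch below splits a vector of structure IDs into batches of a given size using base R. It only illustrates the idea of batching and is not the function's internal implementation; make_batches is a hypothetical helper introduced for this example.

```r
# Hypothetical helper: split a vector of IDs into batches of at most split_n IDs
make_batches <- function(ids, split_n = 40) {
  # Batch index per ID: 1 for the first split_n IDs, 2 for the next split_n, ...
  batch_index <- ceiling(seq_along(ids) / split_n)
  unname(split(ids, batch_index))
}

ids <- c("6NPF", "1C14", "3NIR")
batches <- make_batches(ids, split_n = 2)

# Number of IDs in each batch
lengths(batches)
```

A smaller split_n means that fewer structures are held in memory at once, at the cost of more batches.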
A list of contact maps for each PDB or UniProt ID provided in the input is returned.
If the export argument is TRUE, each contact map will be saved as a ".csv" file in the
current working directory or the location provided to the export_location argument.
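Because the result is a list with one element per structure ID, it can be convenient to stack the elements into a single data frame. The sketch below uses a toy stand-in for the returned list; the column names (residue_1, residue_2, distance) and the values are hypothetical and do not reflect the actual columns of a contact map.

```r
# Toy stand-in for the returned list; column names and values are hypothetical
contact_maps <- list(
  "6NPF" = data.frame(residue_1 = c(1, 1), residue_2 = c(2, 3), distance = c(3.8, 5.4)),
  "3NIR" = data.frame(residue_1 = 1, residue_2 = 2, distance = 3.9)
)

# Prepend the structure ID to each contact map, then stack all maps row-wise
combined <- do.call(
  rbind,
  Map(
    function(map, id) cbind(structure_id = id, map),
    contact_maps,
    names(contact_maps)
  )
)
rownames(combined) <- NULL

combined
```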
## Not run:
# Create example data
data <- data.frame(
  pdb_id = c("6NPF", "1C14", "3NIR"),
  chain = c("A", "A", NA),
  auth_seq_id = c("1;2;3;4;5;6;7", NA, NA)
)

# Create contact maps, processing one structure per batch
contact_maps <- parallel_create_structure_contact_map(
  data = data,
  id = pdb_id,
  chain = chain,
  auth_seq_id = auth_seq_id,
  split_n = 1
)

# Inspect the contact map of a single structure
str(contact_maps[["3NIR"]])

contact_maps

## End(Not run)