gbk_features_to_df | R Documentation |
This function processes a list of GenBank features (loaded by read_gbk()) and converts selected features into a data frame. It supports processing multiple gene clusters.
gbk_features_to_df(
  gbk_list,
  feature = "CDS",
  keys = NULL,
  process_region = TRUE
)
gbk_list
A list of lists in which each sub-list contains the GenBank features for one gene cluster. Each sub-list is expected to hold a named list of features, with each feature being a character vector.
feature
A string specifying the feature type to extract from each gene cluster's FEATURE list (e.g., "CDS" or "gene"). Defaults to "CDS".
keys
An optional character vector of keys within the feature to retain in the final data frame. If 'NULL' (the default), all keys within the specified feature are included.
process_region
A logical flag. When 'TRUE' (the default), the 'region' key, if present, is processed to extract 'strand', 'start', and 'end' information.
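To illustrate the kind of processing 'process_region' refers to, here is a minimal, self-contained sketch that splits GenBank location strings into 'strand', 'start', and 'end'. The helper name parse_region(), the strand labels "forward" and "complement", and the handling of join() locations are assumptions made for illustration; the package's own implementation may differ.

```r
# Hypothetical helper, not part of the package: split GenBank location
# strings such as "complement(1201..2100)" into strand, start, and end.
parse_region <- function(region) {
  # Assumption: a location wrapped in complement(...) is on the reverse strand
  strand <- ifelse(grepl("complement", region, fixed = TRUE),
                   "complement", "forward")

  # Extract every run of digits from each location string
  coords <- regmatches(region, gregexpr("[0-9]+", region))

  # Assumption: the first number is the start and the last number is the end,
  # so a join(...) location is collapsed to its outer bounds
  start <- vapply(coords, function(x) as.numeric(x[1]), numeric(1))
  end <- vapply(coords, function(x) as.numeric(x[max(length(x), 1)]), numeric(1))

  data.frame(strand = strand, start = start, end = end,
             stringsAsFactors = FALSE)
}

# Made-up location strings, for illustration only
parse_region(c("complement(1201..2100)", "join(10..50,80..120)", "300..899"))
```

With 'process_region = FALSE', the 'region' value would presumably be left as the raw location string, which is useful if you prefer to parse it yourself.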
A data frame where each row corresponds to a feature from the input list. The data frame includes a 'cluster' column indicating the source gbk file.
## Not run:
gbk <- read_gbk("path/to/genbank_file.gbk")
df <- gbk_features_to_df(gbk)
# To extract only specific keys within the "CDS" feature
df <- gbk_features_to_df(gbk, feature = "CDS", keys = c("gene", "region"))
# To disable special processing of the 'region' key
df <- gbk_features_to_df(gbk, process_region = FALSE)
## End(Not run)