parse_pubchem_compound    R Documentation
Parse the JSON or XML compound data from PubChem.
parse_pubchem_compound(file_name)
file_name    File name of the data.
#' @title Read the xml database from download_pubchem_compound function
#' @description Read the xml database from download_pubchem_compound function
#' @author Xiaotao Shen
#' shenxt1990@outlook.com
#' @param file File name; should be in XML format (optionally gzipped).
#' @param path Default is ".". Should be the same as in the download_pubchem_compound function.
#' @return A list
#' @importFrom magrittr %>%
#' @importFrom plyr dlply .
#' @importFrom readr read_delim
#' @importFrom dplyr mutate bind_rows select distinct rename full_join filter
#' @importFrom tidyr pivot_wider
#' @importFrom purrr map
#' @importFrom XML xmlParse
#' @importFrom R.utils gunzip isGzipped
#' @importFrom utils untar
#' @importFrom xml2 read_xml
#' @importFrom stringr str_replace
#' @export
read_pubchem_xml <- function(file, path = ".") {
  file_path <- file.path(path, "data", file)

  # Uncompress the download first if it is still gzipped
  if (R.utils::isGzipped(file_path)) {
    message("Uncompressing data...")
    R.utils::gunzip(file_path)
    message("Done")
  }

  message("Reading data, it may take a while...")
  # gunzip() writes the file without its ".gz" suffix, so strip it here
  result <- xml2::read_xml(stringr::str_replace(file_path, "\\.gz$", ""))
  message("Done")

  message("Parsing data, it may take a while...")
  # XML::xmlParse() expects a file name or XML text, so serialize the
  # xml2 document to text before handing it over
  result <- XML::xmlParse(as.character(result), asText = TRUE)
  message("Done")

  records <- XML::xmlToList(result)

  message("Organizing...")
  pb <- progress::progress_bar$new(total = length(records))

  parsed <- seq_along(records) %>%
    purrr::map(function(i) {
      pb$tick()
      x <- records[[i]]
      # Build a one-row data frame from the fourth element of the record;
      # records where this fails yield NULL
      row <- tryCatch(
        matrix(x[[4]], nrow = 1) %>% as.data.frame(),
        error = function(e) NULL
      )
      if (is.null(row)) {
        return(NULL)
      }
      colnames(row) <- names(x[[4]])
      row
    })

  message("Done.")
  return(parsed)
}
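The organizing step above turns each parsed record into a one-row data frame and skips records that cannot be converted. The following is a minimal base-R sketch of that pattern, not the package's own code. It assumes toy records in place of real PubChem data: the `records` list, its field names and values, and the `record_to_row` helper are all hypothetical. It also shows one possible way to stack the surviving rows into a single data frame, which assumes every record carries the same fields.

```r
# Toy stand-in for the nested list produced by XML::xmlToList();
# the field names and values are invented for illustration only.
records <- list(
  list("a", "b", "c", c(id = "1", name = "water")),
  list("a", "b", "c", c(id = "2", name = "ethanol")),
  list("a")  # malformed record: it has no fourth element
)

# Hypothetical helper: convert one record to a one-row data frame,
# or return NULL when the record cannot be converted.
record_to_row <- function(x) {
  tryCatch(
    {
      fields <- x[[4]]
      row <- as.data.frame(matrix(fields, nrow = 1), stringsAsFactors = FALSE)
      colnames(row) <- names(fields)
      row
    },
    # The handler has to be a function; `error = NULL` is not a valid handler.
    error = function(e) NULL
  )
}

rows <- lapply(records, record_to_row)

# Drop the records that failed, then stack the remaining rows.
# rbind() here relies on all rows sharing the same column names.
rows <- Filter(Negate(is.null), rows)
combined <- do.call(rbind, rows)
print(combined)
```

When records do not all share the same fields, `dplyr::bind_rows()` may be a better fit than `rbind()`, since it fills missing columns with `NA`.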
A data frame.
Xiaotao Shen shenxt1990@outlook.com