ParseCell: Parse the Cellosaurus XML file

View source: R/api_cellosaurus.R

ParseCell    R Documentation

Parse the Cellosaurus XML file

Description

ParseCell parses the Cellosaurus XML file and extracts all content in the "cell-line-list" node as an XML document object.

Usage

ParseCell(file)

Arguments

file

File path to a Cellosaurus XML file.

Details

The Cellosaurus.xml file contains four child nodes in its root node: "header", "cell-line-list", "publication-list", and "copyright" (more information in "ftp://ftp.expasy.org/databases/cellosaurus/cellosaurus.xsd"). All the cell line information needed for preparing data is in "cell-line-list", so this function parses the dataset file and discards the redundant information.
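
As a rough illustration of the extraction described above, the following sketch pulls the "cell-line-list" node out of a tiny hand-written document. It assumes the XML package is installed; the toy document and its cell line names are made up for illustration, and this is not necessarily how ParseCell is implemented internally.

```r
library(XML)

# Toy stand-in for Cellosaurus.xml. The content is hypothetical; only the
# layout of the root's child nodes mirrors the description above.
toy <- paste0(
  "<Cellosaurus>",
  "<header/>",
  "<cell-line-list>",
  "<cell-line><name>A</name></cell-line>",
  "<cell-line><name>B</name></cell-line>",
  "</cell-line-list>",
  "<publication-list/>",
  "<copyright/>",
  "</Cellosaurus>"
)

# Parse the string and take the root node of the document.
doc  <- xmlParse(toy, asText = TRUE)
root <- xmlRoot(doc)

# Names of the root's child nodes.
child_names <- names(xmlChildren(root))

# Keep only the "cell-line-list" node and count the cell lines in it.
cell_lines   <- root[["cell-line-list"]]
n_cell_lines <- xmlSize(cell_lines)
```

Here child_names holds the names of the root's children and n_cell_lines is the number of "cell-line" entries in the toy document.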

Warning: Although it is possible to parse the online database directly by passing the URL ftp://ftp.expasy.org/databases/cellosaurus/cellosaurus.xml to ParseCell, doing so can easily crash R because of the large memory requirement. We recommend downloading the dataset to a local file and then parsing that local file with this function.
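
The recommended download-then-parse workflow could look roughly like the sketch below. The helper fetch_local_copy is hypothetical (it is not part of TidyComb), and the commented usage assumes that TidyComb is loaded and that the FTP URL quoted above is still reachable.

```r
# Hypothetical helper (not part of TidyComb): download `url` to `dest` unless
# a local copy already exists, then return the local path.
fetch_local_copy <- function(url, dest) {
  if (!file.exists(dest)) {
    utils::download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  }
  dest
}

# Usage (not run here: needs network access and the TidyComb package):
# xml_path <- fetch_local_copy(
#   "ftp://ftp.expasy.org/databases/cellosaurus/cellosaurus.xml",
#   file.path(tempdir(), "cellosaurus.xml")
# )
# cell_lines <- ParseCell(xml_path)
```

Reusing an existing local copy avoids downloading the large file again on every run.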

Value

An XMLNode containing the information for all cell lines archived in the Cellosaurus dataset.

Author(s)

Jing Tang <jing.tang@helsinki.fi>, Shuyu Zheng <shuyu.zheng@helsinki.fi>


DrugComb/TidyComb documentation built on June 22, 2022, 2:49 a.m.