dbimportXMLDir: Import a directory of cyphergenXML files into SQLite

Description

A wrapper function to import a directory of cyphergenXML files into a SQLite database.

Usage

importXMLDir(xmldir, dbname, tablename, tof = FALSE, maxRows = 10000, 
  maxCols = NULL, tmpdir = tempdir(), splitSubdir = TRUE, verbose = 0, 
  ...)

Arguments

xmldir

A character string specifying the directory that holds the XML files.

dbname

A character specifying the full name of a SQLite database file, including path.

tablename

A character string specifying the SQL table name.

tof

A logical indicating whether to read in TOF data rather than processed data.

maxRows

An integer specifying the largest number of rows that an intermediate binary file can hold.

maxCols

An integer specifying the largest number of columns that a resulting SQLite table can hold, or NULL to put all XML files into a single table. If there are many cyphergenXML files, it is recommended to set maxCols to around 100 so that the files are partitioned into several SQLite tables.

tmpdir

A character specifying the name of the temporary directory to store binary files.

splitSubdir

A logical. If TRUE, subdirectories are used as grouping factors: all XML files under the same subdirectory are put into the same category. Otherwise, no category structure is used. See cypherGenXMList2BinBlocks for details.

verbose

A logical or non-negative integer specifying the extent of extra messages to be printed.

...

Additional optional arguments.

Details

This function imports all cyphergenXML files under a directory (including its subdirectories) into a SQLite database. Each XML file contains one mass spectrum. XML files can be grouped into subdirectories so that each group of XML files goes into the same SQLite table. Otherwise, all XML files are treated as a single group.

SQLite tables should not contain too many columns, so maxCols sets a limit. If a group contains more XML files than this limit, the files are split evenly into multiple tables.

maxRows determines the size of the intermediate binary files. If it is too large, an intermediate file may not fit in memory and cannot be read in.
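The grouping and even splitting described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: partitionFiles is a hypothetical helper, the file names are made up, and the contiguous near-equal chunking is an assumption about what "split evenly" means here.

```r
# Hypothetical helper (not part of msProcess): split a vector of file names
# into contiguous, near-equal chunks of at most maxCols files each.
partitionFiles <- function(files, maxCols = NULL) {
  n <- length(files)
  # NULL (or a small enough group) means everything goes into one table
  if (is.null(maxCols) || n <= maxCols) {
    return(list(files))
  }
  nTables <- ceiling(n / maxCols)
  # Map file i to chunk ceiling(i * nTables / n), so chunk sizes
  # differ by at most one file
  chunkId <- ceiling(seq_len(n) * nTables / n)
  unname(split(files, chunkId))
}

# Made-up file names: 250 spectra in groupA, 40 in groupB
files <- c(sprintf("groupA/spectrum%03d.xml", 1:250),
           sprintf("groupB/spectrum%03d.xml", 1:40))

# Group by subdirectory (mimicking splitSubdir = TRUE), then cap each table
groups <- split(files, dirname(files))
tables <- lapply(groups, partitionFiles, maxCols = 100)
lengths(tables)          # number of tables per group
lengths(tables$groupA)   # number of files per table in groupA
```

With maxCols = 100, the 250 files in groupA end up in three tables of no more than 100 files each, while groupB fits into a single table.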

Value

A logical indicating whether the import was successful.

Author(s)

Y Alex Chen

See Also

importBin2Sqlite, cypherGenXMList2BinBlocks

Examples

## Not run: 
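	# Use forward slashes: a backslash in an R string starts an escape sequence
	xmldir <- "E:/SQLData/UPCI-2007-06/UPCI AUG WCX"
	dbname <- "e:/mydatabase1.db"
	# Import all files as a single group and time the call
	system.time(p <- importXMLDir(xmldir, dbname, tof=FALSE, splitSubdir=FALSE, 
	maxRows=5000, tablename="nocattable", verbose=3))
	# List the tables that were created, then close the connection
	conn <- dbConnect("SQLite", dbname, cache.size=100000)
	dbListTables(conn)
	dbDisconnect(conn)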

## End(Not run) 

zeehio/msProcess documentation built on May 4, 2019, 10:15 p.m.