text2bin | R Documentation |
These functions convert a list of RI files (also known as peak list files) from text to binary format and vice versa.
text2bin(in.files, out.files=NULL, columns=NULL)
bin2text(in.files, out.files=NULL)
in.files |
A character vector of file paths to the input RI files. |
out.files |
A character vector of file paths. If |
columns |
Either a numeric vector with the positions of the columns for |
These functions convert a list of RI files between binary and text representations. The format of the input files is detected automatically, and an error is issued for invalid files.
Converting a binary file to text can be useful if you need to inspect what an RI file looks like inside (for example, to check that the peak detection was correct). Conversely, converting a text file to binary is highly recommended, because a binary file is faster to parse than a text file.
For text files, the order of the columns is important (see option columns above). The first entry is the spectrum list, followed by the retention time index and the retention time. If the column names are other than SPECTRUM, RETENTION_TIME_INDEX, and RETENTION_TIME, use the respective column names or the column positions starting at zero (the first column is zero, the second is one, and so on).
Many functions rely on those column names, and passing them as arguments to each function is tedious, so the global option TS_RI_columns can be set at the beginning, for example:
# using column names
options(TS_RI_columns=c('spec_column', 'RI_column', 'RT_column'))
# using column indices (zero-based!)
options(TS_RI_columns=c(1, 2, 0))
where "spec_column", "RI_column", and "RT_column" are the names of the spectrum, retention index, and retention time columns.
This command is useful if your RI files were generated by other software. However, it is highly recommended to simply convert those custom RI files into TargetSearch's binary format and not worry about column names.
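The zero-based column positions described above are easy to get wrong in R, which indexes from one. The following is a minimal sketch, assuming a hypothetical helper `resolve_columns` (it is not part of TargetSearch and does not reflect how the package resolves columns internally), that shows how a names-based and a position-based specification refer to the same columns.

```r
# Hypothetical helper, NOT part of TargetSearch: translate a `columns`
# specification (names or zero-based positions) into header names.
resolve_columns <- function(header, columns) {
  if (is.numeric(columns)) {
    # zero-based positions -> R's one-based indexing
    columns <- header[columns + 1]
  }
  stopifnot(length(columns) == 3, !anyNA(columns), all(columns %in% header))
  # order expected by the documentation: spectrum, retention index, retention time
  setNames(columns, c("spectrum", "retention_index", "retention_time"))
}

# header of an illustrative custom RI file
header <- c("RT", "SPEC", "RI")

by_name  <- resolve_columns(header, c("SPEC", "RI", "RT"))
by_index <- resolve_columns(header, c(1, 2, 0))
print(by_index)
```

Both calls select the same three columns, which is why column names are usually the safer choice: they do not depend on the column order in the file.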
A character vector with the paths of the created files, or invisible.
The so-called RI files contain lists of m/z peaks detected for every ion trace measured in the samples. Historically, the file format was a simple tab-delimited text file in the format described below. Note that the column order could differ and additional columns could be present, but they are ignored.
RETENTION_TIME | SPECTRUM | RETENTION_TIME_INDEX |
212.46 | 250:26 256:26 316:27 | 221029.7 |
212.51 | 114:46 162:30 251:27 | 221081.3 |
212.56 | 319:25 | 221132.9 |
212.61 | 95:38 108:30 262:32 266:27 292:25 | 221184.5 |
The retention time is usually expressed in seconds, while the retention time index is in arbitrary units, which depend on the retention time correction method (in the table above it is in milliseconds, but other units can be used).
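As an illustration of the text layout described above, the sketch below writes the first rows of the example table to a temporary tab-delimited file and reads it back with base R. It only uses `read.delim`; it is not the parser used by TargetSearch, and the temporary file name is arbitrary.

```r
# Rows taken from the example table above, as tab-delimited text
ri_text <- c(
  "RETENTION_TIME\tSPECTRUM\tRETENTION_TIME_INDEX",
  "212.46\t250:26 256:26 316:27\t221029.7",
  "212.51\t114:46 162:30 251:27\t221081.3",
  "212.56\t319:25\t221132.9"
)

ri_file <- tempfile(fileext = ".txt")
writeLines(ri_text, ri_file)

# read.delim assumes a header line and tab separators by default
ri <- read.delim(ri_file, stringsAsFactors = FALSE)
str(ri)
```

The two time columns are read as numeric vectors, while the spectrum column stays a character vector of space-separated `m/z:intensity` pairs.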
The spectrum column is represented by pairs of m/z and raw intensity (peak height), similar to the representation of a metabolite library (see ImportLibrary). Thus each pair corresponds to a peak of the respective ion trace.
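To make the pair notation concrete, here is a minimal sketch that splits one spectrum entry into its m/z and intensity values. The function `parse_spectrum` is a hypothetical name introduced for this example only; it is not a TargetSearch function.

```r
# Hypothetical helper, NOT part of TargetSearch: split one SPECTRUM entry
# such as "250:26 256:26 316:27" into m/z and intensity columns.
parse_spectrum <- function(x) {
  # split on spaces into "mz:intensity" tokens, then split each token on ":"
  pairs <- strsplit(strsplit(x, " ", fixed = TRUE)[[1]], ":", fixed = TRUE)
  data.frame(
    mz        = as.numeric(vapply(pairs, `[`, character(1), 1)),
    intensity = as.numeric(vapply(pairs, `[`, character(1), 2))
  )
}

# first spectrum of the example table
peaks <- parse_spectrum("250:26 256:26 316:27")
print(peaks)
```

Each row of the resulting data frame is one peak, that is, one m/z value with its raw intensity at that retention time.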
The disadvantage of using text files is that they are slow to parse, so a binary format was created which represents the peak data as binary vectors so that they are fast to parse. These files have the extension dat.
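The general idea of storing numeric vectors in binary form can be sketched with base R's `writeBin` and `readBin`. Note that this is only an illustration of the principle: the layout below (an integer count followed by doubles) is an assumption made for the example and is not the actual structure of TargetSearch's dat files.

```r
# Illustration only: this is NOT the TargetSearch binary layout.
rt <- c(212.46, 212.51, 212.56)

bin_file <- tempfile(fileext = ".bin")

# write the vector length followed by the raw double values
con <- file(bin_file, "wb")
writeBin(length(rt), con)   # integer count
writeBin(rt, con)           # doubles, no text conversion
close(con)

# read them back: no string splitting or number parsing is needed
con <- file(bin_file, "rb")
n <- readBin(con, what = "integer", n = 1)
rt_back <- readBin(con, what = "numeric", n = n)
close(con)
```

Because the values are stored exactly as they are held in memory, reading them involves no text parsing and the round trip is lossless.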
Beware that the respective tsSample object may need to be updated by using the method fileFormat.
Alvaro Cuadros-Inostroza
ImportSamples, tsSample, RIcorrect
require(TargetSearchData)
# take three example files from package TargetSearchData
in.files <- tsd_rifiles()[1:3]
# out files to current directory
out.files <- sub("\\.txt$", ".dat", basename(in.files))
# convert to binary format
res <- text2bin(in.files, out.files)
stopifnot(res == out.files)
# convert back to text
res <- bin2text(out.files)
stopifnot(res == basename(in.files))
# Demonstrate how to use the `columns` option
# make dummy RI file with arbitrary column names and save it
tmp <- data.frame(RT=c(101.5,102.5), SPEC=c('12:100 23:100', '114:46 162:30'), RI=c(300, 400) + .75)
# file must be tab-delimited, unquoted strings and no row names
RI_test <- tempfile(fileext=".txt")
write.table(tmp, file=RI_test, sep="\t", quote=FALSE, row.names=FALSE)
# convert this text file to binary format
## wrong! It fails because of invalid columns
# text2bin(RI_test)
# correct! The columns are correct
text2bin(RI_test, columns=c('SPEC', 'RI', 'RT'))
# same example but using integers (not recommended)
text2bin(RI_test, columns=c(1, 2, 0)) # note they start from zero.
# Alternatively, set a global option (so it can be used throughout a session)
opt <- options(TS_RI_columns=c('SPEC', 'RI', 'RT'))
text2bin(RI_test)
# or using integers (again, not recommended)
options(TS_RI_columns=c(1, 2, 0))
text2bin(RI_test)
# unset options
options(opt)