Merge Files in a Directory into a Spectra Object


Description

This function will read all files of a given type in a directory, and use the file names to construct group membership and assign colors and symbols. All the data is placed into an object of S3 class Spectra. This function uses read.table to read files so it is very flexible.

Usage

files2SpectraObject(gr.crit = NULL, gr.cols = c("auto"),
freq.unit = "no frequency unit provided",
int.unit = "no intensity unit provided",
descrip = "no description provided",
fileExt = "\\.(csv|CSV)$",
out.file = "mydata", debug = FALSE, ...)

Arguments

gr.crit

Group Criteria. A vector of character strings which will be searched for among the file names in order to assign an individual spectrum/sample to group membership. Warnings are issued if there are file names that don't match entries in gr.crit or there are entries in gr.crit that don't match any file names. See Details for some nuances.

gr.cols

Group Colors. Either the word "auto", in which case colors will be automatically assigned, or a vector of acceptable color names with the same length as gr.crit. In the latter case, colors will be assigned one for one, so the first element of gr.crit is assigned the first element of gr.cols and so forth. See Details for some other issues to consider.

freq.unit

A character string giving the units of the x-axis (frequency or wavelength).

int.unit

A character string giving the units of the y-axis (some sort of intensity).

descrip

A character string describing the data set that will be stored. This string is used in some plots so it is recommended that its length be less than about 40 characters.

fileExt

A character string giving the extension of the files to be processed. Regular expressions can be used. For instance, the default finds files with either ".csv" or ".CSV" as the extension. Matching is done via a grep process, which is greedy.

out.file

A file name acceptable to the save function. The completed object of S3 class Spectra will be written to this file.

debug

Logical; set to TRUE for troubleshooting when an error is thrown during import.

...

Arguments to be passed to read.table. You MUST supply values for sep, dec and header consistent with your file structure, unless they are the same as the defaults for read.table.
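As a minimal sketch of why sep, dec and header matter, the example below writes a small semicolon-delimited file with decimal commas and reads it with read.table. The file contents and column names are invented for illustration. The commented call at the end shows how the same values might be passed through ...; the gr.crit values and units there are placeholders, not taken from any real data set.

```r
# Hypothetical two-column file: frequency first, intensity second,
# using ";" as the field separator and "," as the decimal mark.
tmp <- tempfile(fileext = ".csv")
writeLines(c("freq;int",
             "1000,5;0,12",
             "1001,0;0,15",
             "1001,5;0,11"), tmp)

# These three arguments must agree with the file structure;
# the read.table defaults would not parse this file correctly.
spec <- read.table(tmp, sep = ";", dec = ",", header = TRUE)
str(spec)

# The same values would be supplied via ... (placeholder arguments):
# files2SpectraObject(gr.crit = c("Control", "Sample"),
#                     freq.unit = "ppm", int.unit = "intensity",
#                     sep = ";", dec = ",", header = TRUE)
```

If str() reports character columns rather than numeric ones, sep or dec probably does not match the file.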

Details

The linking of groups with colors is handled by groupNcolor.

The matching of gr.crit against the sample file names is done one at a time, in order. This means that the entries in gr.crit must be mutually exclusive. For example, if you have files with names like "Control_1" and "Sample_1" and use gr.crit = c("Control", "Sample"), groups will be assigned as you would expect. But if you have file names like "Control_1_Shade" and "Sample_1_Sun", you can't use gr.crit = c("Control", "Sample", "Sun", "Shade"), because each criterion is grepped in order and the "Sun"/"Shade" phrases, being last, will form the basis for your groups. Because this is a grep process, you can get around this by using regular expressions in your gr.crit argument to specify the desired groups in a mutually exclusive manner. In this second example, you could use gr.crit = c("Control(.*)Sun", "Control(.*)Shade", "Sample(.*)Sun", "Sample(.*)Shade") to have your groups assigned based upon both phrases in the file names.
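The ordering effect described above can be reproduced with a short base R sketch. The helper assign_groups is hypothetical and is not the package's own code; it only mimics the described behavior of grepping each criterion in turn, so that a later match overwrites an earlier one. The file names are invented.

```r
# Hypothetical file names, for illustration only
files <- c("Control_1_Shade.csv", "Control_2_Sun.csv",
           "Sample_1_Sun.csv", "Sample_2_Shade.csv")

# Hypothetical helper: grep each criterion in order; a later
# match overwrites any group assigned by an earlier criterion.
assign_groups <- function(files, gr.crit) {
  groups <- rep(NA_character_, length(files))
  for (crit in gr.crit) {
    hits <- grep(crit, files)
    groups[hits] <- crit
  }
  groups
}

# Overlapping criteria: "Sun" and "Shade" come last and win
naive <- assign_groups(files, c("Control", "Sample", "Sun", "Shade"))
print(naive)

# Mutually exclusive regular expressions: one group per combination
exclusive <- assign_groups(files,
  c("Control(.*)Sun", "Control(.*)Shade",
    "Sample(.*)Sun", "Sample(.*)Shade"))
print(exclusive)
```

With the overlapping criteria every file ends up labeled "Sun" or "Shade", whereas the mutually exclusive patterns yield four distinct groups.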

files2SpectraObject acts on all files in the current working directory with the specified fileExt. The first column should contain the frequency values and the second column the intensity values. The files may have a header or not (supply header = TRUE/FALSE as necessary). The frequency column is assumed to be the same in all files.

If fileExt contains any of "dx", "DX", "jdx" or "JDX", then the files will be processed by readJDX. Consider setting debug = TRUE for this format, as there are many options for JCAMP, and most are untested. See readJDX for known limitations.

There should be no other files of the given extension in the directory except those containing the data to be processed by files2SpectraObject, as all files with that format in the directory will be processed.
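Before running the import, it can help to preview which files a given fileExt pattern would select. The sketch below applies the default pattern to a set of invented file names using grep; it assumes only that matching behaves like an ordinary regular expression search, as described for fileExt.

```r
# Hypothetical directory listing, for illustration only
files <- c("a.csv", "b.CSV", "c.txt", "csv_notes.txt", "d.csv.bak")

# The default value of fileExt: a literal dot, then csv or CSV,
# anchored to the end of the file name.
pattern <- "\\.(csv|CSV)$"

matched <- grep(pattern, files, value = TRUE)
print(matched)
```

In a real session, list.files() on the working directory would supply the names to check in place of the invented vector.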

Value

An object of class Spectra. An unnamed object of S3 class Spectra is also written to out.file. To read it back into the workspace, use new.name <- loadObject(out.file) (loadObject is in package R.utils).
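The idea of restoring a saved object under a new name can be sketched with base R alone. This is an illustration of the general save/load round trip, not of how files2SpectraObject writes its output; the stand-in list below is invented and is not a real Spectra object.

```r
# Stand-in object, for illustration only (not a real Spectra object)
obj <- list(data = matrix(1:4, nrow = 2), desc = "demo")

# Write it to a temporary file with save()
out.file <- tempfile()
save(obj, file = out.file)

# Load into a fresh environment, then bind the object to a new name
env <- new.env()
stored.names <- load(out.file, envir = env)
new.name <- get(stored.names[1], envir = env)

# With R.utils installed, the documented one-step form is:
# new.name <- R.utils::loadObject(out.file)
```

Loading into a separate environment avoids overwriting an existing object that happens to share the stored name.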

Warning

Files whose names are not matched using gr.crit are still incorporated into the Spectra object, but they are not assigned a group or color. They don't plot, but they do take up space in a plot! A warning is issued in these cases, since one wouldn't normally want a spectrum to be orphaned this way.
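A quick pre-check along the following lines can reveal both kinds of mismatch before importing. It is a sketch using base R pattern matching on invented file names and criteria, and it does not reproduce the function's own warning logic.

```r
# Hypothetical file names and group criteria, for illustration only
files <- c("Control_1.csv", "Sample_1.csv", "Blank_1.csv")
gr.crit <- c("Control", "Sample", "Standard")

# One logical vector per criterion: which files does it match?
hits <- lapply(gr.crit, grepl, x = files)

# Files matched by no criterion would be orphaned
orphans <- files[!Reduce(`|`, hits)]
print(orphans)

# Criteria that match no file at all
unused <- gr.crit[!vapply(hits, any, logical(1))]
print(unused)
```

Here "Blank_1.csv" would end up without a group, and "Standard" matches nothing.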

Author(s)

Bryan A. Hanson, DePauw University. hanson@depauw.edu

References

https://github.com/bryanhanson/ChemoSpec

See Also

matrix2SpectraObject.
