ImportSamples: Sample definitions

ImportSamplesR Documentation

Sample definitions

Description

These functions import sample information from either a tab delimited file or a file system path. The sample information contain sample names, CDF files and their measurement day, also known as, measurement run or measurement batch.

Usage

ImportSamples(sampfile, CDFpath, RIpath, ftype=c("binary", "text"), ...)

ImportSamplesFromDir(CDFpath=".", RIfiles=FALSE, ftype=c("binary", "text"),
		verbose=FALSE, ...)

Arguments

sampfile

A character string naming a sample file or a data.frame. See details.

CDFpath

A character string naming a directory where the CDF files are located (by default is the current directory)

RIpath

A character string naming a directory where the RI corrected text files are/will be located.

RIfiles

Logical. If TRUE, the function will look for for RI files (RI_*) instead of CDF files (the default).

ftype

A character string giving the file type for RI files. Options are "binary" and "text".

verbose

Logical. Show more output if TRUE.

...

Extra options whose meaning depend on the function: for ImportSamples, the options will be passed to read.delim, while for ImportSamplesFromDir, these are passed to dir

.

Details

The sample file is a tab-delimited text file or, alternatively, a data.frame like the one would get by calling read.delim. The following columns are expected, though only the first one below is mandatory.

  • CDF_FILE - The list of baseline corrected CDF files (mandatory).

  • MEASUREMENT_DAY - The day or batch when the sample was measured (optional).

  • SAMPLE_NAME - A unique sample identifier (optional).

The function first looks for columns matching those names. If they are not found, then it looks for columns with the substrings 'cdf', 'day' and 'name' respectively. If the 'cdf' column is not found, the function raises an error, but if 'day' is not found, then it assumes the measurement day is the same for all. Moreover, if the column for sample names/identifiers is missing, then the function uses the 'cdf' file names as identifiers.

During the column matching, the first match is taken in case of ambiguity. The column order does not matter. Other columns could be included in that file. They won't be used by the script, but will be included in the "sample" R object.

The files given in the 'cdf' column may optionally include a relative path, such as mypath/myfile.cdf, which will be relative to the working directory or relative to the argument CDFpath. Optionally, the file extension can be omitted because ‘TargetSearch’ adds the .cdf extension automatically if missing. Note: it may fail in case sensitive file systems as a lower case extension is appended.

If the parameter CDFpath is missing, then the current directory is used. If RIpath is missing, the value of CDFpath will be used.

The ftype parameter sets the file format of the RI files, which are created by the function RIcorrect. "text" is the classic text format and "binary" is a binary version, designed for speed, of the text format (TargetSearch >= 1.12.0). The file format can be identified by the file extension (".txt" for text, ".dat" for binary), but this is just an indicator: the file format is detected dynamically during file reading. Use the method fileFormat to set or get this parameter.

The function ImportSamplesFromDir scans for cdf files in the current or given directory. The search can be made recursive by passing the parameter recursive=TRUE (actually passed to dir).

The parameter verbose will print a list of detected files is set to TRUE. This can be used for debugging.

Value

A tsSample object.

Author(s)

Alvaro Cuadros-Inostroza, Matthew Hannah, Henning Redestig

See Also

ImportLibrary, tsSample

Examples

# get the sample definition definition file
require(TargetSearchData)
cdfpath <- tsd_data_path()
sample.file  <- tsd_file_path("samples.txt")

# set a path where the RI files will be created
RIpath <- "."

# import samples
sampleDescription <- ImportSamples(sample.file, CDFpath = cdfpath, RIpath = RIpath)

# change the sample names. It is required that the names must be unique.
sampleNames(sampleDescription) <- paste("Sample", 1:length(sampleDescription), sep = "_")

# change the file paths (relative to the working path)
CDFpath(sampleDescription) <- "my_cdfs"
RIpath(sampleDescription)  <- "my_RIs"


acinostroza/TargetSearch documentation built on Nov. 22, 2024, 3:31 p.m.