read.FCS: Read an FCS file

View source: R/IO.R

read.FCS    R Documentation

Read an FCS file

Description

Check the validity of, and read, files in the Data File Standard for Flow Cytometry (FCS) format.

Usage

isFCSfile(files)

read.FCS(filename, transformation="linearize", which.lines=NULL,
         alter.names=FALSE, column.pattern=NULL, invert.pattern = FALSE,
         decades=0, ncdf = FALSE, min.limit=NULL, 
         truncate_max_range = TRUE, dataset=NULL, emptyValue=TRUE, 
         channel_alias = NULL, ...)

Arguments

filename

Character vector of length 1: the name of the FCS file to read.

transformation

A character string that defines the type of transformation. Valid values are linearize (default), linearize-with-PnG-scaling, or scale. The linearize transformation applies the appropriate power transform to the data. The linearize-with-PnG-scaling transformation applies the appropriate power transform to parameters stored on a log scale, and also a linear scaling transformation based on the gain (FCS $PnG keywords) to parameters stored on a linear scale. The scale transformation scales all columns to [0, 10^decades], with decades defaulting to 0 as in the FCS4 specification. A logical can also be used: TRUE is equivalent to linearize, and FALSE (or NULL) means no transformation. In addition, when the transformation keyword of the FCS header is set to "custom" or "applied", no transformation is applied.
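As a rough illustration of the power transform mentioned above: the FCS standard describes a log-stored parameter through its $PnE keyword (two values, here called f1 and f2) and its range $PnR, and the linear value is recovered as 10^(f1 * x / R) * f2. The base R sketch below applies that formula. The helper name linearize_channel and the sample values are hypothetical, and this is not necessarily how read.FCS implements the step internally.

```r
# Hypothetical helper (not part of flowCore): recover linear-scale values
# from log-stored channel numbers, given $PnE = f1,f2 and $PnR = r.
linearize_channel <- function(x, f1, f2, r) {
  10^(f1 * x / r) * f2
}

# Assumed example: a 4-decade log amplifier with a range of 1024
stored <- c(0, 256, 512, 1024)
linear <- linearize_channel(stored, f1 = 4, f2 = 1, r = 1024)
linear
```

With these assumed settings, each quarter of the stored range corresponds to one decade on the linear scale.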

which.lines

Numeric vector specifying the indices of the lines (records) to read. If NULL, all records are read; if of length 1, a random sample of the size given by which.lines is read. This allows partial disk I/O for large FCS files whose full data cannot fit into memory. Be aware that reading may be slow (especially for large random samples) because of the frequent disk seek operations.

alter.names

Logical indicating whether to rename the columns to valid R names using make.names. The default is FALSE.
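To give a sense of what this renaming does, the short sketch below calls base R's make.names on a few channel names. The names themselves are assumed for illustration and are not read from any file.

```r
# Assumed channel names, for illustration only
channels <- c("FSC-H", "FL1-H", "Time")

# make.names replaces characters that are invalid in R names with "."
valid <- make.names(channels)
valid
```

Names that are already valid, such as "Time", are left as they are.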

column.pattern

An optional regular expression defining parameters we should keep when loading the file. The default is NULL.

invert.pattern

logical. By default, FALSE. If TRUE, inverts the regular expression specified in column.pattern. This is useful for indicating the channel names that we do not want to read. If column.pattern is set to NULL, this argument is ignored.
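The effect of column.pattern together with invert.pattern can be pictured with base R's grep. This is only a sketch of the selection logic on an assumed vector of channel names; the exact matching call used inside read.FCS is not shown on this page.

```r
# Assumed channel names, for illustration only
channels <- c("FSC-H", "SSC-H", "FL1-H", "FL2-H", "Time")

# Analogous to column.pattern = "^FL": keep the matching parameters
kept <- grep("^FL", channels, value = TRUE)

# Analogous to adding invert.pattern = TRUE: keep everything that
# does NOT match the pattern
kept_inverted <- grep("^FL", channels, value = TRUE, invert = TRUE)

kept
kept_inverted
```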

decades

When scaling is activated, the number of decades to use for the output.

ncdf

Deprecated. Please use the 'ncdfFlow' package for CDF-based storage.

min.limit

The minimum value in the data range that is allowed. Some instruments produce extreme artifactual values. The positive data range for each parameter is completely defined by the measurement range of the instrument and all larger values are set to this threshold. The lower data boundary is not that well defined, since compensation might shift some values below the original measurement range of the instrument. This can be set to an arbitrary number or to NULL (the default value), in which case the original values are kept. When the transformation keyword of the FCS header is set (typically to "custom" or "applied"), no shift up to min.limit will occur.

truncate_max_range

Logical. Default is TRUE. Can be turned off to avoid truncating extreme positive values to the instrument measurement range, i.e. '$PnR'. When the transformation keyword of the FCS header is set (typically to "custom" or "applied"), no truncation will occur.
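The combined effect of min.limit and truncate_max_range on a single parameter can be sketched in base R as a pair of clamping steps. All values below are hypothetical, and the exact boundary handling inside read.FCS may differ from this simplified picture.

```r
# Hypothetical values for one parameter
values    <- c(-150, -20, 0, 500, 1023, 5000)
pnr       <- 1024   # assumed instrument range ($PnR)
min_limit <- -111   # assumed lower bound, as passed via min.limit

# Analogous to truncate_max_range = TRUE: cap values at the instrument range
capped <- pmin(values, pnr)

# Analogous to setting min.limit: raise anything below the bound up to it
bounded <- pmax(capped, min_limit)
bounded
```

With min.limit = NULL and truncate_max_range = FALSE, neither step would apply and the original values would be kept.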

dataset

The FCS file specification allows for multiple data segments in a single file. Since the output of read.FCS is a single flowFrame, we can't automatically read in all available sets. This parameter lets you choose one of the data sets for import. Its value should be an integer in the range of available data sets. This argument is ignored if there is only a single data segment in the FCS file.

emptyValue

Logical indicating whether empty keyword values are allowed in the TEXT segment. It affects how double delimiters are treated. If TRUE, a double delimiter is parsed as a pair of start and end single delimiters enclosing an empty value. Otherwise, a double delimiter is parsed as part of the keyword value string. The default is TRUE.

channel_alias

An optional data.frame that provides channel aliases, used to standardize channel names and resolve discrepancies across FCS files. It is expected to contain 'alias' and 'channels' columns. Each row specifies the common alias for a collection of channels (comma separated). See the examples for details.

For each channel in the FCS file, read.FCS will first attempt to find an exact match in the 'channels' column. If no exact match is found, it will check for partial matches. That is, if "V545" is in the 'channels' column of 'channel_alias' and "V545-A" is present in the FCS file, this partial match will allow the corresponding 'alias' to be assigned. This partial matching only works in this direction ("V545-A" in the 'channels' column will not match "V545" in the FCS file) and care should be exercised to ensure no unintended partial matching of other channel names. If no exact or partial match is found, the channel is unchanged in the resulting flowFrame.
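The exact-then-partial matching rule described above can be sketched in base R as follows. The helper resolve_alias and the alias table are hypothetical and written only to illustrate the rule; this is not flowCore's implementation.

```r
# Hypothetical helper, not part of flowCore: resolve one channel name
# against an alias table using exact-then-partial matching.
resolve_alias <- function(channel, map) {
  # Split each comma-separated 'channels' entry into individual names
  entries <- strsplit(as.character(map$channels), ",", fixed = TRUE)
  entries <- lapply(entries, trimws)

  # 1. Exact match against any name in the 'channels' column
  for (i in seq_along(entries)) {
    if (channel %in% entries[[i]]) return(as.character(map$alias[i]))
  }
  # 2. Partial match: a 'channels' entry occurs inside the file's name
  for (i in seq_along(entries)) {
    hit <- vapply(entries[[i]],
                  function(e) grepl(e, channel, fixed = TRUE),
                  logical(1))
    if (any(hit)) return(as.character(map$alias[i]))
  }
  # 3. No match: leave the channel name unchanged
  channel
}

# Assumed alias table, for illustration only
alias_map <- data.frame(alias = c("Violet", "Blue"),
                        channels = c("V545,V450", "B710-A"),
                        stringsAsFactors = FALSE)

resolve_alias("V545-A", alias_map)  # partial match on "V545"
resolve_alias("B710-A", alias_map)  # exact match
resolve_alias("B710", alias_map)    # "B710-A" does not match "B710"
```

The last call shows the one-directional nature of the partial match: a longer name in the 'channels' column does not match a shorter name in the file.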

...

ignore.text.offset: whether to ignore the keyword values in the TEXT segment when they do not agree with the HEADER. The default is FALSE, which throws an error when such a discrepancy is found. Set it to TRUE to ignore the TEXT segment values when you are sure the HEADER is accurate, so that the file can still be read.

files

A vector of filenames

Details

The function isFCSfile determines whether its arguments are valid FCS files.

The function read.FCS works with the output of the FACS machine software from a number of vendors (FCS 2.0, FCS 3.0 and List Mode Data, LMD). However, the FCS 3.0 standard includes some options that are not yet implemented in this function. If you need extensions, please let me know. The output of the function is an object of class flowFrame.

For specifications of FCS 3.0 see http://www.isac-net.org and the file ../doc/fcs3.html in the doc directory of the package.

The which.lines argument lets you read a subset of the records, since you might not want to read all of the thousands of events recorded in the FCS file. It is mainly useful when there is not enough memory to read a single FCS file in full (which will probably not happen). It will probably take more time than reading the entire file, because of the multiple disk I/O operations.

Value

isFCSfile returns a logical vector.

read.FCS returns an object of class flowFrame that contains the data in the exprs slot, the parameters monitored in the parameters slot, and the keywords and their values saved in the header of the FCS file in the description slot.

Author(s)

F. Hahne, N. Le Meur

See Also

read.flowSet

Examples


## a sample file
fcsFile <- system.file("extdata", "0877408774.B08", package="flowCore")

## read file and linearize values
samp <-  read.FCS(fcsFile, transformation="linearize")
exprs(samp[1:3,])
keyword(samp)[3:6]
class(samp)

## Only read in lines 2 to 5
subset <- read.FCS(fcsFile, which.lines=2:5, transformation="linearize")
exprs(subset)

## Read in a random sample of 100 lines
subset <- read.FCS(fcsFile, which.lines=100, transformation="linearize")
nrow(subset)

## Manually supply the alias-to-channel mapping as a data.frame
map <- data.frame(alias = c("A", "B"),
                  channels = c("FL2", "FL4"))
fr <- read.FCS(fcsFile, channel_alias = map)
fr


RGLab/flowCore documentation built on Aug. 26, 2024, 8:52 a.m.