fit_beads: Fit multi-level bead data.

View source: R/fit_functions.R

Description

Fit observed means and variances of data generated by a sample of multi-level beads to a quadratic model involving the Poisson distribution expectations for the relation between them.

Usage

    fit_beads(fcs_file_path, scatter_channels, ignore_channels,
        N_peaks, dyes, detectors, bounds,
        signal_type, instrument_name, minimum_useful_peaks = 3, 
        max_iterations = 10, logicle_width = 1.0, ...)
    
    fit_spherotech(fcs_file_path, scatter_channels, ignore_channels,
        dyes, detectors, bounds,
        signal_type, instrument_name, minimum_useful_peaks = 3, 
        max_iterations = 10, logicle_width = 1.0, ...)

    fit_thermo_fisher(fcs_file_path, scatter_channels, ignore_channels,
        dyes, detectors, bounds,
        signal_type, instrument_name, minimum_useful_peaks = 3, 
        max_iterations = 10, logicle_width = 1.0, ...)

Arguments

fcs_file_path

A character string specifying the file path to the FCS file with the acquired bead data.

scatter_channels

A vector of 2 short channel names (values of the $PnN keywords) specifying the 2 channels that should be used to gate the main bead population. The first channel should be a forward scatter channel, the second one should be a side scatter channel.

ignore_channels

A vector of short channel names (values of the $PnN keywords) specifying channels that should not be considered for the fitting procedure. Normally, those should be all non-fluorescence channels, such as the time and the (forward and side) scatter channels.

N_peaks

The number of peaks (different beads) to look for. This argument applies to the fit_beads function only; the fit_spherotech and fit_thermo_fisher functions have the number of peaks predefined to 8 and 6, respectively.

dyes

A vector of dye names. This value does not affect the fitting, but those dyes will be “highlighted” in the provided results.

detectors

A vector of short channel names (values of the $PnN keywords) specifying the channels that match the dyes specified above. The length of this vector shall correspond to the length of the dyes vector. These channels should all be of the same type as specified by the signal_type argument below, i.e., area or height of the measured signal.

bounds

On some instruments, the lowest peaks may be cut off at a data baseline so that the peak statistics will not be valid. Therefore, peaks too close to the baseline need to be excluded from the fitting. Also, many instruments do not maintain good linearity to the full top of scale, so it is also important to specify a maximum level for good linearity and, on each fluorescence channel, exclude any peak that is above that maximum. The bounds argument shall provide a list specifying the minimum and maximum value for the means of valid peaks; peaks with means outside of this range will be ignored for that particular channel.
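As a minimal sketch of how such a range could be applied, the following base R snippet builds a bounds list with the minimum and maximum elements used in the Examples section and flags peak means that fall outside it. The peak_means values and the in_bounds helper are hypothetical and not part of flowQB; treating the limits as inclusive is also an assumption.

```r
# Bounds as a list with minimum and maximum elements
# (same element names as in the Examples section).
bounds <- list(minimum = -100, maximum = 100000)

# Hypothetical peak means for one fluorescence channel (illustrative values).
peak_means <- c(-250, 40, 900, 15000, 98000, 180000)

# Hypothetical helper: TRUE for peaks whose mean lies within the bounds.
# Inclusive limits are an assumption made for this sketch.
in_bounds <- function(means, bounds) {
    means >= bounds$minimum & means <= bounds$maximum
}

valid <- in_bounds(peak_means, bounds)
peak_means[valid]
```

Peaks flagged as invalid here would be dropped for that channel only; the same bead may still contribute a valid peak on another channel.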

signal_type

The type of the signal, specified as "area" or "height". This should match the signal type captured by the channels specified in the detectors argument. The signal type is used to trigger type-specific peak validity checks. Currently, if the signal type equals "height", then peaks with a mean value lower than the lowest peak mean value are omitted from the fitting. In addition, peaks that are not sufficiently narrow (i.e., exceeding a specific maximum CV) are also omitted from the fitting. Currently, the maximum allowed CV is set to 0.65, but the code is designed to make this user-configurable and signal-type dependent eventually.
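The narrowness check can be illustrated with a small base R sketch. It assumes the CV is the standard deviation divided by the mean of the events in a peak; the event values and the cv helper are hypothetical, and only the 0.65 threshold comes from the text above.

```r
# Hypothetical events for two peaks on one channel (illustrative values).
narrow_peak <- c(980, 1000, 1010, 995, 1015)
broad_peak  <- c(100, 900, 1500, 50, 2450)

# Coefficient of variation: standard deviation divided by the mean.
cv <- function(x) sd(x) / mean(x)

maximum_cv <- 0.65  # threshold quoted in the text above

# A peak is kept only if its CV does not exceed the maximum.
keep_narrow <- cv(narrow_peak) <= maximum_cv
keep_broad  <- cv(broad_peak)  <= maximum_cv
```

Both peaks have the same mean, so only their spread decides which one passes the check.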

instrument_name

The make/model of the instrument. The purpose of this argument is to allow for instrument-specific peak validity checks. At this point, if BD Accuri is passed as the instrument name, then peaks with a mean value lower than the lowest peak mean value are omitted from the fitting. Additional instrument-specific peak validity checks may be implemented in the future.

minimum_useful_peaks

Different peaks may be omitted for different channels due to various validity checks described above. This argument specifies the minimal number of valid peaks required in order for the fitting procedure to be performed on a particular fluorescence channel.

max_iterations

The maximum number of iterations for the iterative fitting approach with appropriate weight recalculations.
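The general idea of an iterative fit with weight recalculation can be sketched in base R as follows. This is not the flowQB implementation: it assumes the peak variance is modelled as a quadratic function of the peak mean, and the weighting scheme (inverse squared fitted variance), the convergence test, and all data values are assumptions chosen for illustration.

```r
# Hypothetical peak statistics for one channel (illustrative, noise-free).
peak_mean <- c(50, 200, 800, 3000, 12000, 50000)
peak_var  <- 400 + 2 * peak_mean + 0.0004 * peak_mean^2

max_iterations <- 10
weights <- rep(1, length(peak_mean))  # start with an unweighted fit

for (i in seq_len(max_iterations)) {
    # Weighted quadratic fit of variance against mean.
    fit <- lm(peak_var ~ peak_mean + I(peak_mean^2), weights = weights)
    fitted_var <- pmax(fitted(fit), 1e-8)  # guard against non-positive values
    new_weights <- 1 / fitted_var^2        # assumed weighting scheme
    # Stop early once the weights no longer change.
    if (max(abs(new_weights - weights)) < 1e-12) break
    weights <- new_weights
}

# Constant, linear, and quadratic coefficients of the final fit.
coefs <- unname(coef(fit))
```

The loop stops either when the weights stabilise or when max_iterations is reached, which is the role the argument plays as a safety limit.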

logicle_width

The width parameter for the Logicle transformation. The data clustering part is performed on data transformed with the Logicle transformation. Generally, the Logicle width (w parameter) of 1.0 has been working well for all our data, but users can change the default by providing a different value.

...

Additional arguments that will be passed to the get_peak_statistics function used internally to calculate peak statistics, such as the maximum.cv.area and maximum.cv.height values.

Details

The fit_beads function performs quadratic fitting for multi-level, multi-dye bead sets. In addition, the fit_spherotech function performs fitting for the Sph8 particle sets from Spherotech, and the fit_thermo_fisher function performs fitting for the 6-level (TF6) Thermo Fisher set. Internally, both call the same fit_beads function with the number of expected peaks predefined to 8 and 6, respectively.

The parameters for the bead data fitting functions are similar to those required for the LED fitting. The main difference is that a single FCS file is expected: the bead sets are provided as a mixture of the different populations, so acquiring data from a single sample naturally yields all the peaks within a single FCS file.

All the beads are expected to have the same (or very similar) light scatter properties. Therefore, automated gating is performed on the forward and side scatter channels in order to isolate the main population; the scatter_channels argument specifies which 2 channels are used for this gating.

After the main population is isolated, K-means clustering is used to separate the expression peaks generated by the different beads. The number of clusters is predefined as 8 for the fit_spherotech function and 6 for the fit_thermo_fisher function, and is provided by the user through the N_peaks argument for the fit_beads function. This clustering is performed on data transformed with the Logicle transformation.
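The gate-then-cluster workflow can be sketched with base R on simulated data. Everything here is an illustrative assumption rather than the flowQB code: the channel names, the simulated events, the median/MAD scatter gate, and the use of asinh as a simple stand-in for the Logicle transformation.

```r
set.seed(42)
N_peaks <- 3       # hypothetical number of bead populations
n_per_peak <- 200

# Simulated events: beads share scatter properties, debris does not.
beads <- data.frame(
    FSC = rnorm(N_peaks * n_per_peak, mean = 50000, sd = 2000),
    SSC = rnorm(N_peaks * n_per_peak, mean = 30000, sd = 1500),
    FL1 = rnorm(N_peaks * n_per_peak,
                mean = rep(c(100, 5000, 100000), each = n_per_peak),
                sd   = rep(c(30, 400, 6000), each = n_per_peak))
)
debris <- data.frame(
    FSC = runif(50, 1000, 10000),
    SSC = runif(50, 1000, 8000),
    FL1 = runif(50, 0, 1000)
)
events <- rbind(beads, debris)

# Crude scatter gate (an assumption): keep events within 4 robust standard
# deviations (MAD) of the median on both scatter channels.
near_median <- function(x, k = 4) abs(x - median(x)) <= k * mad(x)
gated <- events[near_median(events$FSC) & near_median(events$SSC), ]

# asinh is used here as a simple stand-in for the Logicle transformation.
transformed <- asinh(gated$FL1 / 150)

# K-means with the number of clusters fixed to the expected number of peaks;
# several random starts reduce the chance of a poor local optimum.
clusters <- kmeans(transformed, centers = N_peaks, nstart = 50)
cluster_sizes <- sort(as.integer(table(clusters$cluster)))
```

Clustering on transformed rather than raw values keeps the dim peaks from being swamped by the much larger spread of the bright ones.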

Value

The value is a list; see the vignette for a detailed description.

Author(s)

Josef Spidlen, Wayne Moore, Faysal El Khettabi

See Also

fit_led

Examples

    library(flowQB)      # provides fit_beads, fit_spherotech, fit_thermo_fisher
    library(flowCore)
    library(xlsx)
    library(flowQBData)

    inst_xlsx_path <- system.file("extdata", 
        "140126_InstEval_Stanford_LSRIIA2.xlsx", package="flowQBData")
    xlsx <- read.xlsx(inst_xlsx_path, 1, headers=FALSE, stringsAsFactors=FALSE)
    
    ignore_channels_row <- 9
    ignore_channels <- vector()
    i <- 1
    while(!is.na(xlsx[[i+4]][[ignore_channels_row]])) {
        ignore_channels[[i]] <- xlsx[[i+4]][[ignore_channels_row]]
        i <- i + 1
    }
    
    instrument_folder_row <- 9
    instrument_folder_col <- 2
    instrument_folder <- xlsx[[instrument_folder_col]][[instrument_folder_row]]
    
    folder_column <- 16
    folder_row <- 14
    folder <- xlsx[[folder_column]][[folder_row]]
    filename <- xlsx[[folder_column]][[folder_row+1]]
    scatter_channels <- c(
        xlsx[[folder_column]][[folder_row+2]], 
        xlsx[[folder_column]][[folder_row+3]])
        
    fcs_file_path <- system.file("extdata", instrument_folder, folder, 
        filename, package="flowQBData")

    bounds_min_col <- 6
    bounds_min_row <- 7
    bounds_max_col <- 7
    bounds_max_row <- 7
    bounds <- list()
    if (is.na(xlsx[[bounds_min_col]][[bounds_min_row]])) {
        bounds$minimum <- -100
    } else {
        bounds$minimum <- as.numeric(xlsx[[bounds_min_col]][[bounds_min_row]])
    }
    if (is.na(xlsx[[bounds_max_col]][[bounds_max_row]])) {
        bounds$maximum <- 100000
    } else {
        bounds$maximum <- as.numeric(xlsx[[bounds_max_col]][[bounds_max_row]])
    }
    signal_type_col <- 3
    signal_type_row <- 19
    signal_type <- xlsx[[signal_type_col]][[signal_type_row]]
    
    instrument_name_col <- 2
    instrument_name_row <- 5
    instrument_name <- xlsx[[instrument_name_col]][[instrument_name_row]]

    channel_cols <- 3:12
    dye_row <- 11
    detector_row <- 13
    dyes <- as.character(xlsx[dye_row,channel_cols])
    detectors <- as.character(xlsx[detector_row,channel_cols])

    multipeak_results <- fit_spherotech(fcs_file_path, scatter_channels, 
        ignore_channels, dyes, detectors, bounds, 
        signal_type, instrument_name, minimum_useful_peaks = 3, 
        max_iterations = 10, logicle_width = 1.0)

    ## The above is the same as this:
    ## N_peaks <- 8
    ## multipeak_results <- fit_beads(fcs_file_path, scatter_channels, 
    ##     ignore_channels, N_peaks, dyes, detectors, bounds, 
    ##     signal_type, instrument_name, minimum_useful_peaks = 3, 
    ##     max_iterations = 10, logicle_width = 1.0)

    plot(
        exprs(multipeak_results$transformed_data[,"FITC-A"]),
        exprs(multipeak_results$transformed_data[,"Pacific Blue-A"]),
        col=multipeak_results$peak_clusters$cluster, pch='.')
        
    ## Thermo-Fisher Example:
    folder_column <- 17
    folder <- xlsx[[folder_column]][[folder_row]]
    filename <- xlsx[[folder_column]][[folder_row+1]]

    fcs_file_path <- system.file("extdata", instrument_folder, folder, 
        filename, package="flowQBData")

    beads_results_tf <- fit_thermo_fisher(fcs_file_path, scatter_channels,
        ignore_channels, dyes, detectors, bounds, 
        signal_type, instrument_name, minimum_useful_peaks = 3, 
        max_iterations = 10, logicle_width = 1.0)
        
    ## The above is the same as this:
    ## N_peaks <- 6
    ## beads_results_tf <- fit_beads(fcs_file_path, scatter_channels,
    ##     ignore_channels, N_peaks, dyes, detectors, bounds, 
    ##     signal_type, instrument_name, minimum_useful_peaks = 3, 
    ##     max_iterations = 10, logicle_width = 1.0)
    
    plot(
        exprs(beads_results_tf$transformed_data[,"FITC-A"]),
        exprs(beads_results_tf$transformed_data[,"Pacific Blue-A"]),
        col=beads_results_tf$peak_clusters$cluster, pch='.')

flowQB documentation built on May 6, 2019, 3:05 a.m.