build_gaussians: Deconvolve profiles into Gaussian mixture models


View source: R/build_gaussians.R

Description

Identify peaks in co-fractionation profiles by deconvolving each profile into a Gaussian mixture model. By default, models are mixtures of between 1 and 5 Gaussians (see max_gaussians). Profiles are pre-processed by filtering and cleaning before Gaussians are fit. Filtering removes profiles with too few non-missing points (min_points) or too few consecutive points after imputation of single missing values (min_consecutive); with the defaults shown under Usage (min_points = 1, min_consecutive = 5), a profile needs at least 5 consecutive points to be retained. Profiles are cleaned by replacing missing values with near-zero noise, imputing single missing values as the mean of neighboring points, and smoothing with a moving average filter.
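The pre-processing described above can be illustrated with a minimal base-R sketch. It is not the package's filter_profiles or clean_profiles code: the helper names (impute_single_na, longest_run, keep_profile, clean_profile_sketch), the noise_floor value, and the edge handling of the moving average are assumptions made for illustration, and only NA is treated as missing.

```r
# Illustrative sketch only; these helpers are hypothetical, not PrInCE functions.

# Impute a single missing value as the mean of its two observed neighbours
impute_single_na <- function(x) {
  n <- length(x)
  if (n < 3) return(x)
  out <- x
  for (i in 2:(n - 1)) {
    if (is.na(x[i]) && !is.na(x[i - 1]) && !is.na(x[i + 1])) {
      out[i] <- mean(c(x[i - 1], x[i + 1]))
    }
  }
  out
}

# Length of the longest run of consecutive non-missing points
longest_run <- function(x) {
  runs <- rle(!is.na(x))
  if (!any(runs$values)) return(0L)
  max(runs$lengths[runs$values])
}

# Keep a profile only if it passes both thresholds
keep_profile <- function(x, min_points = 1, min_consecutive = 5) {
  sum(!is.na(x)) >= min_points && longest_run(x) >= min_consecutive
}

clean_profile_sketch <- function(x, smooth_width = 4, noise_floor = 0.05) {
  x <- impute_single_na(x)
  # Replace remaining missing values with near-zero noise (assumed range)
  missing <- is.na(x)
  x[missing] <- runif(sum(missing), min = 0, max = noise_floor)
  # Moving average over smooth_width points, truncated at the profile edges
  n <- length(x)
  left <- (smooth_width - 1) %/% 2
  right <- smooth_width - 1 - left
  vapply(seq_len(n), function(i) {
    mean(x[max(1, i - left):min(n, i + right)])
  }, numeric(1))
}

set.seed(42)
profile <- c(NA, 1, 2, NA, 4, 5, 3, NA, NA, 1)
passes_raw <- keep_profile(profile)                       # longest run is 3
passes_imputed <- keep_profile(impute_single_na(profile)) # longest run is 6
cleaned <- clean_profile_sketch(profile)
```

In this toy profile, the isolated gap at position 4 is imputed, while the two-point gap at positions 8 and 9 is left for the noise step, which is why the profile only passes the consecutive-points filter after imputation.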

Usage

build_gaussians(
  profile_matrix,
  min_points = 1,
  min_consecutive = 5,
  impute_NA = TRUE,
  smooth = TRUE,
  smooth_width = 4,
  max_gaussians = 5,
  criterion = c("AICc", "AIC", "BIC"),
  max_iterations = 50,
  min_R_squared = 0.5,
  method = c("guess", "random"),
  filter_gaussians_center = TRUE,
  filter_gaussians_height = 0.15,
  filter_gaussians_variance_min = 0.5,
  filter_gaussians_variance_max = 50
)

Arguments

profile_matrix

a numeric matrix of co-elution profiles, with proteins in rows, or a MSnSet object

min_points

remove profiles with fewer than this many non-missing points in total; passed to filter_profiles

min_consecutive

remove profiles with fewer than this many consecutive non-missing points; passed to filter_profiles

impute_NA

if true, impute single missing values with the average of neighboring values; passed to clean_profiles

smooth

if true, smooth the chromatogram with a moving average filter; passed to clean_profiles

smooth_width

width of the moving average filter, in fractions; passed to clean_profiles

max_gaussians

the maximum number of Gaussians to fit; defaults to 5. Note that Gaussian mixtures with more parameters than observed points (i.e., points that are neither zero nor NA) will not be fit. Passed to choose_gaussians

criterion

the criterion to use for model selection; one of "AICc" (corrected AIC, and default), "AIC", or "BIC". Passed to choose_gaussians

max_iterations

the number of times to try fitting the curve with different initial conditions; defaults to 50. Passed to fit_gaussians

min_R_squared

the minimum R-squared value to accept when fitting the curve with different initial conditions; defaults to 0.5. Passed to fit_gaussians

method

the method used to select the initial conditions for nonlinear least squares optimization (one of "guess" or "random"); see make_initial_conditions for details. Passed to fit_gaussians

filter_gaussians_center

true or false: filter Gaussians whose centers fall outside the bounds of the chromatogram. Passed to fit_gaussians

filter_gaussians_height

Gaussians whose heights are below this fraction of the chromatogram height will be filtered. Setting this value to zero disables height-based filtering of fit Gaussians. Passed to fit_gaussians

filter_gaussians_variance_min

Gaussians whose variance falls below this number of fractions will be filtered. Setting this value to zero disables filtering. Passed to fit_gaussians

filter_gaussians_variance_max

Gaussians whose variance is above this number of fractions will be filtered. Setting this value to zero disables filtering. Passed to fit_gaussians
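The four filter_gaussians_* arguments can be pictured as successive filters on a table of fitted Gaussians. The sketch below is an assumption-laden illustration, not the package's code: the function filter_gaussians_sketch, the column names (height, center, variance), the example values, and the choice to treat fractions 1 through n_fractions as the chromatogram bounds are all hypothetical.

```r
# Illustrative sketch only; not the PrInCE implementation.
filter_gaussians_sketch <- function(gaussians, n_fractions, chromatogram_height,
                                    filter_center = TRUE, filter_height = 0.15,
                                    variance_min = 0.5, variance_max = 50) {
  keep <- rep(TRUE, nrow(gaussians))
  # Drop Gaussians centered outside the chromatogram
  if (filter_center) {
    keep <- keep & gaussians$center >= 1 & gaussians$center <= n_fractions
  }
  # Drop Gaussians shorter than a fraction of the chromatogram height
  if (filter_height > 0) {
    keep <- keep & gaussians$height >= filter_height * chromatogram_height
  }
  # Drop Gaussians that are too narrow or too wide; zero disables each filter
  if (variance_min > 0) keep <- keep & gaussians$variance >= variance_min
  if (variance_max > 0) keep <- keep & gaussians$variance <= variance_max
  gaussians[keep, , drop = FALSE]
}

# Made-up Gaussians: each of rows 2 to 4 violates exactly one filter
gaussians <- data.frame(
  height = c(10, 1, 8, 6),
  center = c(12, 20, 55, 30),
  variance = c(4, 3, 5, 80)
)
kept <- filter_gaussians_sketch(gaussians, n_fractions = 50,
                                chromatogram_height = 10)
unfiltered <- filter_gaussians_sketch(gaussians, n_fractions = 50,
                                      chromatogram_height = 10,
                                      filter_center = FALSE, filter_height = 0,
                                      variance_min = 0, variance_max = 0)
```

With the default thresholds only the first Gaussian survives; setting the numeric thresholds to zero and filter_center to FALSE returns all four rows.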

Value

a list of fit Gaussian mixture models, where each item in the list contains the following five fields: the number of Gaussians used to fit the curve; the R^2 of the fit; the number of iterations used to fit the curve with different initial conditions; the coefficients of the fit model; and the curve predicted by the fit model. Profiles that could not be fit by a Gaussian mixture model above the minimum R-squared cutoff will be absent from the returned list.
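To make the relationship between these fields concrete, the sketch below rebuilds a predicted curve from a set of coefficients and scores it with R^2 and an information criterion. The coefficient names (A, mu, sigma), the Gaussian parameterization, the count of three parameters per Gaussian, and the least-squares forms of AIC, AICc, and BIC are assumptions for illustration; the field names and formulas actually used by the package are not stated here, and the candidate coefficients are chosen by hand rather than fit.

```r
# Illustrative sketch only; not the PrInCE implementation.

# Sum of Gaussians with heights A, centers mu, and widths sigma
gaussian_mixture <- function(x, A, mu, sigma) {
  stopifnot(length(A) == length(mu), length(mu) == length(sigma))
  y <- numeric(length(x))
  for (k in seq_along(A)) {
    y <- y + A[k] * exp(-((x - mu[k]) / sigma[k])^2 / 2)
  }
  y
}

# Coefficient of determination of a predicted curve
r_squared <- function(observed, predicted) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

# Least-squares information criteria (additive constants dropped)
info_criterion <- function(rss, n, k, criterion = c("AICc", "AIC", "BIC")) {
  criterion <- match.arg(criterion)
  base <- n * log(rss / n)
  switch(criterion,
    AIC = base + 2 * k,
    AICc = base + 2 * k + (2 * k * (k + 1)) / (n - k - 1),
    BIC = base + k * log(n)
  )
}

# Synthetic two-peak profile with a little noise
set.seed(7)
x <- 1:40
observed <- gaussian_mixture(x, A = c(5, 3), mu = c(12, 28), sigma = c(2, 3)) +
  rnorm(length(x), sd = 0.1)

# Two hand-picked candidate models: one Gaussian versus two
one <- gaussian_mixture(x, A = 5, mu = 12, sigma = 2)
two <- gaussian_mixture(x, A = c(5, 3), mu = c(12, 28), sigma = c(2, 3))

r2 <- c(one = r_squared(observed, one), two = r_squared(observed, two))
aicc <- c(
  one = info_criterion(sum((observed - one)^2), n = length(x), k = 3),
  two = info_criterion(sum((observed - two)^2), n = length(x), k = 6)
)
best <- names(which.min(aicc))
```

The lower criterion value identifies the preferred model, so the penalty for the extra three parameters has to be outweighed by the drop in residual error before a second Gaussian is selected.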

Examples

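library(PrInCE)
# load the example dataset `scott`
data(scott)
# clean the first five profiles, then fit mixtures of up to three Gaussians
mat <- clean_profiles(scott[seq_len(5), ])
gauss <- build_gaussians(mat, max_gaussians = 3)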
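For readers without the package data at hand, the roles of max_iterations and min_R_squared can be sketched with base R's nls on a synthetic profile. This is a rough illustration under stated assumptions, not the package's fit_gaussians: the function fit_one_gaussian, the way initial conditions are jittered around the tallest point, the early stop at the first acceptable fit, and the names of the returned fields are all hypothetical.

```r
# Illustrative sketch only; not the PrInCE implementation.
r_squared <- function(observed, predicted) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

fit_one_gaussian <- function(x, y, max_iterations = 50, min_R_squared = 0.5) {
  best <- NULL
  for (iter in seq_len(max_iterations)) {
    # Jittered initial conditions around the tallest point (assumed strategy)
    start <- list(
      A = max(y) * runif(1, 0.5, 1.5),
      mu = x[which.max(y)] + runif(1, -2, 2),
      sigma = runif(1, 1, 10)
    )
    # nls can fail to converge from a poor start; treat that as a missed try
    fit <- tryCatch(
      nls(y ~ A * exp(-((x - mu) / sigma)^2 / 2), start = start),
      error = function(e) NULL
    )
    if (is.null(fit)) next
    r2 <- r_squared(y, predict(fit))
    if (is.null(best) || r2 > best$R2) {
      best <- list(fit = fit, R2 = r2, iterations = iter)
    }
    # Stop as soon as a fit clears the R-squared cutoff
    if (best$R2 >= min_R_squared) break
  }
  # Profiles that never reach the cutoff yield no model
  if (is.null(best) || best$R2 < min_R_squared) return(NULL)
  best
}

# Synthetic single-peak profile centered on fraction 20
set.seed(1)
x <- 1:50
y <- 3 * exp(-((x - 20) / 4)^2 / 2) + rnorm(length(x), sd = 0.05)
result <- fit_one_gaussian(x, y)
```

Returning NULL when no attempt clears the cutoff mirrors the behavior described under Value, where profiles that cannot be fit above the minimum R-squared are absent from the output.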

fosterlab/PrInCE-R documentation built on Dec. 11, 2020, 3:51 p.m.