pcaClustId: Identify clusters in a PCA plot from a list of co-variates...

Description

Attempts to identify clusters in a PCA from a data frame of co-variates, using partitioning around medoids (PAM) clustering.

Usage

pcaClustId(pcaResult = NULL, peakTable = NULL, coVarTable = NULL,
  obsNames = NULL, ...)

Arguments

pcaResult

a pcaRes class object

peakTable

optional; used if the pcaResult argument is not supplied. Either a data.frame or the full file path, as a character string, to a .csv file of a peak table with observations (samples) in columns and variables (mass spectral signals) in rows. If the argument is not supplied, a GUI file selection window will open and a .csv file can be selected.

coVarTable

either a data.frame or the full file path, as a character string, to a .csv file of a co-variates table with observation (sample) names in the first column and co-variates from the second column onward. If the argument is not supplied, a GUI file selection window will open and a .csv file can be selected.

obsNames

character vector of observation (i.e. sample, QC or blank) names used to identify the appropriate observation (sample) columns.

...

additional arguments to pamk.
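
The table layouts described above can be sketched as follows. This is only an illustration of the expected shapes: every column name and value here (mzmed, rtmed, sampleType, batch and the observation names) is hypothetical and not taken from the package.

```r
# Hypothetical example data; all names and values are illustrative only.
obsNames <- c("sample_1", "sample_2", "QC_1", "blank_1")

# Peak table: variables (mass spectral signals) in rows,
# observations (samples) in columns.
peakTable <- data.frame(
  mzmed = c(101.02, 250.11, 399.30),
  rtmed = c(35.2, 120.7, 301.4),
  sample_1 = c(1500, 320, 80),
  sample_2 = c(1420, 410, 95),
  QC_1 = c(1480, 365, 88),
  blank_1 = c(20, 5, 0)
)

# Co-variate table: observation names in the first column,
# co-variates from the second column onward.
coVarTable <- data.frame(
  name = obsNames,
  sampleType = c("sample", "sample", "QC", "blank"),
  batch = c("A", "B", "A", "B")
)

# Each name in obsNames should match a peak table column and a co-variate row.
stopifnot(all(obsNames %in% colnames(peakTable)))
stopifnot(identical(coVarTable[[1]], obsNames))
```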

Details

Potential clusters in the pcaRes scores are identified using partitioning around medoids clustering (pamk) from the fpc package, which also estimates the number of clusters. Given a data.frame of co-variates/sample information, the most likely explanatory co-variate and/or potential two-factor interaction is then established by linear modelling (lm).
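
As a rough illustration of estimating the number of clusters with PAM, the sketch below fits pam from the cluster package over a range of k and keeps the k with the largest average silhouette width. It is a stand-in for the idea, not the package's own code: the scores matrix is simulated, and kRange, avgSil and bestK are names chosen for this example.

```r
library(cluster)  # provides pam()

set.seed(1)
# Hypothetical PCA scores: two well-separated groups of 10 observations
# on 2 principal components.
scores <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
)

# Fit PAM for each candidate k and record the average silhouette width.
kRange <- 2:5
avgSil <- sapply(kRange, function(k) pam(scores, k)$silinfo$avg.width)

# Keep the k with the widest average silhouette and refit to get memberships.
bestK <- kRange[which.max(avgSil)]
clusters <- pam(scores, bestK)$clustering
```

The clusters vector holds one integer cluster label per observation, which is the kind of response the linear models described here would be fitted against.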

Co-variates containing missing values or only one unique value are not considered. Each linear model has the form response ~ terms, where the response is the cluster membership established by pamk and the terms are the factor levels of a co-variate from the co-variate table. The best explanatory co-variate for the PCA clustering is the one whose linear model has the highest coefficient of determination (R²).
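
The selection rule can be sketched in base R as below. This is a minimal sketch under stated assumptions, not the function's actual implementation: the cluster labels and the co-variate table are made up, only single co-variates are modelled (no two-factor interactions), and the names keep, rSquared and best are chosen for this example.

```r
# Hypothetical inputs: cluster membership (e.g. from pamk) and a co-variate table.
clusters <- rep(c(1, 2), each = 10)
coVars <- data.frame(
  batch = c(rep("A", 9), rep("B", 11)),   # closely tracks the clusters
  sex = rep(c("F", "M"), times = 10),     # unrelated to the clusters
  site = "X",                             # only one unique value
  diet = c(NA, rep(c("hi", "lo"), length.out = 19)),  # contains a missing value
  stringsAsFactors = TRUE
)

# Drop co-variates with missing values or only one unique value.
keep <- sapply(coVars, function(x) !anyNA(x) && length(unique(x)) > 1)
coVars <- coVars[, keep, drop = FALSE]

# Fit clusters ~ co-variate for each remaining column and collect R-squared.
rSquared <- sapply(names(coVars), function(v) {
  fit <- lm(clusters ~ coVars[[v]])
  summary(fit)$r.squared
})

# The co-variate whose model has the highest R-squared.
best <- names(which.max(rSquared))
```

Here site and diet are filtered out before modelling, and best names the remaining co-variate with the largest R-squared.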

Value

a list containing three elements:

1. a list, pamkClust, containing the three elements returned from pamk.

2. a list, lmCoVarClust, containing the linear models obtained from lm.

3. a named character vector, rSquaredLm, containing the coefficients of determination from the linear models, named with each co-variate or two-factor interaction considered.

See Also

lowess, na.spline, pcaOutId.


WMBEdmands/MetMSLine documentation built on May 9, 2019, 10:03 p.m.