analyzeTPPCCR: Analyze TPP-CCR experiment

View source: R/analyzeTPPCCR.R

Description

Performs analysis of a TPP-CCR experiment by invoking routines for data import, data processing, normalization, curve fitting, and production of the result table.

Usage

analyzeTPPCCR(
  configTable,
  data = NULL,
  resultPath = NULL,
  idVar = "gene_name",
  fcStr = "rel_fc_",
  naStrs = c("NA", "n/d", "NaN", "<NA>"),
  qualColName = "qupm",
  normalize = TRUE,
  ggplotTheme = tppDefaultTheme(),
  nCores = "max",
  nonZeroCols = "qssm",
  r2Cutoff = 0.8,
  fcCutoff = 1.5,
  slopeBounds = c(1, 50),
  plotCurves = TRUE,
  verbose = FALSE,
  xlsxExport = TRUE,
  fcTolerance = 0.1
)

Arguments

configTable

data frame, or character string with the path to a file, that specifies important details of the TPP-CCR experiment. See the Details section for instructions on how to create this object.

data

single dataframe, containing fold change measurements and additional annotation columns to be imported. Can be used instead of specifying the file path in the configTable argument.

resultPath

location where to store dose-response curve plots and results table.

idVar

character string indicating which data column provides the unique identifiers for each protein.

fcStr

character string indicating which columns contain the actual fold change values. Columns whose names contain the string fcStr are treated as fold change columns.
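As a rough illustration of this kind of column selection (not the package's actual implementation), the sketch below picks fold change columns from a hypothetical data frame by matching fcStr against the column names. The data frame and its column names are made up for the example.

```r
# Hypothetical data table; names and values are illustrative only.
exampleData <- data.frame(
  gene_name   = c("HDAC1", "HDAC2"),
  qupm        = c(5, 3),
  rel_fc_126  = c(1.00, 1.00),
  rel_fc_127L = c(0.95, 1.10),
  stringsAsFactors = FALSE
)

fcStr <- "rel_fc_"

# fixed = TRUE matches fcStr as a literal string, not as a regular expression.
fcCols <- grep(fcStr, colnames(exampleData), fixed = TRUE, value = TRUE)
print(fcCols)
```

Columns such as gene_name and qupm do not contain the string and are therefore treated as annotation columns in this sketch.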

naStrs

character vector indicating missing values in the data table. When reading data from file, this value will be passed on to the argument na.strings in function read.delim.
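The effect of na.strings can be seen with read.delim alone. The sketch below reads a small, made-up tab-separated table from a string (the text argument is passed on to read.table) and shows that the listed markers become NA.

```r
naStrs <- c("NA", "n/d", "NaN", "<NA>")

# Made-up tab-separated input with two different missing-value markers.
rawText <- paste(
  "gene_name\trel_fc_126\trel_fc_127L",
  "HDAC1\t1.00\tn/d",
  "HDAC2\tNaN\t1.10",
  sep = "\n"
)

imported <- read.delim(text = rawText, na.strings = naStrs,
                       stringsAsFactors = FALSE)

# Both fold change columns stay numeric; the markers are read as NA.
print(imported)
```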

qualColName

character string indicating which column can be used for additional quality criteria when deciding between different non-unique protein identifiers.
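One way to picture duplicate removal by a quality column is sketched below in base R. It assumes that, among rows sharing an identifier, the row with the highest quality value is kept; this selection rule is an assumption made for illustration, and the data are made up.

```r
# Hypothetical table with a duplicated identifier.
exampleData <- data.frame(
  gene_name = c("HDAC1", "HDAC1", "HDAC2"),
  qupm      = c(2, 7, 4),
  stringsAsFactors = FALSE
)

idVar       <- "gene_name"
qualColName <- "qupm"

# Sort by identifier, then by descending quality (assumed: higher is better).
sorted <- exampleData[order(exampleData[[idVar]],
                            -exampleData[[qualColName]]), ]

# Keep only the first (best) row for each identifier.
deduplicated <- sorted[!duplicated(sorted[[idVar]]), ]
print(deduplicated)
```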

normalize

perform median normalization (default: TRUE).

ggplotTheme

ggplot theme for dose response curve plots.

nCores

either a numerical value giving the desired number of CPUs, or 'max' to automatically assign the maximum possible number (default).

nonZeroCols

character string indicating a column that will be used for filtering out zero values.
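A minimal base R sketch of such a zero-value filter is shown below. It is an illustration on made-up data, not the package's own code: rows are dropped when any column named in nonZeroCols contains a zero.

```r
# Hypothetical table; protein "B" has zero spectrum matches in 'qssm'.
exampleData <- data.frame(
  gene_name = c("A", "B", "C"),
  qssm      = c(4, 0, 9),
  stringsAsFactors = FALSE
)

nonZeroCols <- "qssm"

# Count zeros per row across the selected column(s); keep rows without any.
zeroCount <- rowSums(exampleData[, nonZeroCols, drop = FALSE] == 0,
                     na.rm = TRUE)
filtered <- exampleData[zeroCount == 0, ]
print(filtered)
```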

r2Cutoff

Quality criterion on dose response curve fit.

fcCutoff

Cutoff for highest compound concentration fold change.

slopeBounds

Bounds on the slope parameter for dose response curve fitting.

plotCurves

boolean value indicating whether dose response curves should be plotted. Deactivating plotting decreases runtime.

verbose

print name of each fitted or plotted protein to the command line as a means of progress report.

xlsxExport

produce results table in xlsx format and store at the location specified by the resultPath argument.

fcTolerance

tolerance for the fcCutoff parameter. See Details.

Details

Invokes the following steps:

  1. Import data using the tppccrImport function.

  2. Perform normalization by fold change medians (optional) using the tppccrNormalize function. To perform normalization, set argument normalize=TRUE.

  3. Fit and analyze dose response curves using the tppccrCurveFit function.

  4. Export results to Excel using the tppExport function.
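To give an idea of what normalization by fold change medians means in step 2, the sketch below divides each fold change column of a made-up matrix by its column median, so that every column has a median of 1 afterwards. This is one plausible reading of the step, written in base R; it does not reproduce tppccrNormalize.

```r
# Made-up fold change matrix: rows are proteins, columns are label channels.
fcMatrix <- cbind(
  rel_fc_126  = c(1.0, 1.0, 1.0),
  rel_fc_127L = c(0.8, 1.0, 1.2),
  rel_fc_127H = c(1.1, 2.2, 3.3)
)

# Median of each column, ignoring missing values.
colMedians <- apply(fcMatrix, 2, median, na.rm = TRUE)

# Divide every column by its own median.
normalized <- sweep(fcMatrix, 2, colMedians, "/")
print(apply(normalized, 2, median))
```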

The default settings are tailored to the output of the Python package isobarQuant, but can be adapted to your own dataset through the arguments idVar, fcStr, naStrs, and qualColName.

If resultPath is not specified, result files are stored at the path defined in the first entry of configTable$Path. If the input data are not specified in configTable, no result path will be set. This means that no output files or dose response curve plots are produced and analyzeTPPCCR just returns the results as a data frame.

The function analyzeTPPCCR reports intermediate results to the command line. To suppress this, use suppressMessages.

The dose response curve plots will be stored in a subfolder with name DoseResponse_Curves at the location specified by resultPath.

Only proteins with fold changes greater than fcCutoff * (1 - fcTolerance) or less than 1/(fcCutoff * (1 - fcTolerance)) will be used for curve fitting. Additionally, the proteins fulfilling the fcCutoff criterion without tolerance will be marked in the output column meets_FC_requirement.
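The arithmetic behind these thresholds can be worked through with the default values fcCutoff = 1.5 and fcTolerance = 0.1. The sketch below applies both criteria to made-up fold changes at the highest compound concentration; the protein names, values, and variable names are hypothetical, and the code only mirrors the rule as described here.

```r
fcCutoff    <- 1.5
fcTolerance <- 0.1

# Thresholds with tolerance: 1.5 * 0.9 = 1.35 and 1 / 1.35 (about 0.741).
upperThreshold <- fcCutoff * (1 - fcTolerance)
lowerThreshold <- 1 / upperThreshold

# Made-up fold changes at the highest compound concentration.
maxConcFC <- c(P1 = 1.60, P2 = 1.40, P3 = 1.10, P4 = 0.70, P5 = 0.60)

# Criterion with tolerance: decides which proteins enter curve fitting.
usedForFit <- maxConcFC > upperThreshold | maxConcFC < lowerThreshold

# Criterion without tolerance: corresponds to the meets_FC_requirement flag.
meetsFC <- maxConcFC > fcCutoff | maxConcFC < 1 / fcCutoff

print(data.frame(maxConcFC, usedForFit, meetsFC))
```

In this example P2 and P4 pass only because of the tolerance: they would be fitted but not flagged as meeting the strict fcCutoff requirement.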

Value

A data frame in which the fit results are stored row-wise for each protein.

References

Savitski, M. M., Reinhard, F. B., Franken, H., Werner, T., Savitski, M. F., Eberhard, D., ... & Drewes, G. (2014). Tracking cancer drugs in living cells by thermal profiling of the proteome. Science, 346(6205), 1255784.

Franken, H., Mathieson, T., Childs, D., Sweetman, G., Werner, T., Huber, W., & Savitski, M. M. (2015). Thermal proteome profiling for unbiased identification of drug targets and detection of downstream effectors. Nature Protocols, 10(10), 1567-1593.

See Also

tppDefaultTheme

Examples

data(hdacCCR_smallExample)
tppccrResults <- analyzeTPPCCR(configTable=hdacCCR_config, 
                               data=hdacCCR_data, nCores=1)
  

Example output

Loading required package: dplyr

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Loading required package: magrittr
Loading required package: tidyr

Attaching package: 'tidyr'

The following object is masked from 'package:magrittr':

    extract

This is TPP version 3.4.3.
Importing data...

The following valid label columns were detected:
126, 127L, 127H, 128L, 128H, 129L, 129H, 130L, 130H, 131L.

Importing CCR dataset: Panobinostat_1
Removing duplicate identifiers using quality column 'qupm'...
507 out of 507 rows kept for further analysis.
  -> Panobinostat_1 contains 507 proteins.
  -> 494 out of 507 proteins (97.44%) suitable for curve fit (criterion: > 2 valid fold changes per protein).

Importing CCR dataset: Panobinostat_2
Removing duplicate identifiers using quality column 'qupm'...
507 out of 507 rows kept for further analysis.
  -> Panobinostat_2 contains 507 proteins.
  -> 494 out of 507 proteins (97.44%) suitable for curve fit (criterion: > 2 valid fold changes per protein).


Filtering CCR dataset: Panobinostat_1
Removed proteins with zero values in column(s) 'qssm':
	494 out of 507 proteins remaining.
Filtering CCR dataset: Panobinostat_2
Removed proteins with zero values in column(s) 'qssm':
	494 out of 507 proteins remaining.
No output directory specified. No result files or plots will be produced.
Normalizing dataset: Panobinostat_1
Normalizing dataset: Panobinostat_2
Normalization complete.

Normalizing dataset: Panobinostat_1 to reference column 1
Normalizing dataset: Panobinostat_2 to reference column 1
Transforming dataset: Panobinostat_1
Transforming dataset: Panobinostat_2
Transformation complete.

Fitting 169 individual dose response curves to 134 proteins.
Runtime (1 CPUs used): 2.17 secs

Dose response curves fitted sucessfully!
169 out of 169 models with sufficient data points converged (100 %).

Results table created successfully.

Cannot produce xlsx output because no result path is specified.

TPP documentation built on Nov. 8, 2020, 5:55 p.m.