designSampleSizeClassificationPlots: Visualization for sample size calculation in classification
In MSstatsSampleSize: Simulation tool for optimal design of high-dimensional MS-based proteomics experiment

Description Usage Arguments Details Value Author(s) Examples

View source: R/designSampleSizeClassificationPlots.R

Illustrates the mean classification accuracy and protein importance under different sample sizes with a predictive accuracy plot and a protein importance plot.

designSampleSizeClassificationPlots(
  data,
  list_samples_per_group,
  num_important_proteins_show = 10,
  protein_importance_plot = TRUE,
  predictive_accuracy_plot = TRUE,
  x.axis.size = 10,
  y.axis.size = 10,
  protein_importance_plot_width = 3,
  protein_importance_plot_height = 3,
  predictive_accuracy_plot_width = 4,
  predictive_accuracy_plot_height = 4,
  ylimUp_predictive_accuracy = 1,
  ylimDown_predictive_accuracy = 0,
  address = ""
)

`data`	A list of outputs from the function `designSampleSizeClassification`. Each element holds the results for one sample size. The input should include at least two simulation results with different sample sizes.
`list_samples_per_group`	A vector of the simulated sample sizes. This argument is required. The number of simulated sample sizes in the input ‘data’ should equal the length of ‘list_samples_per_group’.
`num_important_proteins_show`	The number of proteins to show in protein importance plot.
`protein_importance_plot`	TRUE (default) draws the protein importance plot.
`predictive_accuracy_plot`	TRUE (default) draws the predictive accuracy plot.
`x.axis.size`	Size of x-axis labels in the predictive accuracy plot and the protein importance plot. Default is 10.
`y.axis.size`	Size of y-axis labels in the predictive accuracy plot and the protein importance plot. Default is 10.
`protein_importance_plot_width`	Width of the saved pdf file for protein importance plot. Default is 3.
`protein_importance_plot_height`	Height of the saved pdf file for protein importance plot. Default is 3.
`predictive_accuracy_plot_width`	Width of the saved pdf file for predictive accuracy plot. Default is 4.
`predictive_accuracy_plot_height`	Height of the saved pdf file for predictive accuracy plot. Default is 4.
`ylimUp_predictive_accuracy`	The upper limit of y-axis for predictive accuracy plot. Default is 1. The range should be 0 to 1.
`ylimDown_predictive_accuracy`	The lower limit of y-axis for predictive accuracy plot. Default is 0. The range should be 0 to 1.
`address`	The name of the folder that will store the results. The default folder is the current working directory. Any other folder must already exist under the current working directory. The output pdf files are created automatically with the default names ‘PredictiveAccuracyPlot.pdf’ and ‘ProteinImportancePlot.pdf’. The `address` argument specifies where to store the files and can also modify the beginning of the file names. If address=FALSE, the plots are not saved as pdf files but are shown in the plotting window instead.
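The constraints stated in the argument descriptions above can be checked before plotting. The following is a minimal sketch in base R; `check_plot_inputs` is a hypothetical helper written for illustration and is not part of MSstatsSampleSize.

```r
# Hypothetical helper (an assumption, not a package function): verifies the
# constraints that the argument descriptions place on the inputs.
check_plot_inputs <- function(data,
                              list_samples_per_group,
                              ylimUp_predictive_accuracy = 1,
                              ylimDown_predictive_accuracy = 0) {
    # 'data' should hold at least two simulation results
    if (!is.list(data) || length(data) < 2) {
        stop("'data' should be a list with at least two simulation results.")
    }
    # one simulated sample size per element of 'data'
    if (length(data) != length(list_samples_per_group)) {
        stop("length(data) should equal length(list_samples_per_group).")
    }
    # both y-axis limits should lie between 0 and 1
    limits <- c(ylimDown_predictive_accuracy, ylimUp_predictive_accuracy)
    if (any(limits < 0 | limits > 1)) {
        stop("The y-axis limits should be in the range 0 to 1.")
    }
    invisible(TRUE)
}
```

Calling `check_plot_inputs(data = multiple_sample_sizes, list_samples_per_group = list_samples_per_group)` before the plotting call would surface a length mismatch early, with a message that names the offending argument.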

This function visualizes the results of sample size calculation in classification. The mean predictive accuracy and mean protein importance under each sample size are taken from the input ‘data’, which is the output of the function designSampleSizeClassification.

To illustrate the mean predictive accuracy and protein importance under different sample sizes, the function generates two types of plots as pdf files:

(1) The predictive accuracy plot. The x-axis represents the different sample sizes and the y-axis represents the mean predictive accuracy. The reported sample size per condition can be used to design future experiments.

(2) The protein importance plot, which includes multiple subplots. The number of subplots equals the length of ‘list_samples_per_group’. Each subplot shows the top ‘num_important_proteins_show’ most important proteins under one sample size. The y-axis of each subplot is the protein name and the x-axis is the mean protein importance under that sample size.
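The layout of the predictive accuracy plot described above can be sketched with base R graphics. This is only an illustration of the axes and the pdf output, not the package's own plotting code; the accuracy values are made-up placeholders, and the file name `PredictiveAccuracyPlot_sketch.pdf` is an assumption chosen for this example.

```r
# Placeholder values for illustration only; these are NOT simulation results.
list_samples_per_group <- c(10, 25, 50, 100)
mean_accuracy <- c(0.60, 0.70, 0.75, 0.78)

# Write the sketch to a temporary folder; width and height mirror the
# documented defaults of predictive_accuracy_plot_width/height.
out_file <- file.path(tempdir(), "PredictiveAccuracyPlot_sketch.pdf")
pdf(out_file, width = 4, height = 4)
plot(list_samples_per_group, mean_accuracy,
     type = "b",                 # points connected by lines
     ylim = c(0, 1),             # ylimDown / ylimUp defaults
     xlab = "Simulated sample size per group",
     ylab = "Mean predictive accuracy")
dev.off()
```

A curve that flattens as the sample size grows suggests that additional replicates add little predictive accuracy, which is how such a plot can inform the choice of sample size.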

The predictive accuracy plot shows the mean predictive accuracy under different sample sizes. The x-axis represents the different sample sizes and the y-axis represents the mean predictive accuracy.

The protein importance plot includes multiple subplots. The number of subplots equals the length of ‘list_samples_per_group’. Each subplot shows the top ‘num_important_proteins_show’ most important proteins under one sample size. The y-axis of each subplot is the protein name and the x-axis is the mean protein importance under that sample size.
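Selecting the top ‘num_important_proteins_show’ proteins for one subplot amounts to ranking proteins by mean importance and keeping the first few. A minimal base R sketch follows; the named vector `mean_importance`, its protein names, and its values are hypothetical placeholders, and the actual structure of the `designSampleSizeClassification` output may differ.

```r
# Hypothetical mean importance scores for one sample size (placeholder values).
mean_importance <- c(ProteinA = 0.8, ProteinB = 2.5,
                     ProteinC = 1.2, ProteinD = 0.1)
num_important_proteins_show <- 2

# Rank proteins from most to least important and keep the top ones;
# sort() preserves the names of a named vector.
top_proteins <- head(sort(mean_importance, decreasing = TRUE),
                     num_important_proteins_show)
print(top_proteins)
```

Repeating this ranking for each simulated sample size gives the content of each subplot.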

Ting Huang, Meena Choi, Olga Vitek.

data(OV_SRM_train)
data(OV_SRM_train_annotation)

# simulate different sample sizes
# 1) 10 biological replicates per group
# 2) 25 biological replicates per group
# 3) 50 biological replicates per group
# 4) 100 biological replicates per group
list_samples_per_group <- c(10, 25, 50, 100)

# save the simulation results under each sample size
multiple_sample_sizes <- list()
for(i in seq_along(list_samples_per_group)){
    # run simulation for each sample size
    simulated_datasets <- simulateDataset(data = OV_SRM_train,
                                          annotation = OV_SRM_train_annotation,
                                          num_simulations = 10, # simulate 10 times
                                          expected_FC = "data",
                                          list_diff_proteins =  NULL,
                                          select_simulated_proteins = "proportion",
                                          protein_proportion = 1.0,
                                          protein_number = 1000,
                                          samples_per_group = list_samples_per_group[i],
                                          simulate_valid = FALSE,
                                          valid_samples_per_group = 50)

    # run classification performance estimation for each sample size
    res <- designSampleSizeClassification(simulations = simulated_datasets,
                                          parallel = TRUE)

    # save results
    multiple_sample_sizes[[i]] <- res
}

## make the plots
designSampleSizeClassificationPlots(data = multiple_sample_sizes,
                                    list_samples_per_group = list_samples_per_group)

MSstatsSampleSize documentation built on Nov. 8, 2020, 4:53 p.m.
