get_statistics_from_dataFrame: get_statistics_from_dataFrame


View source: R/selfea.R

Description

This function computes Cohen's f, f2 and w, together with adjusted p-values from a quasi-Poisson GLM, a negative binomial GLM, and a Normal-distribution ANOVA.

Usage

get_statistics_from_dataFrame(df_contrast, df_group, padj = "fdr")

Arguments

df_contrast

A data frame consisting of an 'ID' column followed by the expression profile (all columns after 'ID'). Values in the 'ID' column must be unique, and so must the names of the expression columns. Only positive numbers are allowed in the expression data. Here is an example:

ID Y500U100_001 Y500U100_002 Y500U200_001 Y500U200_002
YKL060C 151 195 188 184
YDR155C 154 244 237 232
YOL086C 64 89 128 109
YJR104C 161 155 158 172
YGR192C 157 161 173 175
YLR150W 96 109 113 115
YPL037C 23 28 27 27
YNL007C 53 58 64 63
YBR072W 52 53 54 44
YDR418W_1 76 53 62 74
df_group

A data frame consisting of 'Col_Name' and 'Group' columns. This parameter maps the expression columns of df_contrast to experimental groups. 'Col_Name' must correspond to the expression column names of df_contrast, and 'Group' gives the experimental group of each of those columns. Here is an example; read it together with the df_contrast example:

Col_Name Group
Y500U100_001 U100
Y500U100_002 U100
Y500U200_001 U200
Y500U200_002 U200
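
As a rough sketch, the two inputs described above could be built in base R as follows. The rows are taken from the example tables; the sanity checks at the end are not part of selfea and only mirror the requirements stated in the argument descriptions.

```r
# Expression profile: an 'ID' column followed by one column per experiment.
df_contrast <- data.frame(
  ID = c("YKL060C", "YDR155C", "YOL086C"),
  Y500U100_001 = c(151, 154, 64),
  Y500U100_002 = c(195, 244, 89),
  Y500U200_001 = c(188, 237, 128),
  Y500U200_002 = c(184, 232, 109),
  stringsAsFactors = FALSE
)

# Group table: one row per expression column of df_contrast.
df_group <- data.frame(
  Col_Name = c("Y500U100_001", "Y500U100_002", "Y500U200_001", "Y500U200_002"),
  Group = c("U100", "U100", "U200", "U200"),
  stringsAsFactors = FALSE
)

# Illustrative checks (an assumption about useful validation, not selfea code):
# unique IDs, unique column names, matching Col_Name entries, positive values.
expr_cols <- setdiff(names(df_contrast), "ID")
stopifnot(
  !anyDuplicated(df_contrast$ID),
  !anyDuplicated(expr_cols),
  setequal(expr_cols, df_group$Col_Name),
  all(as.matrix(df_contrast[expr_cols]) > 0)
)
```
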
padj

One of c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"). The default is "fdr". These are the same methods accepted by p.adjust.
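
To give a feel for what the padj choice changes, here is a small base R sketch that applies p.adjust directly. The raw p-values are hypothetical and chosen only for illustration; how selfea calls p.adjust internally is not shown here.

```r
# Hypothetical raw p-values, for illustration only.
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.3)

# "fdr" is an alias for the Benjamini-Hochberg ("BH") method in p.adjust.
p_fdr  <- p.adjust(p_raw, method = "fdr")
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_none <- p.adjust(p_raw, method = "none")

# Side-by-side comparison of the raw and adjusted values.
print(data.frame(p_raw, p_fdr, p_bonf, p_none))
```

With "bonferroni" each p-value is multiplied by the number of tests and capped at 1, so it is the more conservative of the two corrections shown; "none" leaves the values unchanged.
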

Value

A list that consists of the following items:

$data_table A data frame that contains the statistics for each ID
$min_rep Minimum number of replicates in your group information.
$max_rep Maximum number of replicates in your group information.
$nt The number of total experiments in your expression profile.
$ng The number of groups in your group information.
$method_pvalue_adjustment The selected method for p-value adjustment
data_table's elements
Cohens_W Cohen's w
Cohens_F Cohen's f
Cohens_F2 Cohen's f2
Max_FC Maximum fold change among all the possible group pairs
QP_Pval_adjusted Adjusted p-value from GLM quasi-Poisson
NB_Pval_adjusted Adjusted p-value from GLM negative binomial
Normal_Pval_adjusted Adjusted p-value from Normal ANOVA
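
The returned list can be explored with ordinary list and data frame operations. The sketch below uses a mock object that only reuses the element and column names documented above; every number in it is made up for illustration and is not selfea output. The 0.05 cutoff is likewise just an example threshold.

```r
# Mock object with the documented element names; all values are invented
# for illustration and are NOT results computed by selfea.
list_result <- list(
  data_table = data.frame(
    Cohens_W = c(0.10, 0.45, 0.80),
    Cohens_F = c(0.20, 0.90, 1.60),
    Cohens_F2 = c(0.04, 0.81, 2.56),
    Max_FC = c(1.1, 1.8, 3.2),
    QP_Pval_adjusted = c(0.60, 0.04, 0.001),
    NB_Pval_adjusted = c(0.55, 0.06, 0.002),
    Normal_Pval_adjusted = c(0.70, 0.05, 0.003)
  ),
  min_rep = 2, max_rep = 2, nt = 4, ng = 2,
  method_pvalue_adjustment = "fdr"
)

stats <- list_result$data_table

# Keep rows whose quasi-Poisson adjusted p-value is at most 0.05.
hits <- stats[stats$QP_Pval_adjusted <= 0.05, ]
print(hits[, c("Cohens_F", "Max_FC", "QP_Pval_adjusted")])
```
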

Examples

library(selfea)

## Test selfea for single protein expression
values <- c(6,8,10,29,26,22)
groups <- c("U200","U200","U200","U600","U600","U600")
experiments <- c("exp1","exp2","exp3","exp4","exp5","exp6")

df_expr <- data.frame(ID="Protein_1",exp1=6,exp2=8,exp3=10,exp4=29,exp5=26,exp6=22)
df_group <- data.frame(Col_Name=experiments,Group=groups)
list_result <- get_statistics_from_dataFrame(df_expr,df_group)
top_table(list_result)

## For this example we will import Gregori data
## Josep Gregori, Laura Villareal, Alex Sanchez, Jose Baselga, Josep Villanueva (2013).
## An Effect Size Filter Improves the Reproducibility
## in Spectral Counting-based Comparative Proteomics.
## Journal of Proteomics, DOI http://dx.doi.org/10.1016/j.jprot.2013.05.030

## Description:
## Each sample consists of 500ng of standard yeast lysate spiked with
## 100, 200, 400 and 600fm of a mix of 48 equimolar human proteins (UPS1, Sigma-Aldrich).
## The dataset contains a different number of technical replicates of each sample

## import Gregori data
data(example_data1)
df_contrast <- example_data
df_group <- example_group

## Get statistics through 'get_statistics_from_dataFrame' function
list_result <- get_statistics_from_dataFrame(df_contrast,df_group)

## Get significant features (adjusted p-value <= 0.05 and power >= 0.90)
significant_qpf <- top_table(list_result,pvalue=0.05,power_desired=0.90,method='QPF')

selfea documentation built on May 2, 2019, 5:08 a.m.