get_statistics_from_dataFrame: get_statistics_from_dataFrame
In selfea: Select Features Reliably with Cohen's Effect Sizes

This function computes Cohen's f, f2 and w, together with adjusted p-values from a quasi-Poisson GLM, a negative binomial GLM and a Normal-distribution ANOVA.

get_statistics_from_dataFrame(df_contrast, df_group, padj = "fdr")

df_contrast

A data frame that consists of an 'ID' column followed by the expression profile (all columns after the 'ID' column). Values in the 'ID' column should be unique, and the column names after the 'ID' column should be unique. Only positive numbers are allowed in the expression data. Here is an example.

ID	Y500U100_001	Y500U100_002	Y500U200_001	Y500U200_002
YKL060C	151	195	188	184
YDR155C	154	244	237	232
YOL086C	64	89	128	109
YJR104C	161	155	158	172
YGR192C	157	161	173	175
YLR150W	96	109	113	115
YPL037C	23	28	27	27
YNL007C	53	58	64	63
YBR072W	52	53	54	44
YDR418W_1	76	53	62	74
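
The constraints listed for df_contrast can be checked before calling the function. The following is a minimal sketch in base R: check_contrast is a hypothetical helper, not part of selfea, and it assumes that 'ID' is the first column and that "positive" means strictly greater than zero.

```r
# Hypothetical helper (not part of selfea): checks the df_contrast constraints
# described above. Assumes 'ID' is the first column and that "positive"
# means strictly greater than zero.
check_contrast <- function(df) {
  stopifnot(
    names(df)[1] == "ID",                          # 'ID' column comes first
    !anyDuplicated(df$ID),                         # IDs are unique
    !anyDuplicated(names(df)),                     # column names are unique
    all(vapply(df[-1], is.numeric, logical(1))),   # expression columns are numeric
    all(as.matrix(df[-1]) > 0)                     # only positive values
  )
  invisible(TRUE)
}

# First three rows of the example table above
df_contrast <- data.frame(
  ID = c("YKL060C", "YDR155C", "YOL086C"),
  Y500U100_001 = c(151, 154, 64),
  Y500U100_002 = c(195, 244, 89),
  Y500U200_001 = c(188, 237, 128),
  Y500U200_002 = c(184, 232, 109)
)

check_contrast(df_contrast)
```

If any condition fails, stopifnot stops with an error naming the failed check, so problems surface before the statistics are computed.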

df_group

A data frame that consists of 'Col_Name' and 'Group' columns. This parameter matches experiment groups to the expression profiles of df_contrast. 'Col_Name' should correspond to the column names of the expression profile in df_contrast. The 'Group' column holds the experiment information for those columns. Here is an example. See it together with the example of df_contrast.

Col_Name	Group
Y500U100_001	U100
Y500U100_002	U100
Y500U200_001	U200
Y500U200_002	U200
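
As a rough sketch of how the two inputs fit together, the base R code below builds the df_group example and confirms that every expression column has exactly one group entry. The variable names expr_cols, missing_cols and extra_cols are illustrative only.

```r
# Group table from the example above
df_group <- data.frame(
  Col_Name = c("Y500U100_001", "Y500U100_002", "Y500U200_001", "Y500U200_002"),
  Group    = c("U100", "U100", "U200", "U200")
)

# Expression column names of the df_contrast example (everything after 'ID')
expr_cols <- c("Y500U100_001", "Y500U100_002", "Y500U200_001", "Y500U200_002")

# Columns without a group, and group entries without a column
missing_cols <- setdiff(expr_cols, df_group$Col_Name)
extra_cols   <- setdiff(df_group$Col_Name, expr_cols)
stopifnot(length(missing_cols) == 0, length(extra_cols) == 0)

# Each column should be assigned to a group only once
stopifnot(!anyDuplicated(df_group$Col_Name))

# Number of replicates per group
reps <- table(df_group$Group)
print(reps)
```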

padj

Choose one of c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none"). "fdr" is the default. The options are the same as those of the method argument of p.adjust.
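
To get a feel for what the padj options do, p.adjust from the stats package can be applied directly to a vector of p-values. The p-values below are made up for illustration only.

```r
# Made-up raw p-values, for illustration only
pvals <- c(0.001, 0.012, 0.03, 0.04, 0.2)

# The accepted method names are the ones stored in p.adjust.methods
print(p.adjust.methods)

# In p.adjust, "fdr" is an alias for the Benjamini-Hochberg method ("BH")
p_fdr <- p.adjust(pvals, method = "fdr")
p_bh  <- p.adjust(pvals, method = "BH")
print(identical(p_fdr, p_bh))

# "none" returns the p-values unchanged; "bonferroni" is the most conservative
p_none <- p.adjust(pvals, method = "none")
p_bonf <- p.adjust(pvals, method = "bonferroni")
print(rbind(raw = pvals, fdr = p_fdr, bonferroni = p_bonf))
```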

A list that consists of the following items:

$data_table	A data frame that has statistics for each ID
$min_rep	Common number of replicates in your group information.
$max_rep	Maximum number of replicates in your group information.
$nt	The number of total experiments in your expression profile.
$ng	The number of groups in your group information.
$method_pvalue_adjustment	The selected method for p-value adjustment

data_table's elements
Cohens_W	Cohen's w
Cohens_F	Cohen's f
Cohens_F2	Cohen's f2
Max_FC	Maximum fold change among all the possible group pairs
QP_Pval_adjusted	Adjusted p-value from GLM quasi-Poisson
NB_Pval_adjusted	Adjusted p-value from GLM negative binomial
Normal_Pval_adjusted	Adjusted p-value from Normal ANOVA
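
To make two of these columns more concrete, here is a minimal base R sketch for a single feature. It is not selfea's implementation: it assumes fold change is the ratio of group means (larger over smaller), and it uses the textbook one-way ANOVA definition of Cohen's f (f2 = eta squared / (1 - eta squared)), whereas selfea may derive its effect sizes from the fitted models instead.

```r
# One feature measured in two groups with three replicates each
values <- c(6, 8, 10, 29, 26, 22)
groups <- factor(c("U200", "U200", "U200", "U600", "U600", "U600"))

# Assumption: fold change = ratio of group means, larger over smaller
group_means <- tapply(values, groups, mean)
group_pairs <- combn(names(group_means), 2)      # all possible group pairs
fold_changes <- apply(group_pairs, 2, function(p) {
  m <- group_means[p]
  max(m) / min(m)
})
max_fc <- max(fold_changes)

# Textbook Cohen's f for one-way ANOVA (assumption, may differ from selfea)
n_per_group <- tapply(values, groups, length)
ss_between  <- sum(n_per_group * (group_means - mean(values))^2)
ss_total    <- sum((values - mean(values))^2)
eta_sq      <- ss_between / ss_total
cohens_f2   <- eta_sq / (1 - eta_sq)
cohens_f    <- sqrt(cohens_f2)

print(c(max_fc = max_fc, cohens_f = cohens_f, cohens_f2 = cohens_f2))
```

With more than two groups, combn enumerates every pair, so max_fc is the largest of all pairwise ratios.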

library(selfea)

## Test selfea for single protein expression
values <- c(6,8,10,29,26,22)
groups <- c("U200","U200","U200","U600","U600","U600")
experiments <- c("exp1","exp2","exp3","exp4","exp5","exp6")

df_expr <- data.frame(ID="Protein_1",exp1=6,exp2=8,exp3=10,exp4=29,exp5=26,exp6=22)
df_group <- data.frame(Col_Name=experiments,Group=groups)
list_result <- get_statistics_from_dataFrame(df_expr,df_group)
top_table(list_result)

## For this example we will import Gregori data
## Josep Gregori, Laura Villareal, Alex Sanchez, Jose Baselga, Josep Villanueva (2013).
## An Effect Size Filter Improves the Reproducibility
## in Spectral Counting-based Comparative Proteomics.
## Journal of Proteomics, DOI http://dx.doi.org/10.1016/j.jprot.2013.05.030

## Description:
## Each sample consists of 500ng of standard yeast lysate spiked with
## 100, 200, 400 and 600fm of a mix of 48 equimolar human proteins (UPS1, Sigma-Aldrich).
## The dataset contains a different number of technical replicates of each sample

## import Gregori data
data(example_data1)
df_contrast <- example_data
df_group <- example_group

## Get statistics through 'get_statistics_from_dataFrame' function
list_result <- get_statistics_from_dataFrame(df_contrast,df_group)

## Get significant features (alpha = 0.05 and power >= 0.90)
significant_qpf <- top_table(list_result,pvalue=0.05,power_desired=0.90,method='QPF')