View source: R/TwoPart_MultiMS.R
prot_level_multiMat_PresAbs | R Documentation |
Multi-Matrix Presence Absence Analysis computes Model-Based statistics for each dataset and sums them up to produce the final statistic. The significance is determined via a permutation test which computes the same statistics and sums them after permuting the values across treatment groups, as is outlined in Karpievitch et al. 2018. Whenever possible proteins should be analysed using the Model-Based Differential Expression Analysis due to higher statistical power over the Presence Absence analysis.
prot_level_multiMat_PresAbs( mm_list, treat, prot.info, prot_col_name, nperm = 500, dataset_suffix )
mm_list |
list of matrices of intensities for each experiment, dimentions: numpeptides x numsamples |
treat |
list of data frames with treatment information to compute the statistic, parallel to mm_list and prot.info |
prot.info |
list of protein metadata for each matrix in mm_list, data.frame parallel to mm_list and treat |
prot_col_name |
column names present in all datasets that identifies protein IDs across all datasets |
nperm |
number of permutations |
dataset_suffix |
a list of strings that will be appended to the column names for FC, PV, BHPV and numebers of peptides |
a data frame with the following columns:
protein metadata, peptide sequence if was passed in as one of the columns is the first peptide equence encountered in the data for that protein
Avegares across all datasets of the approximation of the fold change computed as percent missing observations group 1 munis in percent missing observations group 2 in peptideLevel_PresAbsDE() function
p-value for the comparison between 2 groups (2 groups only here) obtained from a permutation test
Benjamini-Hochberg adjusted p-values
statistic returned by the g-test and summed across all datasets, not very useful as depends on the direction of the test and can produce all 0's
column containing ptoein identifiers across all datasets
Approximation of the fold change computed as percent missing observations group 1 munis in percent missing observations group 2 in peptideLevel_PresAbsDE() function
p-values produced by g-test for individual datasets
adjusted p-values produced by g-test for individual datasets
number of peptides observed for each protein in each of the datasets
# Load mouse dataset data(mm_peptides) head(mm_peptides) intsCols = 8:13 metaCols = 1:7 m_logInts = make_intencities(mm_peptides, intsCols) # will reuse the name m_prot.info = make_meta(mm_peptides, metaCols) m_logInts = convert_log2(m_logInts) grps = as.factor(c('CG','CG','CG', 'mCG','mCG','mCG')) set.seed(135) mm_m_ints_eig1 = eig_norm1(m=m_logInts,treatment=grps,prot.info=m_prot.info) mm_m_ints_eig1$h.c # check the number of bias trends detected mm_m_ints_norm = eig_norm2(rv=mm_m_ints_eig1) # Load human dataset data(hs_peptides) head(hs_peptides) intsCols = 8:13 metaCols = 1:7 m_logInts = make_intencities(hs_peptides, intsCols) m_prot.info = make_meta(hs_peptides, metaCols) m_logInts = convert_log2(m_logInts) grps = as.factor(c('CG','CG','CG', 'mCG','mCG','mCG')) set.seed(137) hs_m_ints_eig1 = eig_norm1(m=m_logInts,treatment=grps,prot.info=m_prot.info) hs_m_ints_eig1$h.c # check the number of bias trends detected hs_m_ints_norm = eig_norm2(rv=hs_m_ints_eig1) # Set up for presence/absence analysis raw_list = list() norm_imp_prot.info_list = list() raw_list[[1]] = mm_m_ints_eig1$m raw_list[[2]] = hs_m_ints_eig1$m norm_imp_prot.info_list[[1]] = mm_m_ints_eig1$prot.info norm_imp_prot.info_list[[2]] = hs_m_ints_eig1$prot.info protnames_norm_list = list() protnames_norm_list[[1]] = unique(mm_m_ints_norm$normalized$MatchedID) protnames_norm_list[[2]] = unique(hs_m_ints_norm$normalized$MatchedID) presAbs_dd = get_presAbs_prots(mm_list=raw_list, prot.info=norm_imp_prot.info_list, protnames_norm=protnames_norm_list, prot_col_name=2) ints_presAbs = list() protmeta_presAbs = list() ints_presAbs[[1]] = presAbs_dd[[1]][[1]] # Mouse ints_presAbs[[2]] = presAbs_dd[[1]][[2]] # HS protmeta_presAbs[[1]] = presAbs_dd[[2]][[1]] protmeta_presAbs[[2]] = presAbs_dd[[2]][[2]] treats = list() treats[[1]] = grps treats[[2]] = grps subset_presAbs = subset_proteins(mm_list=ints_presAbs, prot.info=protmeta_presAbs, 'MatchedID') nperm = 50 # set to 500+ for publication set.seed(275937) presAbs_comb = prot_level_multiMat_PresAbs( mm_list=subset_presAbs$sub_mm_list, treat=treats, prot.info=subset_presAbs$sub_prot.info, prot_col_name='MatchedID', nperm=nperm, dataset_suffix=c('MM', 'HS') ) plot_volcano(presAbs_comb$FC, presAbs_comb$BH_P_val, FC_cutoff=.5, PV_cutoff=.05, 'Combined Pres/Abs CG vs mCG')
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.