pmm: Fitting the PMM
In pmm: Parallel Mixed Model

Fits the parallel mixed model.

pmm(df.data, response, weight = "None", ignore = 3, simplify = TRUE,
    gene.col = "GeneID", condition.col = "condition")

`df.data`	a data frame containing the variables for the model. Each row should correspond to one independent siRNA experiment. The data frame needs to have at least the following variables: GeneID, condition and a column with the measurements/readouts of the screens.
`response`	name of the column that contains the measurements/readouts of the screens.
`weight`	optional weights to be used in the fitting process of the linear mixed model. The weights must be numeric; in the examples they are given as the name of a column of df.data (e.g. "weight_library"). The default, "None", fits the model without weights.
`ignore`	minimum number of siRNA replicates required for each gene. A gene with fewer siRNA replicates is ignored during the fitting process. Default is 3.
`simplify`	logical value that indicates whether the output of pmm should be simplified. Default is TRUE.
`gene.col`	name of the column that gives the gene identifier. Default is "GeneID".
`condition.col`	name of the column that indicates the condition that was used for each measurement. Default is "condition".

The Parallel Mixed Model (PMM) is composed of a linear mixed model and an assessment of the local false discovery rate (FDR). The linear mixed model consists of a fixed effect for condition and two random effects: one for gene g and one for gene g within condition c. The model is fitted with the lmer function from the R package lme4. To distinguish hit genes, PMM also provides an estimate of the local FDR. pmm only uses the data of genes that have at least a certain number of siRNA replicates per condition. This minimum number of replicates is passed to pmm through the argument ignore; genes with fewer replicates are ignored. We recommend using at least 3 siRNA replicates per gene and condition in order to obtain a reliable fit.
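
The replicate filter described above can be sketched in base R. This is an illustration under assumptions, not the code that pmm runs: filter_genes is a hypothetical helper, the toy data are made up, and the rule used here (a gene is dropped as soon as one condition has fewer than ignore non-missing measurements) is an assumed reading of the text.

```r
# Hypothetical helper, not part of the pmm package: keep only genes that have
# at least `ignore` non-missing measurements in every condition (assumed rule).
filter_genes <- function(df, response, ignore = 3,
                         gene.col = "GeneID", condition.col = "condition") {
  ok <- !is.na(df[[response]])
  # Replicate counts per gene (rows) and condition (columns), NAs excluded
  counts <- table(df[[gene.col]][ok], df[[condition.col]][ok])
  # Genes whose count reaches `ignore` in all conditions
  keep <- rownames(counts)[apply(counts >= ignore, 1, all)]
  df[as.character(df[[gene.col]]) %in% keep, , drop = FALSE]
}

# Toy data: gene "A" has 3 replicates in both conditions,
# gene "B" has 3 replicates in c1 but only 2 in c2.
toy <- data.frame(
  GeneID    = c(rep("A", 6), rep("B", 5)),
  condition = c(rep(c("c1", "c2"), each = 3), rep("c1", 3), rep("c2", 2)),
  readout   = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.1, 1.2, 1.3, 1.4, 1.5)
)

filtered <- filter_genes(toy, "readout", ignore = 3)
unique(filtered$GeneID)  # gene "B" is dropped, only gene "A" remains
```

Counting only non-missing measurements is a design choice made for this sketch, so that replicates set to NA do not count toward the minimum.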

The simplified output of pmm is a matrix that contains the c_cg effects for each condition c and gene g, as well as an estimate for the local false discovery rate. A positive estimated c_cg effect means that the response was enhanced when the corresponding gene is knocked down. A negative effect means that the response was reduced.
The non-simplified output of pmm is a list of three components. The first component contains the simplified output, i.e. the matrix with the c_cg effects and FDR values; the second component contains the fit of the linear mixed model; and the third component contains the a_g and b_cg values.
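
The effects named here can be related by writing the model out. The following is a sketch under assumptions: the symbols Y_gcs (readout of siRNA replicate s for gene g in condition c), mu_c (fixed effect of condition c) and epsilon_gcs (error term) are introduced for illustration only, and the relation c_cg = a_g + b_cg is assumed rather than stated on this page.

```latex
Y_{gcs} = \mu_c + a_g + b_{cg} + \varepsilon_{gcs},
\qquad
c_{cg} = a_g + b_{cg}
```

Under this reading, a_g is the random effect of gene g shared across all conditions, b_cg is the random effect of gene g within condition c, and c_cg is their sum.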

Anna Drewek <adrewek@stat.math.ethz.ch>

 library(pmm)
 data(kinome)

 ## Fitting the parallel mixed model with weights
 fit1 <- pmm(kinome, "InfectionIndex", "weight_library")
 head(fit1)

 ## Fitting the parallel mixed model without weights
 fit2 <- pmm(kinome, "InfectionIndex", "None")
 head(fit2)

 ## Accessing the fit of the linear mixed model
 fit3 <- pmm(kinome, "InfectionIndex", "weight_library", simplify = FALSE)
 identical(fit1, fit3[[1]])  # first component is the simplified output
 summary(fit3[[2]])          # second component is the linear mixed model fit

 ## NA-Handling
 ## Set all measurements of gene 10000 in condition ADENO to NA
 kinome$InfectionIndex[kinome$GeneID == 10000 &
                       kinome$condition == "ADENO"] <- rep(NA, 12)
 fit4 <- pmm(kinome, "InfectionIndex", "weight_library", ignore = 3)
 head(fit4)