Description Usage Arguments Value Examples
Generalized linear model for four-parameter beta-Possion model
1 2 3 |
data |
Input data matrix for differential expression analysis |
controlIds |
Indicies of control group |
design |
Design matrix for glm fitting |
coef |
An integer to point out the column index corresponding to the coefficient for the GLM model testing. This should be specified by users. Default value is 2 |
keepFit |
An option to keep fit results when the input dataset is small (keepFit=TRUE). Otherwise, keepFit=FALSE by default |
minExp |
A threshold for minimum expressed values. If a expression is less than minExp, it is set to be zero |
tbreak.num |
Number of breaks for binning |
break.thres |
The threshold to decide whether use breaks from logs cale or not. We decide to use breaks from log scale if 75 percent of input data less than this threshold |
estIntPar |
An option to allow estimating initial parameters for the model from only expressed values |
useExt |
A parameter setting of getTbreak function that allows to extend the last bin to infinity or not |
extreme.quant |
A quantile probability to remove extrem values (outliers) higher than the quantile. If extreme.quant=NULL, no elimination of outliers is done |
useParallel |
An option to allow using parallel computing |
A list of p-values, statistics, indicies, a list of fitting models (if keepFit=TRUE), etc
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 | set.seed(2015)
###Generate a random data matrix from a beta-poisson model
#Set the number of genes
N=100
#Generate randomly the parameters of BP models
alp=sample(100,N,replace=TRUE)*0.1;
bet=sample(100,N,replace=TRUE)*0.1;
lam1=sample(100,N,replace=TRUE)*10;
lam2=sample(100,N,replace=TRUE)*0.01
#Generate a control group
n1=100
control.mat=NULL
for (i in 1:N) control.mat=rbind(control.mat,rBP(n1,alp=alp[i],
bet=bet[i],lam1=lam1[i],lam2=lam2[i]))
#To create biological variation, we randomly set 10% as differentially expressed genes
#by simply replacing the parameter lam1 in treated group by a fold-change fc
DE.ids= sample(N,N*0.1)
fc=2.0
lam1[DE.ids]=lam1[DE.ids] * fc
#Generate a treated group
n2=100
treated.mat=NULL
for (i in 1:N)treated.mat=rbind(treated.mat,rBP(n2,alp=alp[i],
bet=bet[i],lam1=lam1[i],lam2=lam2[i]))
#Create a data set by merging the control group and the treated group
bp.mat=cbind(control.mat,treated.mat)
rownames(bp.mat)=c(1:nrow(bp.mat));
colnames(bp.mat)=c(1:ncol(bp.mat))
group=c(rep(1,ncol(control.mat)),rep(2,ncol(treated.mat)))
#First, choose IDs of all cells of the control group for estimating parameters of BP models
controlIds=which(group==1)
#Create a design matrix including the group labels.
#All batch effects can be also added here if they are available
design=model.matrix(~group)
#Select the column in the design matrix corresponding to
#the coefficient (the group label) for the GLM model testing
coef=2
#Run BPglm for differential expression analysis
res=BPglm(data=bp.mat, controlIds=controlIds, design=design, coef=coef, estIntPar=FALSE)
#In this function, user can also set estIntPar=TRUE to have better estimated beta-Poisson
#models for the generalized linear model. However, a longer computational time is required.
#Plot the p-value distribution
hist(res$PVAL, breaks=20)
#Summarize the resutls
ss=summary(res)
#Compare the discovered DE genes and the true DE genes predefined beforeward
fdr=p.adjust(res$PVAL, method="BH")
bpglm.DE.ids=which(fdr<=0.05)
#Print the indices of the true DE genes:
cat(sort(DE.ids))
#Print the indices of the DE genes discovered by BPglm:
cat(sort(bpglm.DE.ids))
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.