Yeast cell-cycle gene expression data

Description

A yeast cell-cycle gene expression data set collected in the CDC15 experiment of Spellman et al. (1998) where genome-wide mRNA levels of 6178 yeast open reading frames (ORFs) in a two cell-cycle period were measured at M/G1-G1-S-G2-M stages. However, to better understand the phenomenon underlying cell-cycle process, it is important to identify transcription factors (TFs) that regulate the gene expression levels of cell cycle-regulated genes. In this study, we presented a subset of 283 cell-cycled-regularized genes observed over 4 time points at G1 stage and the standardized binding probabilities of a total of 96 TFs obtained from a mixture model approach of Wang et al. (2007) based on the ChIP data of Lee et al. (2002).

Usage

1
data("yeastG1")

Details

A data frame with 1132 observations (283 cell-cycled-regularized genes observed over 4 time points) with 99 variables (e.g., id, y, time, and 96 TFs).

References

Lee, T.I., Rinaldi, N.J., Robert, F., Odom, D.T., Bar-Joseph, Z., Gerber, G.K., Hannett, N.M., Harbison, C.T., Thompson, C.M., Simon, I., et al. (2002). Transcriptional regulatory networks in Saccharomyces cerevisiae. Science, 298, 799–804.

Spellman, P.T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D., and Futcher, B. (1998). Comprehensive identification of cell cycle regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of Cell, 9, 3273–3297.

Wang, L., Chen, G., and Li, H. (2007). Group SCAD regression analysis for microarray time course gene expression data. Bioinformatics, 23, 1486–1494.

Wang, L., Zhou, J., and Qu, A. (2012). Penalized generalized estimating equations for high-dimensional longitudinal data anaysis. Biometrics, 68, 353–360.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
## Not run: 
library(PGEE)
# load data
data(yeastG1)
data <- yeastG1
# get the column names
colnames(data)[1:9]
# see some portion of yeast G1 data
head(data,5)[1:9]

# define the input arguments
formula <- "y ~.-id"
family <- gaussian(link = "identity")
lambda.vec <- seq(0.01,0.2,0.01)
# find the optimum lambda
cv <- CVfit(formula = formula, id = data[,1], data = data, family = family, scale.fix = TRUE,
scale.value = 1, fold = 4, lambda.vec = lambda.vec, pindex = c(1,2), eps = 10^-6,
maxiter = 30, tol = 10^-6)
# print the results
print(cv)

# see the returned values by CVfit
names(cv)
# get the optimum lambda
cv$lam.opt

#fit the PGEE model
myfit1 <- PGEE(formula = formula, id = data[,1], data = data, na.action = NULL,
family = family, corstr = "independence", Mv = NULL,
beta_int = c(rep(0,dim(data)[2]-1)), R = NULL, scale.fix = TRUE,
scale.value = 1, lambda = cv$lam.opt, pindex = c(1,2), eps = 10^-6, 
maxiter = 30, tol = 10^-6, silent = FALSE)

# get the values returned by myfit object
names(myfit1)
# get the values returned by summary(myfit) object
names(summary(myfit1))
# see a portion of the results returned by summary(myfit1)
# $coefficients
head(summary(myfit1)$coefficients,7)

# see the variables which have non-zero coefficients
index1 <- which(abs(summary(myfit1)$coef[,"Estimate"]) > 10^-3)
index1

# see the PGEE summary statistics of these non-zero variables
summary(myfit1)$coef[index1,]

# fit the GEE model
myfit2 <- MGEE(formula = formula, id = data[,1], data = data, na.action = NULL,
family = family, corstr = "independence", Mv = NULL,
beta_int = c(rep(0,dim(data)[2]-1)), R = NULL, scale.fix = TRUE,
scale.value = 1, maxiter = 30, tol = 10^-6, silent = FALSE)

# get the GEE summary statistics of the variables that turned out to be
#non-zero in PGEE analysis
summary(myfit2)$coef[index1,]

# see the significantly associated TFs in PGEE analysis
which(abs(summary(myfit1)$coef[index1,][,"Robust z"]) > 1.96)

# see the significantly associated TFs in both PGEE and GEE analyses
which(abs(summary(myfit2)$coef[index1,][,"Robust z"]) > 1.96)


## End(Not run)