glmParallel | R Documentation |
This is a non-user function that is managed by RegParallel, the primary function.
glmParallel(
  data,
  formula.list,
  FUN,
  variables,
  terms,
  startIndex,
  blocksize,
  blocks,
  APPLYFUN,
  conflevel,
  excludeTerms,
  excludeIntercept)
data |
A data frame that contains all model terms to be tested. Variables that are all zeros will be removed automatically. REQUIRED. |
formula.list |
A list containing formulae that can be coerced to formula class via as.formula(). REQUIRED. |
FUN |
Regression function. Must take a formula and a data argument, for example: function(formula, data) glm(formula = formula, family = binomial, data = data). REQUIRED. |
variables |
Vector of variable names in data to be tested independently. Each variable will have its own formula in formula.list. REQUIRED. |
terms |
Vector of terms used in the formulae in formula.list, excluding the primary variable of interest. REQUIRED. |
startIndex |
Starting column index in data object from which processing can commence. REQUIRED. |
blocksize |
Number of variables to test in each foreach loop. REQUIRED. |
blocks |
Total number of blocks required to complete analysis. REQUIRED. |
APPLYFUN |
The apply function to be used within each block during processing. Will be one of: 'mclapply(...)' when system=linux/mac and nestedParallel=TRUE; 'parLapply(cl, ...)' when system=windows and nestedParallel=TRUE; 'lapply(...)' when nestedParallel=FALSE. REQUIRED. |
conflevel |
Confidence level for calculating odds or hazard ratios, given as a percentage (the examples use 99 and 90). REQUIRED. |
excludeTerms |
Terms to remove from the final output. Matching terms are simply grepped out. REQUIRED. |
excludeIntercept |
Remove intercept terms from the final output. REQUIRED. |
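The way formula.list, FUN, conflevel and excludeIntercept fit together can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the toy data, the helper fitOne, and the output column names other than Variable are hypothetical, and the confidence interval uses confint.default as in the spot checks in the examples.

```r
# Hypothetical illustration only; not the internal code of glmParallel.
set.seed(1)
toy <- data.frame(
  group = factor(rep(c('control', 'treatment'), each = 10)),
  gene1 = rexp(20, rate = 0.1),
  gene2 = rexp(20, rate = 0.1),
  dosage = rexp(20, rate = 1))

variables <- c('gene1', 'gene2')   # tested independently, one model each
terms <- 'dosage'                  # other terms shared by every formula
conflevel <- 95                    # percentage, as in the examples
excludeIntercept <- TRUE

# One formula string per variable; each is coercible via as.formula()
formula.list <- lapply(variables, function(v)
  paste('group ~', paste(c(v, terms), collapse = ' + ')))

# Regression function in the documented form
FUN <- function(formula, data)
  glm(formula = formula, family = binomial, data = data)

# Hypothetical helper: fit one model and tabulate its coefficients
fitOne <- function(f, variable) {
  m <- FUN(formula = as.formula(f), data = toy)
  coefs <- summary(m)$coefficients
  ci <- confint.default(m, level = conflevel / 100)
  out <- data.frame(
    Variable = variable,
    Term = rownames(coefs),
    Beta = coefs[, 'Estimate'],
    P = coefs[, 'Pr(>|z|)'],
    OR = exp(coefs[, 'Estimate']),
    ORlower = exp(ci[, 1]),
    ORupper = exp(ci[, 2]),
    row.names = NULL)
  # Drop the intercept row when requested
  if (excludeIntercept) out <- out[out$Term != '(Intercept)', ]
  out
}

res <- do.call(rbind, Map(fitOne, formula.list, variables))
```

Each variable contributes one row per remaining model term, so res here holds two rows for gene1 and two for gene2.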
A data.table object.
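The relationship between blocksize, blocks and APPLYFUN can be sketched as follows. This is only an illustration: it assumes that blocks is the number of blocks needed to cover every variable, it uses a plain lapply in place of the foreach loop mentioned under blocksize, and it stands in lapply for APPLYFUN (the nestedParallel=FALSE case). The real bookkeeping is done by RegParallel.

```r
# Illustrative only: RegParallel computes and passes these values itself.
variables <- paste0('gene', 1:1997)  # columns 4 to 2000, as in the first example
blocksize <- 700

# Assumption: enough blocks to cover every variable
blocks <- ceiling(length(variables) / blocksize)

# Stand-in for the nestedParallel = FALSE case
APPLYFUN <- lapply

perBlock <- lapply(seq_len(blocks), function(b) {
  first <- (b - 1) * blocksize + 1
  last <- min(b * blocksize, length(variables))
  # Within a block, APPLYFUN visits each variable in turn;
  # here it just returns the name instead of fitting a model
  APPLYFUN(variables[first:last], function(v) v)
})

# Number of variables handled in each block
lengths(perBlock)
```

Under these assumptions the last block is smaller than blocksize whenever the number of variables is not an exact multiple of it.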
Kevin Blighe <kevin@clinicalbioinformatics.co.uk>
options(scipen=10)
options(digits=6)
col <- 20000
row <- 20
mat <- matrix(
  rexp(col*row, rate = .1),
  ncol = col)
colnames(mat) <- paste0('gene', 1:ncol(mat))
rownames(mat) <- paste0('sample', 1:nrow(mat))
modelling <- data.frame(
  cell = rep(c('B', 'T'), nrow(mat) / 2),
  group = c(rep(c('treatment'), nrow(mat) / 2), rep(c('control'), nrow(mat) / 2)),
  dosage = t(data.frame(matrix(rexp(row, rate = 1), ncol = row))),
  mat,
  row.names = rownames(mat))
data <- modelling[,1:2000]
variables <- colnames(data)[4:ncol(data)]
res1 <- RegParallel(
  data = data,
  formula = 'factor(group) ~ [*] + (cell:dosage) ^ 2',
  FUN = function(formula, data)
    glm(formula = formula,
      data = data,
      family = binomial(link = 'logit'),
      method = 'glm.fit'),
  FUNtype = 'glm',
  variables = variables,
  blocksize = 700,
  cores = 2,
  nestedParallel = TRUE,
  p.adjust = "none",
  conflevel = 99,
  excludeTerms = NULL,
  excludeIntercept = TRUE
)
# spot checks
m <- glm(factor(group) ~ gene265 + (cell:dosage) ^ 2, data=data, family=binomial)
summary(m)$coefficients
exp(cbind("Odds ratio" = coef(m), confint.default(m, level = 0.99)))
res1[which(res1$Variable == 'gene265'),]
m <- glm(factor(group) ~ gene1688 + (cell:dosage) ^ 2, data=data, family=binomial)
summary(m)$coefficients
exp(cbind("Odds ratio" = coef(m), confint.default(m, level = 0.99)))
res1[which(res1$Variable == 'gene1688'),]
###
data <- modelling[,1:500]
variables <- colnames(data)[4:ncol(data)]
res2 <- RegParallel(
  data = data,
  formula = '[*] ~ cell:dosage',
  FUN = function(formula, data)
    glm(formula = formula,
      data = data,
      family = gaussian,
      method = 'glm.fit'),
  FUNtype = 'glm',
  variables = variables,
  blocksize = 496,
  cores = 2,
  nestedParallel = TRUE,
  p.adjust = "none",
  conflevel = 90,
  excludeTerms = NULL,
  excludeIntercept = FALSE
)
# spot checks
m <- glm(gene29 ~ cell:dosage, data=data, family=gaussian)
summary(m)$coefficients
exp(cbind("Odds ratio" = coef(m), confint.default(m, level = 0.90)))
res2[which(res2$Variable == 'gene29'),]
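The spot checks above fit the same model that the '[*]' placeholder stands for once a variable name is substituted in. A minimal sketch of that expansion is shown here; the use of sub() with fixed = TRUE is an assumption made for illustration and is not necessarily how RegParallel builds formula.list.

```r
# Illustrative expansion of the '[*]' placeholder; the mechanism is assumed.
template <- 'factor(group) ~ [*] + (cell:dosage) ^ 2'
variables <- c('gene265', 'gene1688')

# fixed = TRUE treats '[*]' as a literal string, not a regular expression
formula.list <- lapply(variables, function(v)
  sub('[*]', v, template, fixed = TRUE))

# Each element can be coerced to formula class via as.formula()
formulae <- lapply(formula.list, as.formula)
```

The first element is the string used in the gene265 spot check, 'factor(group) ~ gene265 + (cell:dosage) ^ 2'.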