View source: R/regenrichClasses.R
This is the 'RegenrichSet' object creator function.
There are four types of parameters in this function.
First, parameters to provide raw data and sample information;
'expr' and 'colData'.
Second, parameters to perform differential expression analysis;
'method', 'minMeanExpr', 'design', 'reduced', 'contrast',
'coef', 'name', 'fitType', 'sfType', 'betaPrior', 'minReplicatesForReplace',
'useT', 'minmu', 'parallel', 'BPPARAM' (also for network inference),
'altHypothesis', 'listValues', 'cooksCutoff', 'independentFiltering',
'alpha', 'filter', 'theta', 'filterFun', 'addMLE', 'blind', 'ndups',
'spacing', 'block', 'correlation', 'weights', 'proportion',
'stdev.coef.lim', 'trend', 'robust', and 'winsor.tail.p'.
Third, parameters to perform regulator-target network inference;
'reg', 'networkConstruction', 'topNetPercent', 'directed', 'rowSample',
'softPower', 'networkType', 'TOMDenom', 'RsquaredCut', 'edgeThreshold',
'K', 'nbTrees', 'importanceMeasure', 'trace',
'BPPARAM' (also for differential expression analysis), and 'minR'.
Fourth, parameters to perform enrichment analysis:
'enrichTest', 'namedScoresCutoffs', 'minSize', 'maxSize', 'pvalueCutoff',
'qvalueCutoff', 'regAltName', 'universe', and 'nperm'.
RegenrichSet(
expr,
colData,
rowData = NULL,
method = c("Wald_DESeq2", "LRT_DESeq2", "limma", "LRT_LM"),
minMeanExpr = NULL,
design,
reduced,
contrast,
coef = NULL,
name,
fitType = c("parametric", "local", "mean"),
sfType = c("ratio", "poscounts", "iterate"),
betaPrior,
minReplicatesForReplace = 7,
useT = FALSE,
minmu = 0.5,
parallel = FALSE,
BPPARAM = bpparam(),
altHypothesis = c("greaterAbs", "lessAbs", "greater", "less"),
listValues = c(1, -1),
cooksCutoff,
independentFiltering = TRUE,
alpha = 0.1,
filter,
theta,
filterFun,
addMLE = FALSE,
blind = FALSE,
ndups = 1,
spacing = 1,
block = NULL,
correlation,
weights = NULL,
proportion = 0.01,
stdev.coef.lim = c(0.1, 4),
trend = FALSE,
robust = FALSE,
winsor.tail.p = c(0.05, 0.1),
reg = TFs$TF_name,
networkConstruction = c("COEN", "GRN", "new"),
topNetPercent = 5,
directed = FALSE,
rowSample = FALSE,
softPower = NULL,
networkType = "unsigned",
TOMDenom = "min",
RsquaredCut = 0.85,
edgeThreshold = NULL,
K = "sqrt",
nbTrees = 1000,
importanceMeasure = "IncNodePurity",
trace = FALSE,
minR = 0.3,
enrichTest = c("FET", "GSEA"),
namedScoresCutoffs = 0.05,
minSize = 5,
maxSize = 5000,
pvalueCutoff = 0.05,
qvalueCutoff = 0.2,
regAltName = NULL,
universe = NULL,
nperm = 10000
)
expr |
matrix or data.frame, expression profile of a set of
genes or a set of proteins. If the |
colData |
data frame, sample phenotype data. The rows of colData must correspond to the columns of expr. |
rowData |
NULL or data frame, information of each row/gene. Default is NULL, which will generate a DataFrame of three columns, i.e., "gene", "p", and "logFC". |
method |
either 'Wald_DESeq2', 'LRT_DESeq2', 'limma', or 'LRT_LM' for the differential expression analysis.
|
minMeanExpr |
numeric, the cutoff of gene average expression for pre-filtering. The rows of 'expr' with average expression < minMeanExpr are removed. The higher 'minMeanExpr' is, the more genes are excluded from testing. |
design |
either model formula or model matrix. For method = 'LRT_DESeq2' or 'LRT_LM', the design is the full model formula/matrix. For method = 'limma', and if design is a formula, the model matrix is constructed using model.matrix(design, colData), so the name of each term in the design formula must be included in the column names of 'colData'. |
reduced |
The argument is used only when method = 'LRT_DESeq2' or 'LRT_LM'. It is a reduced formula/matrix to compare against. If the design is a model matrix, 'reduced' must also be a model matrix. |
contrast |
The argument is used only when method = 'LRT_DESeq2',
'Wald_DESeq2', or 'limma'.
When method = 'limma', it can be one of the following two formats:
|
coef |
The argument is used only when method = 'limma'. (Vector of) column number or column name specifying which coefficient or contrast of the linear model is of interest. Default is NULL. |
name |
The argument is used only when method = 'LRT_DESeq2' or 'Wald_DESeq2'.
The name of the individual effect (coefficient) for building a results table.
Use this argument rather than contrast for continuous variables, individual
effects or for individual interaction terms. The value provided to name must
be an element of |
fitType |
either 'parametric', 'local', or 'mean' for the type of fitting
of dispersions to the mean intensity. This argument is used only when
method = 'Wald_DESeq2' or 'LRT_DESeq2'. See |
sfType |
either 'ratio', 'poscounts', or 'iterate' for the type of size
factor estimation. This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
betaPrior |
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
minReplicatesForReplace |
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
useT |
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
minmu |
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
parallel |
logical, whether the computation (only for differential analysis with method = "Wald_DESeq2" or "LRT_DESeq2") is run in parallel. Default is FALSE. |
BPPARAM |
parameters for parallel computing (default is
|
altHypothesis |
= c('greaterAbs', 'lessAbs', 'greater', 'less').
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
listValues |
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
cooksCutoff |
threshold on Cook's distance, such that if one or more
samples for a row have a distance higher, the p-value for the row is
set to NA.
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
independentFiltering |
logical, whether independent filtering should be
applied automatically. This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
alpha |
the significance cutoff used for optimizing the independent filtering.
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
filter |
the vector of filter statistics over which the independent
filtering is optimized. By default the mean of normalized counts is used.
This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
theta |
the quantiles at which to assess the number of rejections from
independent filtering. This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
filterFun |
an optional custom function for performing independent filtering
and p-value adjustment. This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
addMLE |
if betaPrior=TRUE was used, whether the 'unshrunken' maximum
likelihood estimates (MLE) of log2 fold change should be added as a column
to the results table. This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
blind |
logical, whether to blind the transformation to the experimental
design. This argument is used only when method = either
'Wald_DESeq2' or 'LRT_DESeq2'. See |
ndups |
positive integer giving the number of times each distinct probe is
printed on each array. This argument is used only when method = 'limma'.
See |
spacing |
positive integer giving the spacing between duplicate occurrences of
the same probe, spacing=1 for consecutive rows. This argument is used only
when method = 'limma'. See |
block |
vector or factor specifying a blocking variable on the arrays.
Has length equal to the number of arrays. Must be NULL if ndups > 2.
This argument is used only when method = 'limma'. See |
correlation |
the inter-duplicate or inter-technical replicate
correlation.
The correlation value should be estimated using the
|
weights |
non-negative precision weights. Can be a numeric matrix of
individual weights of same size as the object expression matrix, or a
numeric vector of array weights with length equal to ncol of the expression
matrix, or a numeric vector of gene weights with length equal to nrow of the
expression matrix. This argument is used only when method = 'limma' or 'LRT_LM'.
See |
proportion |
numeric value between 0 and 1, assumed proportion of genes which
are differentially expressed. This argument is used only when method = 'limma'.
See |
stdev.coef.lim |
numeric vector of length 2, assumed lower and upper limits
for the standard deviation of log2-fold-changes for differentially expressed
genes. This argument is used only when method = 'limma'.
See |
trend |
logical, should an intensity-trend be allowed for the prior variance?
This argument is used only when method = 'limma'. See |
robust |
logical, should the estimation of df.prior and var.prior be
robustified against outlier sample variances? This argument is used only
when method = 'limma'. See |
winsor.tail.p |
numeric vector of length 1 or 2, giving left and right
tail proportions of x to Winsorize. Used only when method = 'limma' and
robust=TRUE. See |
reg |
a vector of regulator names (IDs). By default, these are transcription (co-)factors defined by three publications/databases, namely RegNet, TRRUST, and Marbach2016. The ID type of these regulators (for example ENSEMBL gene ID, Entrez gene ID, or gene symbol/name) must match the ID type used in the regulator-target network. |
networkConstruction |
the method to construct this network.
Possible values are: |
topNetPercent |
numeric, the percentage of the top edges in the full network that are retained. Default is 5, meaning the top 5% of edges. This value must be between 0 and 100. |
directed |
logical, whether the network is directed. Default is FALSE. |
rowSample |
logical, if TRUE, each row represents a sample. Otherwise, each column represents a sample. Default is FALSE. |
softPower |
numeric, a soft power to achieve scale free topology.
If not provided, the parameter will be picked automatically by
|
networkType |
network type. Allowed values are (unique abbreviations of)
'unsigned' (default), 'signed', 'signed hybrid'.
See |
TOMDenom |
a character string specifying the TOM variant to be used. Recognized values are 'min' giving the standard TOM described in Zhang and Horvath (2005), and 'mean' in which the min function in the denominator is replaced by mean. The 'mean' may produce better results but at this time should be considered experimental. |
RsquaredCut |
desired minimum scale free topology fitting index R^2. Default is 0.85. |
edgeThreshold |
numeric, the threshold below which low-weight edges are removed. Default is NULL, which means no edges will be removed. |
K |
integer or character. The number of features in each tree, which can be either an integer, 'sqrt', or 'all'. 'sqrt' denotes sqrt(the number of 'reg'), 'all' means the number of 'reg'. Default is 'sqrt'. |
nbTrees |
integer. The number of trees. Default is 1000. |
importanceMeasure |
character. importanceMeasure can be '%IncMSE'
or 'IncNodePurity', corresponding to type = 1 and 2 in
|
trace |
logical, whether to show the progress. Default is FALSE. |
minR |
numeric. The minimum correlation coefficient of prediction, used to control model accuracy. Default is 0.3. |
enrichTest |
character, specifying the enrichment analysis method, which is either 'FET' (Fisher's exact test) or 'GSEA' (gene set enrichment analysis). |
namedScoresCutoffs |
numeric, the significance cutoff for the differential analysis p value. Default is 0.05. |
minSize |
The minimum number (default 5) of target genes. |
maxSize |
The maximum number (default 5000) of target genes. |
pvalueCutoff |
numeric, the significance cutoff for adjusted enrichment p value. This is used for obtaining the 'topResult' slot in the final 'Enrich' object. Default is 0.05. |
qvalueCutoff |
numeric, the significance cutoff of enrichment q-value. Default is 0.2. |
regAltName |
alternative name for regulator. Default is NULL. |
universe |
a character vector of background target genes. |
nperm |
integer, number of permutations. The minimal possible nominal p-value is about 1/nperm. Default is 10000. |
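The 'colData' and 'minMeanExpr' descriptions above can be illustrated with a minimal base-R sketch. The toy matrix, the sample names, and the name-based check are assumptions made for illustration only; the pre-filter is written as a plain row-mean comparison, which is one reading of the description and not necessarily how RegenrichSet implements it.

```r
# Toy inputs (hypothetical): 4 genes x 3 samples, plus matching sample info.
expr <- matrix(c(0, 1, 2,
                 5, 6, 7,
                 0, 0, 1,
                 9, 8, 7),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))
colData <- data.frame(group = c("a", "a", "b"),
                      row.names = paste0("s", 1:3))

# The rows of colData must correspond to the columns of expr.
# Assumption: "correspond" is checked here by name and order.
stopifnot(identical(rownames(colData), colnames(expr)))

# Assumed form of the pre-filter: keep rows whose mean is >= minMeanExpr.
minMeanExpr <- 2
kept <- expr[rowMeans(expr) >= minMeanExpr, , drop = FALSE]
rownames(kept)
```

Raising the hypothetical cutoff removes more rows, which mirrors the statement that a higher 'minMeanExpr' leaves fewer genes for testing.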
An object of the RegenrichSet class.
library(RegEnrich)
data("Lyme_GSE63085")
data("TFs")
data = log2(Lyme_GSE63085$FPKM + 1)
colData = Lyme_GSE63085$sampleInfo
# Take first 2000 rows for example
data1 = data[seq(2000), ]
design = model.matrix(~0 + patientID + week, data = colData)
# Initializing a 'RegenrichSet' object
object = RegenrichSet(expr = data1,
                      colData = colData,
                      method = 'limma', minMeanExpr = 0,
                      design = design,
                      contrast = c(rep(0, ncol(design) - 1), 1),
                      networkConstruction = 'COEN',
                      enrichTest = 'FET')
object
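To see what the 'design' and 'contrast' arguments in the example look like, the following base-R sketch rebuilds them on a small made-up sample sheet. The data frame and its values are hypothetical (in particular, 'week' is assumed to be numeric here), and the reading of the numeric contrast as one weight per design column is an assumption rather than something stated on this page.

```r
# Hypothetical sample sheet: 2 patients, each measured at 2 time points.
colData <- data.frame(
  patientID = factor(c("P1", "P1", "P2", "P2")),
  week = c(0, 3, 0, 3)
)

# Same formula as in the example: no intercept, one column per patient,
# and a final numeric column for week.
design <- model.matrix(~0 + patientID + week, data = colData)
colnames(design)

# A vector of zeros with a single 1 in the last position puts all the
# weight on the last design column (here 'week').
contrast <- c(rep(0, ncol(design) - 1), 1)
names(contrast) <- colnames(design)
contrast
```

Naming the contrast vector is only a readability aid in this sketch; the example above passes it unnamed.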