more | R Documentation |
more
Fits a GLM regression model (when the selected method is GLM) or a PLS model (when the selected method is PLS) for all genes in the dataset to identify the potential regulators that show a significant impact on gene expression under specific experimental conditions.
more(
GeneExpression,
data.omics,
associations = NULL,
omic.type = NULL,
edesign = NULL,
clinic = NULL,
clinic.type = NULL,
center = TRUE,
scale = TRUE,
scaletype = "auto",
epsilon = 1e-05,
min.variation = 0,
interactions.reg = TRUE,
family.glm = gaussian(),
elasticnet.glm = NULL,
col.filter.glm = "cor",
correlation.glm = 0.7,
thres.isgl = 0.7,
gr.method.isgl = "cor",
alfa.pls = 0.05,
p.method.pls = "jack",
vip.pls = 0.8,
method = "glm"
)
GeneExpression |
Data frame containing gene expression data with genes in rows and experimental samples in columns. Row names must be the gene IDs. |
data.omics |
List where each element corresponds to a different omic data type to be considered (miRNAs, transcription factors, methylation, etc.). The names of the list will represent the omics, and each element in the list should be a data matrix with omic regulators in rows and samples in columns. |
associations |
List where each element corresponds to a different omic data type (miRNAs,
transcription factors, methylation, etc.). The names of the list will represent the omics. Each element in
the list should be a data frame with 2 columns (optionally 3), describing the potential interactions between genes
and regulators for that omic. First column must contain the genes (or features in
GeneExpression object), second column must contain the regulators, and an optional third column can
be added to describe the type of interaction (e.g., for methylation, if a CpG site is located in
the promoter region of the gene, in the first exon, etc.). If the user lacks prior knowledge of the potential regulators, they can set the parameter to NULL.
In this case, all regulators in |
edesign |
Data frame describing the experimental design. Rows must be the samples (columns
in |
clinic |
Data frame with all clinical variables to consider, with samples in rows and variables in columns. |
clinic.type |
Vector which indicates the type of data of variables introduced in |
center |
By default TRUE. It determines whether centering is applied to |
scale |
By default TRUE. It determines whether scaling is applied to |
scaletype |
Type of scaling to be applied. Three options:
considering m_b the number of variables of the block. By default, auto. |
epsilon |
Convergence threshold for coordinate descent algorithm in elasticnet. Default value, 1e-5. |
min.variation |
For numerical regulators, it specifies the minimum change required across conditions to retain the regulator in the regression models. In the case of binary regulators, if the proportion of the most common value is equal to or lower than this value, the regulator is considered to have low variation and will be excluded from the regression models. The user can set a single value to apply the same filter to all omics, provide a vector of the same length as omics to specify a different level for each omic, or use 'NA' to apply a minimum variation filter when uncertain about the threshold. By default, 0. |
interactions.reg |
If TRUE, the model includes interactions between regulators and experimental variables. By default, TRUE. |
family.glm |
Error distribution and link function to be used in the model when |
elasticnet.glm |
ElasticNet mixing parameter. There are three options:
By default, NULL. |
col.filter.glm |
Type of correlation coefficients to use when applying the multicollinearity filter when glm
|
correlation.glm |
Value to determine the presence of collinearity between two regulators when using the glm |
thres.isgl |
Threshold for the correlation when gr.method.isgl is 'cor' or threshold for the percentage of variability to explain when 'pca'. By default, 0.7. |
gr.method.isgl |
Grouping approach to create groups of variables in ISGL penalization. There are two options: 'cor' to cluster variables using correlations and 'pca' to use Principal Component Analysis approach. By default, 'cor'. |
alfa.pls |
Significance level for variable selection in pls1 and pls2 |
p.method.pls |
Type of resampling method to apply for the p-value calculation when pls1 or pls2
By default, jack. |
vip.pls |
Value of VIP above which a variable can be considered significant in addition to the computed p-value in |
method |
Model to be fitted. Four options:
By default, glm. |
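The input layouts described in the arguments above can be sketched with toy data. Everything in this block is hypothetical: the object names, the sample, gene and regulator IDs, the random values and the single `group` column in the design are illustrative and are not taken from the package's test data. The consistency checks at the end are a reasonable precaution under these assumptions, not a description of what `more` validates internally.

```r
set.seed(1)
samples <- paste0("S", 1:6)
genes   <- paste0("gene", 1:4)

# GeneExpression: genes in rows, samples in columns, row names are gene IDs.
gene_expr <- as.data.frame(
  matrix(rnorm(24), nrow = 4, dimnames = list(genes, samples))
)

# data.omics: named list, one matrix per omic,
# regulators in rows and samples in columns.
data_omics <- list(
  miRNA = matrix(rnorm(18), nrow = 3,
                 dimnames = list(paste0("mir", 1:3), samples)),
  TF    = matrix(rnorm(12), nrow = 2,
                 dimnames = list(c("tfA", "tfB"), samples))
)

# associations: named list, one data frame per omic,
# first column = gene, second column = regulator.
associations <- list(
  miRNA = data.frame(gene      = c("gene1", "gene2", "gene2"),
                     regulator = c("mir1", "mir1", "mir3")),
  TF    = data.frame(gene      = c("gene3", "gene4"),
                     regulator = c("tfA", "tfB"))
)

# edesign: samples in rows (the 'group' column is a made-up example).
edesign <- data.frame(group = rep(c("ctrl", "treat"), each = 3),
                      row.names = samples)

# Consistency checks between the inputs.
stopifnot(identical(names(data_omics), names(associations)))
for (om in names(data_omics)) {
  # Same samples, in the same order, as the expression data.
  stopifnot(identical(colnames(data_omics[[om]]), colnames(gene_expr)))
  # Every associated gene and regulator exists in the corresponding data.
  stopifnot(all(associations[[om]][[1]] %in% rownames(gene_expr)))
  stopifnot(all(associations[[om]][[2]] %in% rownames(data_omics[[om]])))
}
stopifnot(identical(rownames(edesign), colnames(gene_expr)))
```

Running checks of this kind before fitting makes mismatched sample names or regulators missing from an omic matrix fail early, with a clear error, instead of surfacing inside the model fit.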
List containing the following elements:
ResultsPerGene : List with as many elements as genes in GeneExpression. For each gene, it includes information about gene values, considered variables, estimated coefficients, detailed information about all regulators, and regulators identified as relevant (in the glm scenario) or significant (in the pls scenarios).
GlobalSummary : List with information about the fitted models, including model metrics, information about regulators, genes without models, regulators, master regulators and hub genes.
Arguments : List containing all the arguments used to generate the models.
data(TestData)

# Omic type, one entry per omic in data.omics
omic.type = c(1, 0, 0)
names(omic.type) = names(TestData$data.omics)

# Argument names follow the usage section (scaletype, elasticnet.glm)
SimGLM = more(GeneExpression = TestData$GeneExpressionDE,
              associations = TestData$associations,
              data.omics = TestData$data.omics,
              omic.type = omic.type,
              edesign = TestData$edesign,
              center = TRUE, scale = TRUE,
              scaletype = 'auto',
              epsilon = 0.00001, family.glm = gaussian(), elasticnet.glm = NULL,
              interactions.reg = TRUE, min.variation = 0, col.filter.glm = 'cor',
              correlation.glm = 0.7, method = 'glm')