View source: R/ModelMultiData.R
ModelMultiData
Loops through one or more datasets and returns the
coefficients of a regression model fitted to each comparison.
x | A matrix or list of matrices containing values for the model.
y | A matrix containing values to be used on the left-hand side of the model (e.g. the dependent variable). At least one of y or groups must be provided.
groups | A data.frame containing sample group data. At least one of y or groups must be provided. The first column of groups must contain sample names that correspond to the column names of x and y.
by | Character vector specifying column names in groups used to split the data if a stratified analysis is desired. Ignored if groups is not provided.
formula | Formula to be used in the model. If provided by the user, formula should include all values in yName and xName. Otherwise, the formula is built from the provided data (see details for more information).
FUN | Function to be used for regression analysis. Defaults to lm().
pAdjust | Adjust p-values using one of the methods listed in p.adjust.methods. Skipped if pAdjust = NULL.
xName | Names of the matrices provided in x. If x is a list with named elements, defaults to the list names. Otherwise, defaults to x1, x2, ..., xn, where n is the number of matrices provided in x.
yName | Name of the matrix, or name of the column in groups, to be used on the left-hand side of the model (see the formula details for default values).
returnVars | Character vector of coefficient variable names to return. By default, returns all coefficients for variables in xName. To return all coefficients (including the intercept), use "*".
comparisons | Matrix containing rownames or row numbers from each dataset for all desired comparisons. If not provided, the comparisons matrix will be built (see details for defaults).
excludeVars | Character vector of column names in groups to exclude from the formula. Ignored if formula is provided or if groups is not provided.
includeVars | Character vector of column names in groups to include in the formula. Ignored if formula is provided or if groups is not provided.
... | Additional parameters to be passed to FUN.
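The way FUN and ... work together can be illustrated with a small base R sketch. The wrapper fit_with() and the toy data below are hypothetical and are not part of the package; they only show how extra arguments such as family can be forwarded to a user-chosen fitting function.

```r
# Hypothetical wrapper (not ModelMultiData itself): forwards extra
# arguments in `...` to whichever fitting function is passed as FUN.
fit_with <- function(formula, data, FUN = lm, ...) {
  FUN(formula, data = data, ...)
}

set.seed(7)
# Made-up data: a binary outcome and one numeric predictor.
d <- data.frame(status = rbinom(40, 1, 0.5), frequency = runif(40))

# Default: ordinary linear model.
fit_lm <- fit_with(status ~ frequency, d)

# Logistic regression: `family` travels through `...` to glm().
fit_glm <- fit_with(status ~ frequency, d, FUN = glm, family = binomial())

class(fit_lm)
class(fit_glm)
```

This mirrors the logistic regression example on this page, where FUN = glm and family = binomial() are supplied in the same call.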
This function applies a regression model across all rows of one or more matrices with inputs from multiple datasets.
For all matrices in x and y, the function will loop by row. Rownames represent individual events or observations and should match names provided in the comparisons matrix, if user-defined. Columns represent individual samples. Columns of x and y will be dropped if their column names are not shared across all matrices in x and y and in the first column of groups, if defined.
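The column-matching rule can be sketched in base R as follows. This is only an illustration under assumed toy data (the matrices, sample names, and the variable shared are made up), not the package's internal code.

```r
# Toy data: rows are events, columns are samples (all names are made up).
x <- matrix(1:12, nrow = 3,
            dimnames = list(paste0("site", 1:3), c("s1", "s2", "s3", "s4")))
y <- matrix(101:109, nrow = 3,
            dimnames = list(paste0("gene", 1:3), c("s2", "s3", "s5")))
groups <- data.frame(sample = c("s1", "s2", "s3"), age = c(30, 41, 52))

# Samples present in every matrix and in the first column of groups.
shared <- Reduce(intersect, list(colnames(x), colnames(y), groups[[1]]))

# Keep only the shared samples, in the same order everywhere.
x_sub <- x[, shared, drop = FALSE]
y_sub <- y[, shared, drop = FALSE]
groups_sub <- groups[match(shared, groups[[1]]), , drop = FALSE]

shared
```

Here only s2 and s3 appear in all three inputs, so the other columns are dropped before any model is fitted.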
Users may provide multiple datasets to be used in the same model by providing a list of matrices to x. For example, a user may wish to test for associations between an outcome of interest and all measurements in x[[1]] while correcting for measurements in x[[2]].
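As a rough sketch of what "the same model across two datasets" means, the base R code below fits one linear model per row, taking the predictor of interest from one matrix and the adjustment variable from the matching row of another. The matrices frequency and coverage, the vector outcome, and the helper fit_row() are all invented for illustration.

```r
set.seed(1)
samples <- paste0("s", 1:20)
sites <- paste0("site", 1:3)

# Two hypothetical matrices with identical row and column names.
frequency <- matrix(runif(60), nrow = 3, dimnames = list(sites, samples))
coverage <- matrix(rpois(60, 30), nrow = 3, dimnames = list(sites, samples))
outcome <- rnorm(20)

# Fit outcome ~ frequency + coverage for one row and return the
# coefficient row for the variable of interest.
fit_row <- function(i) {
  d <- data.frame(outcome = outcome,
                  frequency = frequency[i, ],
                  coverage = coverage[i, ])
  fit <- lm(outcome ~ frequency + coverage, data = d)
  coef(summary(fit))["frequency", ]
}

# One row of results per site.
res <- t(sapply(sites, fit_row))
res
```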
If formula is user-defined, it should include all values in yName and xName (e.g. the names of the provided matrices). Otherwise, the formula is built from the provided data. The left-hand (dependent) variable is chosen as follows: "y" if y is provided; else, if yName is defined, the column of groups that matches yName; else the second column of groups after applying includeVars and excludeVars. The right-hand side of the formula includes all names in xName and all remaining columns of groups after applying includeVars and excludeVars.
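One way to picture the default formula for the case where y and yName are both absent is the base R sketch below. The variable names (x_names, exclude_vars, covars) and the toy groups table are assumptions made for illustration; the package may build its formula differently.

```r
# Made-up sample table; the first column holds sample names.
groups <- data.frame(sample = c("s1", "s2", "s3"),
                     status = c("case", "control", "case"),
                     age = c(30, 41, 52),
                     batch = c("a", "b", "a"))
x_names <- "x1"           # assumed default name for a single x matrix
exclude_vars <- "batch"   # plays the role of excludeVars

# Candidate variables: all columns except sample names and exclusions.
covars <- setdiff(colnames(groups)[-1], exclude_vars)

# No y and no yName: the first remaining column becomes the response.
response <- covars[1]
rhs <- c(x_names, setdiff(covars, response))

f <- reformulate(rhs, response = response)
f
```

With these inputs the sketch yields status ~ x1 + age: status is the second column of groups, and batch has been excluded.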
The comparisons parameter allows users to pre-define which combinations of rows in x and y should be tested. Column names should match values in yName and xName. Each row contains the rownames or row numbers, one from each dataset in x and y, that should be tested together. If not provided, the comparisons matrix is constructed automatically. By default, if all matrices in x have the same rownames, rows with the same name are grouped. Otherwise, comparisons includes all possible combinations of rownames in x and y.
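The two default behaviours can be sketched with base R. The row names below are invented, and the exact layout the package produces is an assumption; the point is only the difference between matching rows by name and taking every combination.

```r
# Hypothetical rownames of two datasets with different row identities.
x1_rows <- c("siteA", "siteB")
y_rows <- c("gene1", "gene2", "gene3")

# Different rownames: every combination of a y row with an x1 row.
all_pairs <- as.matrix(expand.grid(y = y_rows, x1 = x1_rows,
                                   stringsAsFactors = FALSE))

# Same rownames in every x matrix: pair rows that share a name.
x2_rows <- c("siteA", "siteB")
matched <- cbind(x1 = x1_rows, x2 = x2_rows)

nrow(all_pairs)  # 3 y rows times 2 x1 rows
matched
```

A user-defined comparisons matrix has the same shape: one column per dataset, one row per test.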
Returns a data.table with model coefficients for all variables in returnVars, for all unique tests in comparisons. Columns include the comparison, variable name, Estimate, Std. Error, t- or z-value, pValue, and adjusted pValue.
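A table of this general shape can be assembled in base R from coef(summary(fit)) and p.adjust(). The sketch below uses simulated data and a plain data.frame; the column names StdError, tValue, and fdr are illustrative assumptions and may not match the package's output exactly.

```r
set.seed(42)
n <- 30
# Simulated predictor matrix (rows are events) and a simulated outcome.
x1 <- matrix(rnorm(4 * n), nrow = 4,
             dimnames = list(paste0("site", 1:4), NULL))
y <- rnorm(n)

# One model per row; keep the coefficient row for x1.
rows <- lapply(rownames(x1), function(r) {
  fit <- lm(y ~ x1, data = data.frame(y = y, x1 = x1[r, ]))
  est <- coef(summary(fit))["x1", ]
  data.frame(comparison = r, variable = "x1",
             Estimate = est[["Estimate"]],
             StdError = est[["Std. Error"]],
             tValue = est[["t value"]],
             pValue = est[["Pr(>|t|)"]])
})
out <- do.call(rbind, rows)

# Adjust p-values across all tests with a method from p.adjust.methods.
out$fdr <- p.adjust(out$pValue, method = "fdr")
out
```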
## Not run:
#### Working with Multiple Datasets ####

# Linear model with x and groups
out <- ModelMultiData(x = RNAEdDataQCed$Frequency, groups = RNAEdSampleInfo,
                      excludeVars = "status")
out <- out[order(out$fdr), ]
head(out)

# Logistic regression with x and groups
out <- ModelMultiData(formula = status ~ frequency, x = RNAEdDataQCed$Frequency,
                      groups = RNAEdSampleInfo, xName = "frequency",
                      FUN = glm, family = binomial())
out <- out[order(out$fdr), ]
head(out)

# Linear model with x and y
out <- ModelMultiData(x = RNAEdDataQCed$Frequency,
                      y = RNAEdDataQCed$`Coverage-q25`)
out <- out[order(out$fdr), ]
head(out)

# Model with list of x matrices and groups
out <- ModelMultiData(formula = status ~ frequency + coverage + age,
                      x = list(frequency = RNAEdDataQCed$Frequency,
                               coverage = RNAEdDataQCed$`Coverage-q25`),
                      groups = RNAEdSampleInfo, FUN = glm, family = binomial(),
                      returnVars = "*")
out <- out[order(out$fdr), ]
head(out)

# Stratified analysis using by (returns one result per stratum)
out <- ModelMultiData(formula = age ~ frequency + coverage + batch,
                      x = list(frequency = RNAEdDataQCed$Frequency,
                               coverage = RNAEdDataQCed$`Coverage-q25`),
                      groups = RNAEdSampleInfo, by = "status")
out <- lapply(out, function(x) x[order(x$fdr), ])
lapply(out, head)

# Linear model with x and y using comparisons parameter
head(RNAEdCombinations)
out <- ModelMultiData(formula = y ~ x1 + age,
                      x = RNAEdGenotypes, y = RNAEdDataQCed$Frequency,
                      groups = RNAEdSampleInfo, comparisons = RNAEdCombinations)
out <- out[order(out$fdr), ]
head(out)

## End(Not run)