iProFun.reg.1y: Linear regression on one outcome data type

View source: R/iProFun.reg.R

iProFun.reg.1y R Documentation

Linear regression on one outcome data type

Description

Fits a linear regression of one outcome data type on all data types of DNA-level alterations.

Usage

iProFun.reg.1y(
  yList.1y,
  xList,
  covariates.1y,
  permutation = F,
  var.ID = c("Gene_ID"),
  Y.rescale = F,
  var.ID.additional = NULL,
  seed = NULL
)

Arguments

yList.1y

yList is a list of data matrices for outcomes; yList.1y is one element of that list, holding the outcome for one data type.

xList

xList is a list of data matrices for predictors.

covariates.1y

covariates is a list of data matrices for covariates; covariates.1y is one element of that list, holding the covariates for one data type. It should be NULL or have the same number of subjects as yList.1y.

permutation

Whether to permute the outcome labels. permutation = F (default): no permutation; use this for analysis of the original data. permutation = T: permute the outcome labels, which is useful for generating eFDR-controlled discoveries.

var.ID

var.ID gives the variable name (e.g. gene/protein name) to match different data types.

Y.rescale

Y.rescale (default = F) indicates whether each outcome variable should be standardized to mean 0 and sd 1 before regression.
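
As a rough illustration of what standardizing each outcome variable means, the sketch below rescales every row of a small numeric matrix to mean 0 and sd 1 using base R. The layout (rows are outcome variables, columns are subjects) and the object name y are assumptions made for illustration; this is not the package's internal code.

```r
# Hypothetical outcome matrix: 3 variables (rows) x 5 subjects (columns).
y <- matrix(c(1, 2, 3, 4, 5,
              10, 20, 30, 40, 50,
              2, 2, 3, 5, 8),
            nrow = 3, byrow = TRUE)

# scale() standardizes columns, so transpose, scale, and transpose back
# to standardize each row (each outcome variable) instead.
y_scaled <- t(scale(t(y)))

# Each row now has mean 0 and sample standard deviation 1.
row_means <- rowMeans(y_scaled)
row_sds <- apply(y_scaled, 1, sd)
```

Standardizing this way puts outcome variables measured on different scales on a common footing, so their regression coefficients become comparable in size.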

var.ID.additional

var.ID.additional allows additional variable names from the input to be included in the output. This is often helpful when multiple rows (e.g. probes) are considered per gene, so that each row can be indexed clearly.

seed

seed allows users to set the random seed externally so that results can be reproduced. Useful when permutation = T.
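
To make the interplay between permutation and seed concrete, here is a minimal base R sketch of a seeded label permutation. The helper permute_subjects and the matrix layout (subjects as columns) are hypothetical; how the package permutes internally is not shown on this page, so treat this as an assumption about the general idea rather than its implementation.

```r
# Hypothetical outcome matrix: 2 variables (rows) x 6 subjects (columns).
y <- matrix(1:12, nrow = 2,
            dimnames = list(c("geneA", "geneB"), paste0("S", 1:6)))

# Hypothetical helper: shuffle which subject each outcome column belongs to.
permute_subjects <- function(y, seed = NULL) {
  # Setting the seed first makes the shuffle reproducible.
  if (!is.null(seed)) set.seed(seed)
  perm <- sample(ncol(y))
  y_perm <- y[, perm, drop = FALSE]
  # Keep the original subject labels so the shuffled values are paired
  # with other subjects' predictors, breaking any true association.
  colnames(y_perm) <- colnames(y)
  y_perm
}

perm1 <- permute_subjects(y, seed = 123)
perm2 <- permute_subjects(y, seed = 123)
same_result <- identical(perm1, perm2)  # same seed, same permutation
```

Because the permuted data carry no real association by construction, regressions on them give a null reference against which the original results can be compared.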

Value

The returned value contains:

xName:

Predictor variable name corresponding to each predictor-outcome pair

yName:

Outcome variable name corresponding to each predictor-outcome pair

betas:

Coefficient estimate for predictors

betas_se:

Coefficient SE for predictors

sigma2:

Regression error terms for predictors

dfs:

Regression degrees of freedom for predictors

v_g:

(X^T X)^-1 projection on predictors
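
The quantities above are related through standard ordinary least squares. The sketch below fits a single regression with base R's lm on simulated data and extracts analogous values. The simulated variables x, z and y are hypothetical, and whether the package computes sigma2 and v_g in exactly this way is an assumption based on the descriptions above.

```r
set.seed(1)
n <- 50
x <- rnorm(n)                      # hypothetical predictor (one alteration)
z <- rnorm(n)                      # hypothetical covariate
y <- 0.5 * x + 0.2 * z + rnorm(n)  # hypothetical outcome

fit <- lm(y ~ x + z)
s <- summary(fit)

beta    <- unname(coef(fit)["x"])             # coefficient estimate
beta_se <- s$coefficients["x", "Std. Error"]  # coefficient SE
sigma2  <- s$sigma^2                          # residual variance
df      <- fit$df.residual                    # residual degrees of freedom

# Diagonal entry of (X^T X)^{-1} for the predictor; the design matrix
# columns are (Intercept), x, z, so x is the second column.
X <- model.matrix(fit)
v_g <- solve(crossprod(X))[2, 2]

# Standard OLS identity: SE^2 = sigma^2 * v_g.
identity_holds <- isTRUE(all.equal(beta_se^2, sigma2 * v_g))
```

Keeping sigma2, dfs and v_g alongside the coefficient lets downstream steps rebuild the standard error and test statistic without refitting the model.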

Examples

# Load data
data(lscc_iProFun_Data)
# For analysis with overlapping genes, use:
yList = list(rna, protein, phospho); xList = list(mut, cnv)
covariates = list(cov, cov, cov)
pi1 = 0.05
# Regression on one outcome data type
ft1=iProFun.reg.1y(yList.1y=yList[[1]], xList=xList, covariates.1y=covariates[[1]],
                   var.ID=c("geneSymbol"))
# Regression on all three outcome data types
reg.all=iProFun.reg(yList=yList, xList=xList, covariates=covariates,
                    var.ID=c("geneSymbol"), var.ID.additional=c("id"))
# Calculate FWER for data type(s) that have a small number of genes
FWER.all=iProFun.FWER(reg.all=reg.all, FWER.Index=c(1))
# Calculate Empirical FDR for one outcome
eFDR1=iProFun.eFDR.1y(reg.all=reg.all, which.y=2, yList=yList, xList=xList,
                      covariates=covariates, pi1=pi1, NoProbXIndex=c(1),
                      permutate_number=2, var.ID=c("geneSymbol"),
                      var.ID.additional=c("id"))
# Calculate Empirical FDR for all outcomes
eFDR.all=iProFun.eFDR(reg.all=reg.all, yList=yList, xList=xList,
                      covariates=covariates, pi1=pi1, NoProbXIndex=c(1),
                      permutate_number=2, var.ID=c("geneSymbol"),
                      var.ID.additional=c("id"), seed=123)
# iProFun identification
# For data types with abundant genes, it's based on (1) association probabilities > 0.75,
# (2) FDR 0.1, and (3) the association direction filtering.
# For data types with few genes, it's based on (1) FWER 0.1, and
# (2) the association direction filtering.

res=iProFun.detection(reg.all=reg.all, eFDR.all=eFDR.all, FWER.all=FWER.all, filter=c(0, 1),
                      NoProbButFWERIndex=1, fdr.cutoff=0.1, fwer.cutoff=0.1, PostPob.cutoff=0.75,
                      xType=c("mutation", "cnv"), yType=c("rna", "protein", "phospho"))

songxiaoyu/iProFun documentation built on Dec. 8, 2022, 3:54 p.m.