IdentifyAssociatedPCs: Identifyprincipal components (PCs) that are significantly...

IdentifyAssociatedPCsR Documentation

Identifyprincipal components (PCs) that are significantly associated with eQTLs and genes

Description

This function identifies PCs that are significantly associated the eQTLs or genes, and merge the associated PCs with the data on the eQTL and genes. PCs may be derived from Principal Component Analysis (PCA) of the entire gene expression matrix, and may be viewed as potential confounders in the sequent causal network analysis on the eQTLs and genes. See details in Badsha and Fu (2019) and Badsha et al. (2021).

Usage

IdentifyAssociatedPCs(PCs.matrix,no.PCs,data,fdr.level,corr.threshold,corr.value)

Arguments

PCs.matrix

A matrix of PCs.

no.PCs

Number of top PCs to test for association. The default is 10.

data

Data of the eQTLs and genes, containing the genotypes of the eQTLs and the expression of the genes.

fdr.level

(optional) The false discover rate (FDR) for association tests. Must be in (0,1]. The default is "0.05".

corr.threshold

(optional). The default is "FALSE". If "TRUE" then a constraint on the correlation between a PC and an eQTL or a gene is applied in addition to the FDR control.

corr.value

The threshold for the Pearson correlation between a PC and an eQTL or a gene when corr.threshold is "TRUE". The default is 0.3.

Value

A list of object that containing the following:

  • AssociatedPCs: All the PCs that are significantly associated with the eQTLs and genes.

  • data.withPC: The data matrix that contains eQTLs, gene expression, and associated PCs.

  • corr.PCs: The matrix of correlations between PCs and eQTLs/genes.

  • PCs.asso.list: List of all associated PCs for each of the eQTLs and genes.

  • qobj: The output from applying the qvalue function.

Author(s)

Md Bahadur Badsha (mbbadshar@gmail.com)

References

1. Badsha MB and Fu AQ (2019). Learning causal biological networks with the principle of Mendelian randomization. Frontiers in Genetics, 10:460.

2. Badsha MB, Martin EA and Fu AQ (2021). MRPC: An R package for inference of causal graphs. Frontiers in Genetics, 10:651812.

See Also

data_GEUVADIS_combined

Examples


## Not run: 

# Load genomewide gene expression data in GEUVADIS 
# 373 individuals
# 23722 genes
data_githubURL <- "https://github.com/audreyqyfu/mrpc_data/raw/master/data_GEUVADIS_allgenes.RData"
load(url(data_githubURL))
PCs <- prcomp(data_GEUVADIS_allgenes,scale=TRUE)
# Extract the PCs matrix 
PCs.matrix <- PCs$x

# The eQTL-gene set contains eQTL rs7124238 and genes SBF1-AS1 and SWAP70

data_GEU_Q50 <- data_GEUVADIS$Data_Q50$Data_EUR
colnames(data_GEU_Q50) <- c("rs7124238","SBF2-AS1","SWAP70")
data <- data_GEU_Q50

# Identify associated PCs for this eQTL-gene set

Output <- IdentifyAssociatedPCs(PCs.matrix,no.PCs=10,data,fdr.level=0.05,corr.threshold=TRUE
,corr.value = 0.3)

# Gene SBF2-AS1 is significantly associated with PC2 
# Data with PC2 as a potential confounder

data_withPC <- Output$data.withPC

n <- nrow(data_withPC)         # Number of rows
V <- colnames(data_withPC)     # Column names

# Calculate Pearson correlation for MRPC analysis

suffStat <- list(C = cor(data_withPC,use = "complete.obs"),
                 n = n)

# Infer the graph by MRPC 

MRPC.fit_FDR<- MRPC(data_withPC,
                    suffStat,
                    GV = 1,
                    FDR = 0.05,
                    indepTest = 'gaussCItest',
                    labels = V,
                    FDRcontrol = 'LOND',
                    verbose = TRUE)
                    
plot(MRPC.fit_FDR, main="MRPC with PCs (potential confounders)")


## End(Not run)

MRPC documentation built on April 11, 2022, 5:10 p.m.