netphenogeno: Reconstructs conditional dependence network among genetic...

netphenogeno    R Documentation

Reconstructs conditional dependence network among genetic loci and phenotypes

Description

This is one of the main functions of the netgwas package. It reconstructs a conditional independence network between genotypes and phenotypes for diploids and polyploids. Three methods are available within the Gaussian copula graphical model: (i) Gibbs sampling, (ii) an approximation method, and (iii) a nonparanormal approach. The first two methods can handle missing genotypes. The last one is computationally faster.
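To give a feel for what working on a Gaussian scale means for mixed data, the sketch below maps each column to rank-based normal scores. This is only an illustration of the general idea behind copula-style transformations; the helper `normal_scores` and the toy data are hypothetical, and the transformation netgwas applies internally may differ.

```r
# Rank-based normal scores: one simple way to map a variable onto a
# Gaussian scale. This is an illustration, not netgwas's internal code.
normal_scores <- function(x) {
  n_obs <- sum(!is.na(x))
  # Ranks are rescaled into (0, 1) so that qnorm() stays finite;
  # missing values are kept as NA.
  u <- rank(x, na.last = "keep", ties.method = "average") / (n_obs + 1)
  qnorm(u)
}

# Hypothetical toy data: an ordinal genotype (0/1/2) and a skewed trait.
set.seed(1)
toy <- data.frame(
  snp   = sample(0:2, 20, replace = TRUE),
  trait = rexp(20)
)
z <- as.data.frame(lapply(toy, normal_scores))
```

The transformed columns in `z` keep the ordering of the original values, so rank-based dependence between variables is preserved.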

Usage

netphenogeno(data, method = "gibbs", rho = NULL, n.rho = NULL, rho.ratio = NULL,
             ncores = 1, em.iter = 5, em.tol = .001, verbose = TRUE)

Arguments

data

An n by p matrix or data.frame containing the data (n is the sample size and p is the number of variables). Each of the p columns holds either marker or trait information. Input data can contain missing values.

method

The method used to reconstruct the genotype-phenotype interactions network or the genotype-phenotype-environment interactions network: "gibbs", "approx", or "npn". For a medium (~500) and a large number of variables we recommend "gibbs" and "approx", respectively. For a very large number of variables (> 2000), "npn" is computationally efficient. The default method is "gibbs".

rho

A decreasing sequence of non-negative numbers that control the sparsity level. If rho = NULL, the program automatically computes a sequence of rho values based on n.rho and rho.ratio. Users can also supply their own decreasing sequence of values to override this.

n.rho

The number of regularization parameters. The default value is 10.

rho.ratio

Determines the distance between the elements of the rho sequence. A small value of rho.ratio results in a large distance between the elements of the rho sequence, and a large value of rho.ratio results in a small distance between them. The default value is 0.3.

ncores

The number of cores to use for the calculations. Using ncores = "all" automatically detects the number of available cores and runs the computations in parallel on (available cores - 1).

em.iter

The number of EM iterations. The default value is 5.

em.tol

A criterion to stop the EM iterations. The default value is 0.001.

verbose

If TRUE, detailed messages tracing the progress of the computation are printed. The default value is TRUE.
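The relationship between rho, n.rho, and rho.ratio described above can be illustrated with a small sketch. The helper `make_rho_seq` and the starting value `rho.max` are hypothetical, and the log-spaced grid is an assumption chosen for illustration; the rule netgwas uses internally is not stated on this page.

```r
# Hypothetical helper (not part of netgwas): build a decreasing penalty
# sequence of length n.rho from rho.max down to rho.ratio * rho.max,
# equally spaced on the log scale.
make_rho_seq <- function(rho.max, n.rho = 10, rho.ratio = 0.3) {
  stopifnot(rho.max > 0, n.rho >= 2, rho.ratio > 0, rho.ratio < 1)
  exp(seq(log(rho.max), log(rho.ratio * rho.max), length.out = n.rho))
}

# A small rho.ratio spreads the values far apart; a large one packs them
# closely together.
rho_wide   <- make_rho_seq(rho.max = 1, rho.ratio = 0.1)
rho_narrow <- make_rho_seq(rho.max = 1, rho.ratio = 0.9)
```

A sequence built this way is decreasing, so it could be passed as the rho argument in place of the automatic choice.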

Details

This function reconstructs either a genotype-phenotype interactions network or a genotype-phenotype-environment interactions network. In a genotype-phenotype network the nodes are markers or phenotypes; a phenotype is connected by an edge to a marker if there is a direct association between them given the rest of the variables. Different phenotypes may also be connected to each other. In addition to marker and phenotype information, the input data can include environmental variables. In that case, the interactions network shows the conditional dependence relationships between markers, phenotypes, and environmental factors.

Value

An object with S3 class "netgwas" is returned:

Theta

A list of estimated p by p precision matrices that show the conditional independence patterns among the measured variables.

path

A list of estimated p by p adjacency matrices. This is the graph path corresponding to Theta.

ES

A list of estimated p by p conditional expectation matrices corresponding to rho.

Z

A list of n by p matrices of transformed data based on the Gaussian copula.

rho

An n.rho-dimensional vector containing the penalty terms.

loglik

An n.rho-dimensional vector containing the maximized log-likelihood values along the graph path.

data

The n by p input data matrix, or the n by p transformed data when method = "npn".
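As a reading aid for Theta and path, the sketch below shows the standard relationship between a precision matrix, its partial correlations, and the corresponding adjacency matrix. The 3 by 3 matrix is made up for illustration and is not netgwas output; the variable names are hypothetical.

```r
# Toy 3 x 3 precision matrix (hypothetical values, not netgwas output).
Theta <- matrix(c( 2.0, -0.8, 0.0,
                  -0.8,  2.0, 0.5,
                   0.0,  0.5, 1.5), nrow = 3, byrow = TRUE)

# Partial correlation between variables i and j given all the others:
# -Theta[i, j] / sqrt(Theta[i, i] * Theta[j, j]).
d <- sqrt(diag(Theta))
par_cor <- -Theta / outer(d, d)
diag(par_cor) <- 1

# Adjacency matrix: an edge wherever an off-diagonal precision entry
# is nonzero, i.e. the two variables are conditionally dependent.
adj <- (Theta != 0) * 1
diag(adj) <- 0
```

In this toy case variables 1 and 3 have a zero precision entry, so they are conditionally independent given variable 2 and no edge joins them.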

Note

This function estimates a graph path. To select an optimal graph, please refer to selectnet.

Author(s)

Pariya Behrouzi and Ernst C. Wit
Maintainers: Pariya Behrouzi pariya.behrouzi@gmail.com

References

1. Behrouzi, P., and Wit, E. C. (2019). Detecting epistatic selection with partially observed genotype data by using copula graphical models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(1), 141-160.
2. Behrouzi, P., and Wit, E. C. (2017c). netgwas: An R Package for Network-Based Genome-Wide Association Studies. arXiv preprint, arXiv:1710.01236.
3. D. Witten and J. Friedman. New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics, to appear, 2011.
4. Guo, Jian, et al. "Graphical models for ordinal data." Journal of Computational and Graphical Statistics 24.1 (2015): 183-204.

See Also

selectnet

Examples

    
    
    library(netgwas)
    data(thaliana)
    head(thaliana, n = 3)

    # Construct a path for the genotype-phenotype interactions network in the thaliana data
    res <- netphenogeno(data = thaliana)
    res
    plot(res)

    # Select an optimal network
    sel <- selectnet(res)

    # Plot the selected network and the conditional independence (CI) relationships
    plot(sel, vis = "CI")
    plot(sel, vis = "CI", n.mem = c(8, 56, 31, 33, 31, 30), w.btw = 50, w.within = 1)

    # Visualize an interactive plot of the selected network.
    # Color "red" for the 8 phenotypes, and a different color for each chromosome.
    cl <- c(rep("red", 8), rep("white", 56), rep("tan1", 31),
            rep("gray", 33), rep("lightblue2", 31), rep("salmon2", 30))

    # The IDs of the phenotypes and SNPs to be labelled in the network
    id <- c("DTF_LD", "CLN_LD", "RLN_LD", "TLN_LD", "DTF_SD", "CLN_SD", "RLN_SD",
            "TLN_SD", "snp15", "snp16", "snp17", "snp49", "snp50", "snp60", "snp75",
            "snp76", "snp81", "snp83", "snp84", "snp86", "snp82", "snp113", "snp150",
            "snp155", "snp159", "snp156", "snp161", "snp158", "snp160", "snp162", "snp181")

    plot(sel, vis = "interactive", n.mem = c(8, 56, 31, 33, 31, 30), vertex.color = cl,
         label.vertex = "some", sel.nod.label = id, edge.color = "gray", w.btw = 50,
         w.within = 1)

    # Partial correlations between genotypes and phenotypes in the thaliana dataset
    library(Matrix)
    image(sel$par.cor, xlab = "geno-pheno", ylab = "geno-pheno", sub = "")
	

netgwas documentation built on Aug. 7, 2023, 5:10 p.m.