mWaveQTL: Main function to perform Fast functional association analysis...

Description Usage Arguments Details Value References Examples

View source: R/mWaveQTL.R

Description

Perform a screening for a trait on a given phenotype and a specified level of resolution

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
mWaveQTL(
  Y,
  genotype,
  pos,
  confounder,
  lev_res,
  sigma_b,
  para = FALSE,
  nperm = 10000,
  npermstop = 100
)

Arguments

Y

phenotype vector, has to be numeric. For case control code it as 0 and 1.

genotype

genotype matrix (either data.frame or numeric matrix). Lines=SNPs in increasing order in term of base pair, columns=individuals. No missing values allowed.

pos

vector of spatial position of the variables. It has to be in the same order as long as the number of column of the genotype matrix line order/length.

confounder

the confounding matrix with the same sample order as Y. The intercept should not be included, if missing will generate a intercept matrix.

lev_res

the maximum level of resolution needed

sigma_b

the parameter of the NIG prior used for the Bayes Factor computation. We advised to set this value between 0.1 and 0.2

para

logical parameter for parallelisation, if not specified set at FALSE.

nperm

the number of permutation peformed to asses the significance of a regions. Default value 10,000

npermstop

Number of permutations allowed to be larger than the observed test statistics for the regions, before stopping permuation procedure. Default value =100

Details

The mWaveQTL function computes the Likelihood ratio used for testing significance of a genetic region. In addition it computes the porportion of wavelets coefficients associated by level of resolution, and the Bayes factor used for this estimation.

Value

A named vector. First position the estimated value of the Lambda statistics, then the proportion of association per level of resolution then the computed Bayes Factor per wavelet coefficient.

References

Shim and Stephens,Wavelet-based genetic association analysis of functional phenotypes arising from high-thoughput sequencing asssays,The Annals of Applied Statistics, 2015, Vol. 9, No. 2, 665–686

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
## Not run: 


set.seed(66)
#########################################
#Generate a randomly sample genotype size=1Mb
#########################################

#5000 Randomly choosen pos
my_pos <- sort(sample(1:1000000, size=5000,replace = FALSE))
#############################
#Three different bump genotype
#############################
my_functions <-data.frame(f0 = c(rep(0,400000),rep(0,200000),rep(0,400000)),
                         f1 = c(rep(0,400000),rep(1,200000),rep(0,400000)) ,
                         f2=c(rep(0,400000),rep(2,200000),rep(0,400000)))


library(gridExtra)
###########################
#Minor allele frequency 30%
###########################
MAF=0.3
sampl_schem <- c((1-MAF)^2,2*MAF*(1-MAF),MAF^2)
#######################################
#sampling at Hardy Weinberg equilibrium
#######################################
#Assigning class

#sample size =4000
n_size=4000
type_fn <-sample(0:2,replace = TRUE,size=n_size,  prob=  sampl_schem  )


genotype <-  matrix(my_functions[my_pos,2 ], ncol=1 ) %*%t(matrix(type_fn,ncol=1))
#dim(genotype)= nSNP, nind

###############################################################
#Generate a phenotype with variance explained by genotype  0.5%
###############################################################
varexp=0.005
var_noise <- (1-varexp)*var(sample(0:2,replace = TRUE,size=10000,
                                  prob=sampl_schem ))/varexp
Y <-  rnorm(n=n_size,sd=sqrt(var_noise)) +type_fn
df <- data.frame(y=Y,genotype =factor(type_fn))
P1 <- ggplot(df,aes(y=y,x=genotype))+
 geom_boxplot()+
 xlab("Type of genotype")+
 theme(axis.text=element_text(size=12),
       axis.title=element_text(size=14,face="bold"))+
 ylab("Simulated Phenotype")+
 theme_bw()+
 ggtitle("Variation of the phenotype\ndepending of thegenotype, \nVariance explained =0.5%")

df <- data.frame(pos= rep(my_pos,3),y=c(my_functions[my_pos,1],my_functions[my_pos,2],my_functions[my_pos,3]),
                mycol = factor(c(rep("f0",length(my_pos)),rep("f1",length(my_pos)),rep("f2",length(my_pos))) ) )

P2 <- ggplot(df,aes(y=y,x=pos,color=mycol))+
 geom_point(size=1)+
 xlab("Base pair")+
 ylab("Number of variants")+
 theme_bw()+
 theme(legend.title=element_blank())+
 ggtitle("Three different kind of genotype")

grid.arrange(P1,P2,ncol=2)

##################
#Screening
##################
res <- mWaveQTL( Y,
                genotype=genotype,
                pos=my_pos,
                         lev_res=6,
                         sigma_b = 0.2)
##############
#Visualisation
##############
pos <- c(min(my_pos),max(my_pos))
plot_mWaveQTL(res=res,pos=pos,lev_res=6)



## End(Not run)

william-denault/mWaveQTL documentation built on Sept. 13, 2020, 2:50 p.m.