donuts | R Documentation |

This function takes 2 or 3 sets of GWAS summary statistics as input and outputs summary statistics for the direct and indirect genetic effects.

donuts( ss.own, ss2, ss3 = NULL, l12 = 0, l13 = 0, l23 = 0, n1 = NULL, n2 = NULL, n3 = NULL, alpha = 0, mode = 2, OutDir = getwd() )

`ss.own` |
a data.frame; GWAS-O summary statistics |

`ss2` |
a data.frame; 2nd input GWAS summary statistics.
Depending on the |

`ss3` |
a data.frame; default is NULL. 3rd input GWAS summary statistics; This is GWAS-P when |

`l12` |
numeric; |

`l13` |
numeric; |

`l23` |
numeric; |

`n1` |
integer; Sample size for ss.own; default is NULL. Only needs to be specified when sample size is not included in ss.own summary statistics. If specified, this number will be used in the analysis. |

`n2` |
integer; Sample size for ss2; default is NULL. Only needs to be specified when sample size is not included in ss2 summary statistics. If specified, this number will be used in the analysis. |

`n3` |
integer; Sample size for ss3; default is NULL. Only needs to be specified when sample size is not included in ss3 summary statistics. If specified, this number will be used in the analysis. |

`alpha` |
numeric or a data.frame; correlation between spousal genotypes (i.e., Corr(Gm, Gp)) at each locus; default is 0.
This value measures the degree of assortative mating. |

`mode` |
integer 1, 2, or 3; default is 2; specify analysis scenario – see Details. |

`OutDir` |
Output directory to write the direct and indirect effect summary statistics files. Default is the current directory. If it is NULL, the output files won't be written (but the results will still be returned as a data.frame). |

This function first checks for duplicated SNPs using variant IDs; SNPs with duplicated IDs are removed.
It then takes the intersection of the SNPs across all the inputs (and also with the SNPs in `alpha` if it is a data.frame), and only the overlapping SNPs are kept in the output.
The `A1` and `A2` of the first input summary statistics are used as the reference. That is, the `BETA` of each other input is multiplied by -1 if its A1 and A2 are flipped, or re-coded as `NA` if the alleles cannot be matched.
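The pre-processing described above can be sketched in base R. This is only an illustration, not the package's internal code: the toy data, the helper `drop_duplicated()`, and the choice to drop every copy of a duplicated ID (rather than keep the first) are assumptions, and the sketch does not attempt any strand-complement matching.

```r
# Toy first input (reference alleles) and a second input; values are made up.
ref <- data.frame(SNP = c("rs1", "rs2", "rs3", "rs3", "rs4"),
                  A1  = c("A", "C", "G", "G", "T"),
                  A2  = c("G", "T", "A", "A", "C"),
                  stringsAsFactors = FALSE)
oth <- data.frame(SNP  = c("rs1", "rs2", "rs4", "rs5"),
                  A1   = c("A", "T", "A", "C"),
                  A2   = c("G", "C", "T", "T"),
                  BETA = c(0.10, 0.20, 0.30, 0.40),
                  stringsAsFactors = FALSE)

# Hypothetical helper: drop every SNP whose ID occurs more than once.
drop_duplicated <- function(df) {
  dup_ids <- unique(df$SNP[duplicated(df$SNP)])
  df[!df$SNP %in% dup_ids, , drop = FALSE]
}

ref <- drop_duplicated(ref)
oth <- drop_duplicated(oth)

# Keep only SNPs present in both inputs, ordered as in the first input.
shared <- intersect(ref$SNP, oth$SNP)
ref <- ref[match(shared, ref$SNP), ]
oth <- oth[match(shared, oth$SNP), ]

# Compare alleles against the first input.
same    <- oth$A1 == ref$A1 & oth$A2 == ref$A2
flipped <- oth$A1 == ref$A2 & oth$A2 == ref$A1

# Same alleles: keep BETA; flipped: negate; otherwise: NA.
aligned_beta <- ifelse(same, oth$BETA,
                       ifelse(flipped, -oth$BETA, NA_real_))
```

In this toy example `rs3` is dropped as a duplicate, `rs5` is dropped because it is absent from the first input, the effect at `rs2` changes sign, and `rs4` becomes `NA`.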

GWAS-O: standard GWAS of own phenotype ~ own genotype

GWAS-M: offspring phenotype ~ mother's genotype

GWAS-P: offspring phenotype ~ father's genotype

GWAS-MP: offspring phenotype ~ parental genotype, where we pool together mothers and fathers from different families to run the GWAS

The input GWAS summary statistics must contain the following columns with exactly the following column names (they can contain additional columns, but those will not be used):

CHR: chromosome

BP: base-pair coordinate

SNP: variant IDs

A1: effect allele

A2: non-effect allele

BETA: effect size

SE: standard error

P: p-value

They can also contain an "N" column for the sample size at each locus.
If the summary statistics do not contain "N", the sample sizes can be specified by `n1`, `n2`, or `n3` for the 3 inputs, respectively.
Note that if a sample size is specified by `n1`, `n2`, or `n3`, that value will be used even if the input summary statistics contain an "N" column.
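As a rough illustration of the column requirements and the sample-size precedence described above, the following sketch checks a toy input and resolves its per-SNP sample size. The helper `resolve_n()` and the toy values are hypothetical and are not part of the package.

```r
required_cols <- c("CHR", "BP", "SNP", "A1", "A2", "BETA", "SE", "P")

# Toy summary statistics with the required columns plus "N"; values are made up.
ss <- data.frame(CHR = 1L, BP = c(101L, 202L), SNP = c("rs1", "rs2"),
                 A1 = c("A", "C"), A2 = c("G", "T"),
                 BETA = c(0.010, -0.020), SE = c(0.005, 0.004),
                 P = c(0.0455, 5.7e-07), N = c(9800L, 10000L),
                 stringsAsFactors = FALSE)

# Hypothetical helper: an explicit n takes precedence over the "N" column.
resolve_n <- function(ss, n = NULL) {
  missing_cols <- setdiff(required_cols, names(ss))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (!is.null(n)) return(rep(n, nrow(ss)))
  if ("N" %in% names(ss)) return(ss$N)
  stop("No sample size: supply an \"N\" column or an explicit n")
}

resolve_n(ss)             # uses the per-SNP "N" column
resolve_n(ss, n = 12000)  # the explicit value overrides "N"
```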

The default value for `alpha` is 0. You can also specify the spousal correlation at each SNP using a data.frame for `alpha`.
If you want to do so, the data.frame `alpha` must contain 2 columns: "SNP" column for the variant ID and "alpha" column for the spousal correlation.
When `alpha` is specified as a data.frame, only the SNPs that overlap with those in the input summary statistics will be kept in the output.
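A per-SNP `alpha` table might be built as in the sketch below, which also shows the effect of keeping only the overlapping SNPs. The object names and the values are made up for illustration, and the `merge()` call stands in for whatever the function does internally.

```r
# Toy input summary statistics (only the columns needed for this illustration).
ss <- data.frame(SNP = c("rs1", "rs2", "rs3"),
                 BETA = c(0.10, 0.20, 0.30),
                 stringsAsFactors = FALSE)

# Per-SNP spousal correlation: a "SNP" column and an "alpha" column.
alpha_df <- data.frame(SNP = c("rs2", "rs3", "rs9"),
                       alpha = c(0.01, 0.02, 0.00),
                       stringsAsFactors = FALSE)

# An inner join keeps only SNPs present in both tables.
kept <- merge(ss, alpha_df, by = "SNP")
kept$SNP
```

Here `rs1` has no entry in `alpha_df` and `rs9` is absent from the summary statistics, so only `rs2` and `rs3` remain.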

When `mode` == 1, 3 inputs are expected: `ss.own` is GWAS-O, `ss2` is GWAS-M, and `ss3` is GWAS-P.
The returned data.frame will contain the input summary statistics and the direct, indirect, indirect maternal, and indirect paternal effects.
If `OutDir` is not NULL, the function will write summary statistics for the direct effect (`direct_effect.sumstats.gz`), indirect effect (`indirect_effect.sumstats.gz`), indirect maternal effect (`indirect_maternal_effect.sumstats.gz`), indirect paternal effect (`indirect_paternal_effect.sumstats.gz`), and a file containing everything (`all_aligned.sumstats.gz`).

When `mode` == 2, 2 inputs are expected: `ss.own` is GWAS-O, `ss2` is GWAS-MP.
The returned data.frame will contain the input summary statistics and the direct and indirect effects.
If `OutDir` is not NULL, the function will write summary statistics for the direct effect (`direct_effect.sumstats.gz`) and indirect effect (`indirect_effect.sumstats.gz`), and a file containing everything (`all_aligned.sumstats.gz`).

When `mode` == 3, 2 inputs are expected: `ss.own` is GWAS-O, `ss2` is GWAS-M or GWAS-P.
If `ss2` is GWAS-M, you're assuming the indirect paternal effect is 0. If `ss2` is GWAS-P, you're assuming the indirect maternal effect is 0.
If `OutDir` is not NULL, the function will write summary statistics for the direct effect (`direct_effect.sumstats.gz`), indirect effect (`indirect_effect.sumstats.gz`), indirect maternal (if `ss2` is GWAS-M) or indirect paternal (if `ss2` is GWAS-P) effect (`indirect_ss2_effect.sumstats.gz`), and a file containing everything (`all_aligned.sumstats.gz`).
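The three scenarios above can be summarised as a small lookup table, which may help when checking inputs before calling the function. The sketch below is an assumption-laden convenience, not part of the package: `expected_inputs` and `check_mode_inputs()` are hypothetical names.

```r
# Which GWAS each argument should hold, per mode (taken from the text above).
expected_inputs <- list(
  "1" = c(ss.own = "GWAS-O", ss2 = "GWAS-M", ss3 = "GWAS-P"),
  "2" = c(ss.own = "GWAS-O", ss2 = "GWAS-MP"),
  "3" = c(ss.own = "GWAS-O", ss2 = "GWAS-M or GWAS-P")
)

# Hypothetical helper: fail early if mode is invalid or ss3 is missing in mode 1.
check_mode_inputs <- function(mode, ss3 = NULL) {
  key <- as.character(mode)
  if (!key %in% names(expected_inputs)) stop("mode must be 1, 2, or 3")
  needs_ss3 <- "ss3" %in% names(expected_inputs[[key]])
  if (needs_ss3 && is.null(ss3)) stop("mode 1 expects ss3 (GWAS-P)")
  expected_inputs[[key]]
}

check_mode_inputs(2)  # named vector describing ss.own and ss2
```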

Besides preparing the summary statistics, it is highly recommended to first run LDSC on each pair of your inputs and use LDSC's genetic covariance intercept to account for possible sample overlap.

Note that since the direct and indirect effects are linear combinations of the input GWAS, it is critical that all the input GWAS were done on the same phenotype scale.

Returns a data.frame containing both the input and output summary statistics. The basic information about the SNPs is copied from `ss.own`.
The contents of this data.frame will be different depending on the `mode`.

`CHR`: chromosome

`BP`: base-pair coordinate

`SNP`: variant IDs

`A1`: effect allele

`A2`: non-effect allele

`alpha`: Corr(Gm, Gp) at each locus

`beta.{own, ss2, ss3, dir, ind, ind.mat, ind.pat, ind.ss2}`: effect sizes in the input GWAS summary statistics and for the direct and indirect effects.

`se.{own, ss2, ss3, dir, ind, ind.mat, ind.pat, ind.ss2}`: standard errors in the input GWAS summary statistics and for the direct and indirect effects.

`p.{own, ss2, ss3, dir, ind, ind.mat, ind.pat, ind.ss2}`: p-values in the input GWAS summary statistics and for the direct and indirect effects.

`n.{own, ss2, ss3, dir, ind, ind.mat, ind.pat, ind.ss2}`: sample sizes in the input GWAS summary statistics and the effective sample sizes for the direct and indirect effects.
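Because the output columns share a suffix per effect, the columns for one effect can be pulled out by pattern. The sketch below uses a mock data.frame that only imitates the column naming described above. The values are made up, and the exact set of columns returned in a given `mode` is an assumption.

```r
# Mock result imitating the documented column names (values are made up).
res <- data.frame(CHR = 1L, BP = c(101L, 202L), SNP = c("rs1", "rs2"),
                  A1 = c("A", "C"), A2 = c("G", "T"), alpha = 0,
                  beta.own = c(0.020, -0.010),
                  beta.dir = c(0.015, -0.012), se.dir = c(0.006, 0.005),
                  beta.ind = c(0.005, 0.002), se.ind = c(0.007, 0.006),
                  n.dir = c(4000, 4100), n.ind = c(3000, 3050),
                  stringsAsFactors = FALSE)

# Two-sided p-values from the mock effect sizes and standard errors.
res$p.dir <- 2 * pnorm(-abs(res$beta.dir / res$se.dir))
res$p.ind <- 2 * pnorm(-abs(res$beta.ind / res$se.ind))

# Select the SNP information plus every column ending in ".dir".
dir_cols <- grep("\\.dir$", names(res), value = TRUE)
direct   <- res[, c("CHR", "BP", "SNP", "A1", "A2", dir_cols)]
```

The same pattern with `"\\.ind$"` would select the indirect-effect columns without also matching `ind.mat`, `ind.pat`, or `ind.ss2`.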

Tutorials and examples can be found at: https://github.com/qlu-lab/DONUTS

Yuchang Wu (ywu423@wisc.edu), University of Wisconsin-Madison
