
PopulationPathways

The same pathways are present in all humans, but are the genes in these pathways tuned differently depending on population ancestry?

This project develops a computational pipeline that identifies instances of population-driven SNP-SNP coevolution at the pathway level, in an effort to better understand the evolution of human pathways. A range of software packages and statistical methods are used, as outlined below. (NOTE: the pipeline is currently run between the CEU and YRI population cohorts.)

Methods summary

| METHOD | RESULT |
| ------------- | ------------- |
| PLINK | ~1.5 million SNPs genotyped in CEU and YRI (HapMap 3) |
| GSEA | 56 pathways (19 non-redundant) enriched for population-driven positive selection (i.e., high FST genes) |
| Linkage disequilibrium | 4 (CEU) / 10 (YRI) within-pathway and 16 (CEU) / 59 (YRI) between-pathway coevolution signals discovered (FDR ≤ 0.2) |
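As a reminder of what the FST statistic in the table measures, the sketch below computes Wright's FST for a single biallelic SNP from the allele frequencies in two populations, assuming equal population weights. The function name `wright_fst` and the example frequencies are hypothetical; the pipeline's own `calcFST.R` may use a different estimator.

```r
# Wright's FST for one biallelic SNP in two populations.
# Assumption: equal population weights; not necessarily the estimator in calcFST.R.
wright_fst <- function(p1, p2) {
  p_bar <- (p1 + p2) / 2                               # mean allele frequency
  h_t <- 2 * p_bar * (1 - p_bar)                       # expected heterozygosity, pooled
  h_s <- (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-population heterozygosity
  if (h_t == 0) return(NA_real_)                       # monomorphic in both populations: undefined
  (h_t - h_s) / h_t
}

wright_fst(0.5, 0.5)  # identical frequencies, no differentiation
wright_fst(0.9, 0.1)  # strongly differentiated SNP
```

Values near 0 indicate similar allele frequencies in the two populations, while values near 1 indicate near-fixed differences.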

HYPOTHESIS: evolutionary maintenance of pathway-level SNP interactions that influence population fitness

Steps of computational pipeline:

Pipeline scripts found here

# within R environment (calls recodeFAM.R, popPCA.R, calcFST.R, SNP2gene.R, setupGSEArun.R, getPathStats.R, LDstatsWPM.R, and LDstatsBPM.R)
> source("runPipeline.R") # runs entire pipeline

1) PLINK

2) GSEA (Gene-set enrichment analysis)

Input:

Output:

> head(res)
                                                                                   Geneset
1                                                     CELL FATE COMMITMENT%GOBP%GO:0045165
2              SRP-DEPENDENT COTRANSLATIONAL PROTEIN TARGETING TO MEMBRANE%GOBP%GO:0006614
3                 EUKARYOTIC TRANSLATION TERMINATION%REACTOME DATABASE ID RELEASE 56%72764
4                 EUKARYOTIC TRANSLATION ELONGATION%REACTOME DATABASE ID RELEASE 56%156842
5                                           VIRAL MRNA TRANSLATION%REACTOME%R-HSA-192823.1
6 TRANSMEMBRANE RECEPTOR PROTEIN SERINE/THREONINE KINASE SIGNALING PATHWAY%GOBP%GO:0007178
  Size    ES   NES NominalP   FDR   FWER
1  109 0.406 4.667   0.0000 0.009 0.0506
2   83 0.130 4.629   0.0007 0.009 0.0556
3   77 0.114 4.592   0.0025 0.009 0.0623
4   78 0.107 5.024   0.0019 0.010 0.0184
5   75 0.126 4.901   0.0013 0.010 0.0260
6  134 0.372 4.676   0.0000 0.010 0.0492
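The gene-set labels in `res` follow a `NAME%SOURCE%ID` pattern, so they can be split into separate columns before filtering. The sketch below rebuilds two rows of the `head(res)` output by hand to stay self-contained; the 0.05 FDR cutoff and the `enriched` variable are illustrative assumptions, not values stated by the pipeline.

```r
# Two rows copied from the head(res) output above
res <- data.frame(
  Geneset = c("CELL FATE COMMITMENT%GOBP%GO:0045165",
              "VIRAL MRNA TRANSLATION%REACTOME%R-HSA-192823.1"),
  NES = c(4.667, 4.901),
  FDR = c(0.009, 0.010),
  stringsAsFactors = FALSE
)

# Split "NAME%SOURCE%ID" into three columns
parts <- do.call(rbind, strsplit(res$Geneset, "%", fixed = TRUE))
res$Name <- parts[, 1]
res$Source <- parts[, 2]
res$ID <- parts[, 3]

# Keep gene sets at or below a hypothetical FDR cutoff (assumption: 0.05)
fdr_cutoff <- 0.05
enriched <- res[res$FDR <= fdr_cutoff, c("Name", "Source", "ID", "NES", "FDR")]
print(enriched)
```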

Results:

3) Linkage disequilibrium (LD)
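One common way to quantify LD between two SNPs from unphased genotypes is the squared Pearson correlation of their allele dosages (0, 1, or 2 copies), which also works for SNPs on different chromosomes. The sketch below illustrates the idea on made-up genotypes; `ld_r2` and the data are hypothetical, and the pipeline's `LDstatsWPM.R` and `LDstatsBPM.R` may compute LD differently.

```r
# Genotype-correlation r^2 between two SNPs coded as allele dosages (0/1/2).
# Assumption: unphased genotypes; this approximates haplotype-based r^2.
ld_r2 <- function(snp_a, snp_b) {
  keep <- !is.na(snp_a) & !is.na(snp_b)        # drop individuals with a missing call
  if (sum(keep) < 2) return(NA_real_)          # too few individuals to correlate
  if (sd(snp_a[keep]) == 0 || sd(snp_b[keep]) == 0) return(NA_real_)  # monomorphic SNP
  cor(snp_a[keep], snp_b[keep])^2
}

# Made-up dosages for six individuals
snp1 <- c(0, 1, 2, 2, 1, 0)
snp2 <- c(0, 1, 2, 2, 1, 0)   # identical to snp1, so in complete LD
snp3 <- c(2, 0, 1, 0, 2, 1)

ld_r2(snp1, snp2)
ld_r2(snp1, snp3)
```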

Input:

Output:

Results:

SUMMARY

* Proof-of-concept pipeline successfully discovers instances of coevolution among pathways influencing population fitness
* Pathways demonstrate population-driven signals of positive selection (FST) between Europeans and Africans
* LD maintains inter-chromosomal SNP-SNP interactions within and between pathways enriched for positive selection



BaderLab/POPPATHR documentation built on Dec. 17, 2021, 9:53 a.m.