This function runs a bacterial genome-wide association test. When given continuous phenotype data it runs the Continuous Test. When given binary phenotype data the user may run the Synchronous Test, PhyC, or both.
pheno
Matrix. Dimensions: nrow = number of samples, ncol = 1. Either continuous or binary (0/1). row.names() must match tree$tip.label. Required input.
geno
Matrix. Dimensions: nrow = number of samples, ncol = number of genotypes. Binary (0/1). row.names() must match tree$tip.label. Required input.
tree
Phylo object. If unrooted, it will be rooted with phytools::midpoint.root(). Required input.
tree_type
Character. Default = "phylogram". User can supply either "phylogram" or "fan". Determines how the trees are plotted in the output.
file_name
Character. Suffix for output files. Default value: "hogwash".
dir
Character. Path to output directory. Default value is the current directory: ".".
perm
Integer. Number of permutations to run. Default value: 10,000.
fdr
Numeric. False discovery rate. Between 0 and 1. Default value: 0.15.
bootstrap
Numeric. Confidence threshold for tree bootstrap values. Default value: 0.70.
group_genotype_key
Matrix. Dimensions: nrow = number of unique genotypes, ncol = 2. Optional input.
grouping_method
Character. Either "pre-ar" or "post-ar". Default = "post-ar". Determines which grouping method is used if and only if a group_genotype_key is provided; if no key is provided this argument is ignored.
test
Character. Default = "both". User can supply one of three options: "both", "phyc", or "synchronous". Determines which test is run for binary data.
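The shape requirements on pheno and geno can be checked before calling hogwash(). The following is a minimal sketch in base R, not part of the package: the sample names, the column names, and the helper check_inputs() are hypothetical, and tip_labels stands in for tree$tip.label. The documentation does not say whether row order must also match, so this sketch only checks that the same set of names is present.

```r
# Hypothetical sample names; in practice use tree$tip.label.
tip_labels <- c("s1", "s2", "s3", "s4")

# Phenotype: one column, binary (0/1) here, row names = sample names.
pheno <- matrix(c(0, 1, 1, 0), ncol = 1,
                dimnames = list(tip_labels, "resistance"))

# Genotype: one row per sample, one column per genotype, binary (0/1).
# matrix() fills column by column, so snp1 = 0,1,1,0 and snp2 = 1,1,0,0.
geno <- matrix(c(0, 1, 1, 0,
                 1, 1, 0, 0),
               ncol = 2,
               dimnames = list(tip_labels, c("snp1", "snp2")))

# Hypothetical helper: stops with an error if a documented requirement fails.
check_inputs <- function(pheno, geno, tip_labels) {
  stopifnot(
    is.matrix(pheno), ncol(pheno) == 1,
    is.matrix(geno), all(geno %in% c(0, 1)),
    setequal(row.names(pheno), tip_labels),
    setequal(row.names(geno), tip_labels)
  )
  invisible(TRUE)
}

check_inputs(pheno, geno, tip_labels)
```

If any row name is missing from the tip labels, check_inputs() raises an error instead of returning.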
Overview: hogwash reads in one phenotype (either continuous or binary), a matrix of binary genotypes, and a phylogenetic tree. Given these inputs it performs an ancestral reconstruction of the phenotype and of each genotype. The ancestral reconstructions are then used to perform one of several tests that associate the genotypes with the phenotype:
Continuous Test
Synchronous Test
PhyC Test (Farhat et al.)
Once a test finishes running it returns (i) p-values for all genotypes tested and (ii) a Manhattan plot of those p-values. If any of the tested genotypes are significantly associated with the phenotype after FDR correction, it also returns (iii) a list of significant hits and (iv) figures of the genotype and phenotype reconstructions on the tree.
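To illustrate what "significant after FDR correction" means, here is a small sketch using stats::p.adjust() from base R. It assumes a Benjamini-Hochberg correction, which this page does not confirm as the method hogwash uses internally; the p-values and genotype names are toy values chosen for illustration, and 0.15 is the documented default of the fdr argument.

```r
# Toy p-values for five hypothetical genotypes (not real hogwash output).
p_values <- c(snp1 = 0.001, snp2 = 0.02, snp3 = 0.04,
              snp4 = 0.30, snp5 = 0.80)

fdr <- 0.15  # documented default of the fdr argument

# Benjamini-Hochberg adjustment (an assumption about the correction used).
q_values <- p.adjust(p_values, method = "BH")

# Genotypes whose adjusted p-value falls at or below the FDR threshold.
hits <- names(q_values)[q_values <= fdr]
print(hits)
```

Lowering fdr makes the threshold stricter and can only shrink the list of hits.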
Grouping: A feature of hogwash is the ability to organize genotypes into biologically meaningful groups. Testing for an association between an individual SNP and a phenotype is quite stringent, but patterns may emerge when biologically related genotypes are grouped together. For example, grouping all variants (insertions, deletions, and SNPs) within a gene or promoter region could allow the user to identify that gene as being associated with a phenotype even when no individual variant within the gene has high penetrance in the isolates being tested. Grouped genotypes could have increased power to identify convergent evolution because they capture larger trends in functional impact at the group level and reduce the multiple testing correction burden. Possible use cases include grouping SNPs into genes, or grouping kmers or genes into pathways. Each of the three tests can be run on disaggregated data, or on aggregated data when a grouping key is supplied. There are two grouping options: grouping prior to ancestral reconstruction ("pre-ar") or grouping after ancestral reconstruction ("post-ar").
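The idea of collapsing genotypes into groups can be sketched in base R as follows. This is an illustration only, not the package's internal procedure: it assumes the two columns of the key hold the genotype name and its group name (the column names "genotype" and "group" are hypothetical), and it marks a sample as carrying a group if it carries any variant in that group, which roughly corresponds to grouping before ancestral reconstruction.

```r
# Toy genotype matrix: 4 samples x 3 SNPs (filled column by column).
geno <- matrix(c(0, 1, 0, 0,
                 0, 0, 1, 0,
                 1, 0, 0, 0),
               ncol = 3,
               dimnames = list(paste0("s", 1:4),
                               c("snp1", "snp2", "snp3")))

# Toy key: one row per genotype, two columns (assumed: genotype, group).
key <- matrix(c("snp1", "geneA",
                "snp2", "geneA",
                "snp3", "geneB"),
              ncol = 2, byrow = TRUE,
              dimnames = list(NULL, c("genotype", "group")))

# For each group, a sample gets 1 if it has any member variant, else 0.
groups <- unique(key[, "group"])
grouped <- sapply(groups, function(g) {
  members <- key[key[, "group"] == g, "genotype"]
  as.integer(rowSums(geno[, members, drop = FALSE]) > 0)
})
rownames(grouped) <- rownames(geno)

print(grouped)
```

Here three SNP columns become two gene columns, so fewer tests are run and the multiple testing burden drops accordingly.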
Katie Saund
Farhat MR, Shapiro BJ, Kieser KJ, et al. Genomic analysis identifies targets of convergent positive selection in drug-resistant Mycobacterium tuberculosis. Nat Genet. 2013;45(10):1183–1189. doi:10.1038/ng.2747
# Both Synchronous Test & PhyC for discrete phenotype
phenotype <- hogwash::antibiotic_resistance
genotype <- hogwash::snp_genotype
tree <- hogwash::tree
hogwash(pheno = phenotype,
        geno = genotype,
        tree = tree)

# Continuous Test for continuous phenotype
phenotype <- hogwash::growth
genotype <- hogwash::snp_genotype
tree <- hogwash::tree
hogwash(pheno = phenotype,
        geno = genotype,
        tree = tree)

# Continuous Test while grouping SNPs into genes
phenotype <- hogwash::growth
genotype <- hogwash::snp_genotype
tree <- hogwash::tree
key <- hogwash::snp_gene_key
hogwash(pheno = phenotype,
        geno = genotype,
        tree = tree,
        group_genotype_key = key,
        grouping_method = "post-ar")

# Both Synchronous Test & PhyC while grouping SNPs into genes
phenotype <- hogwash::antibiotic_resistance
genotype <- hogwash::snp_genotype
tree <- hogwash::tree
key <- hogwash::snp_gene_key
hogwash(pheno = phenotype,
        geno = genotype,
        tree = tree,
        group_genotype_key = key,
        grouping_method = "post-ar")