In gtonkinhill/fastbaps: A fast genetic clustering algorithm that approximates a Dirichlet Process Mixture model

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "inst/vignette-supp/",
  echo=TRUE, 
  warning=FALSE, 
  message=FALSE,
  tidy=TRUE
)

fastbaps

Installation

fastbaps is currently available on GitHub. It can be installed with devtools:

install.packages("devtools")

devtools::install_github("gtonkinhill/fastbaps")

To also build the vignette as part of the installation, run:

devtools::install_github("gtonkinhill/fastbaps", build_vignettes = TRUE)

Conda

fastbaps can also be installed using Conda:

conda install -c conda-forge -c bioconda -c defaults r-fastbaps

Choice of Prior

fastbaps includes a number of options for the Dirichlet prior hyperparameters. Ordered from most to least conservative, these are symmetric, baps, optimise.symmetric and optimise.baps. The choice of prior is set using the type argument of the optimise_prior function.
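As a rough illustration of how the choice of prior might be compared, the sketch below clusters the example data once per prior and counts the resulting clusters. It reuses only calls that appear elsewhere in this README. The spelling `optimise.symmetric` is taken from the Quick Start call; the other three strings are assumed to be accepted as `type` values in the same way, and `prior.types` and `n.clusters` are names introduced here for illustration.

```r
# Prior names ordered from most to least conservative. Only
# "optimise.symmetric" is confirmed by the Quick Start code; the other
# three spellings are an assumption.
prior.types <- c("symmetric", "baps", "optimise.symmetric", "optimise.baps")

# Only run the comparison when fastbaps is actually installed.
if (requireNamespace("fastbaps", quietly = TRUE)) {
  library(fastbaps)
  fasta.file.name <- system.file("extdata", "seqs.fa", package = "fastbaps")
  sparse.data <- import_fasta_sparse_nt(fasta.file.name)

  # Cluster once per prior and record how many clusters each one yields.
  n.clusters <- sapply(prior.types, function(prior.type) {
    prior.data <- optimise_prior(sparse.data, type = prior.type)
    baps.hc <- fast_baps(prior.data)
    clusters <- best_baps_partition(prior.data, ape::as.phylo(baps.hc))
    length(unique(clusters))
  })
  print(n.clusters)
}
```

Comparing cluster counts this way is only a quick sanity check on how conservative each prior is for a given alignment.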

It is also possible to condition on a pre-existing phylogeny, which allows a user to partition the phylogeny using the fastbaps algorithm. This is described in more detail further down in the introduction.
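A minimal sketch of conditioning on an existing tree might look like the following. It relies on `best_baps_partition` taking a phylo object as its second argument, as in the Quick Start call. The file name `my_tree.nwk` is hypothetical, and the sketch assumes the tree is rooted and that its tip labels match the sequence names in the FASTA file.

```r
# Hypothetical path to a Newick tree built from the same sequences.
newick.file.name <- "my_tree.nwk"

# Only run when fastbaps is installed and the tree file is present.
if (requireNamespace("fastbaps", quietly = TRUE) &&
    file.exists(newick.file.name)) {
  library(fastbaps)
  library(ape)

  fasta.file.name <- system.file("extdata", "seqs.fa", package = "fastbaps")
  sparse.data <- import_fasta_sparse_nt(fasta.file.name)
  sparse.data <- optimise_prior(sparse.data, type = "optimise.symmetric")

  # Read the pre-existing phylogeny (assumed rooted) and partition it.
  tree <- read.tree(newick.file.name)
  clusters <- best_baps_partition(sparse.data, tree)
}
```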

Quick Start

Run fastbaps.

NOTE: You need to replace the variable fasta.file.name with the path to your fasta file. The system.file function is only used in this example vignette.

# devtools::install_github("gtonkinhill/fastbaps")
library(fastbaps)
library(ape)

fasta.file.name <- system.file("extdata", "seqs.fa", package = "fastbaps")
sparse.data <- import_fasta_sparse_nt(fasta.file.name)
sparse.data <- optimise_prior(sparse.data, type = "optimise.symmetric")
baps.hc <- fast_baps(sparse.data)
clusters <- best_baps_partition(sparse.data, as.phylo(baps.hc))

The clustering and partitioning steps can be combined, and the algorithm run over multiple levels, by running:

sparse.data <- optimise_prior(sparse.data, type = "optimise.symmetric")
multi <- multi_res_baps(sparse.data)

Command Line Script

The fastbaps package now includes a command line script. The location of this script can be found by running:

system.file("run_fastbaps", package = "fastbaps")

This script can then be copied to a location on the user's path. If you have installed fastbaps using conda, this will already have been done for you.
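One way to do the copy from within R is sketched below. The destination directory `~/bin` is a hypothetical choice and is assumed to exist and to be on the path already; `script.path` and `bin.dir` are names introduced here for illustration.

```r
# Locate the bundled script; system.file() returns "" if it is not found.
script.path <- system.file("run_fastbaps", package = "fastbaps")

# Hypothetical destination directory, assumed to be on the PATH.
bin.dir <- file.path(Sys.getenv("HOME"), "bin")

if (nzchar(script.path) && dir.exists(bin.dir)) {
  target <- file.path(bin.dir, "run_fastbaps")
  file.copy(script.path, target)
  # Make sure the copied script is executable.
  Sys.chmod(target, mode = "0755")
}
```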

Citation

To cite fastbaps, please use:

Tonkin-Hill,G., Lees,J.A., Bentley,S.D., Frost,S.D.W. and Corander,J. (2019) Fast hierarchical Bayesian analysis of population structure. Nucleic Acids Res., 10.1093/nar/gkz361.

Introduction

intro_rmd <- 'vignettes/introduction.Rmd'

raw_rmd <- readLines(intro_rmd)

# remove yaml 
yaml_lines <- grep("---", raw_rmd)

# remove appendix (session info)
appendix <- grep("Appendix", raw_rmd)

compressed_rmd <- raw_rmd[c(-seq(yaml_lines[1], yaml_lines[2], by = 1), 
                            -seq(appendix, length(raw_rmd)))]
writeLines(compressed_rmd, "child.Rmd")

if (file.exists("child.Rmd")) {
  file.remove("child.Rmd")
}
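The chunk above trims the YAML header and the appendix from the introduction vignette before it is included here. The same idea can be tried on an in-memory vector, as in the sketch below. The sample lines are made up for illustration, and the anchored pattern and the `[1]` guard are small hardening choices added here rather than part of the original chunk.

```r
# A small stand-in for readLines("vignettes/introduction.Rmd").
raw_rmd <- c(
  "---",
  "title: \"Introduction\"",
  "---",
  "",
  "Some vignette text.",
  "",
  "## Appendix",
  "",
  "Session info goes here."
)

# Lines that open and close the YAML header. Anchoring the pattern avoids
# matching "---" that appears inside ordinary text.
yaml_lines <- grep("^---[[:space:]]*$", raw_rmd)

# First line mentioning the appendix; everything from there on is dropped.
appendix <- grep("Appendix", raw_rmd)[1]

# Indices of the YAML header and the appendix, removed in one step.
drop <- c(seq(yaml_lines[1], yaml_lines[2]), seq(appendix, length(raw_rmd)))
compressed_rmd <- raw_rmd[-drop]
print(compressed_rmd)
```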

gtonkinhill/fastbaps documentation built on Sept. 25, 2022, 1:56 p.m.
