
The pubmlst package derives the sequence type and species of Campylobacter jejuni and Campylobacter coli isolates from their multi-locus sequence typing (MLST) allelic profiles, by looking the profiles up in PubMLST data.

The package encapsulates the data from PubMLST and allows downloading of the latest profiles directly within R.

Given a data.frame (or database) containing 7 columns for the 7 housekeeping genes, the package will allow determination of the sequence type, clonal complex, and species.
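
As a rough illustration of the lookup idea (not the package's own code), exact matching of complete allelic profiles can be sketched in base R with merge(). The profiles table below is a made-up stand-in for the PubMLST profile table: its ST, clonal complex, and species values are hypothetical, and the locus column names simply follow the example data frame further down.

```r
# Hypothetical stand-in for a PubMLST profile table (all values are invented).
loci <- c("ASP", "GLN", "GLT", "GLY", "PGM", "TKT", "UNC")
profiles <- data.frame(
  ST  = c(101, 102),
  ASP = c(2, 2), GLN = c(1, 4), GLT = c(54, 1), GLY = c(3, 2),
  PGM = c(4, 2), TKT = c(1, 1), UNC = c(5, 5),
  CC      = c("CC-x", "CC-y"),
  species = c("C. jejuni", "C. coli"),
  stringsAsFactors = FALSE
)

# Isolates with complete allelic profiles; isolate B has no matching profile.
isolates <- data.frame(
  id  = c("A", "B"),
  ASP = c(2, 2), GLN = c(1, 4), GLT = c(54, 1), GLY = c(3, 9),
  PGM = c(4, 2), TKT = c(1, 1), UNC = c(5, 5),
  stringsAsFactors = FALSE
)

# Left join on all seven loci: unmatched isolates get NA for ST, CC, species.
typed <- merge(isolates, profiles, by = loci, all.x = TRUE)
typed <- typed[order(typed$id), c("id", "ST", "CC", "species")]
print(typed)
```

Joining on all seven loci at once means a profile only counts as a match when every allele agrees.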

Where one or more loci have missing alleles, imputation is still possible provided the observed alleles match a unique profile in the PubMLST database.
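
The unique-match rule can be sketched in base R as follows. This is an illustrative assumption about the logic, not the package's implementation: impute_st is a hypothetical helper and the profiles table holds invented values rather than real PubMLST data.

```r
# Hypothetical stand-in for a PubMLST profile table (all values are invented).
loci <- c("ASP", "GLN", "GLT", "GLY", "PGM", "TKT", "UNC")
profiles <- data.frame(
  ST  = c(101, 102, 103),
  ASP = c(2, 2, 2), GLN = c(1, 4, 4), GLT = c(54, 1, 1), GLY = c(3, 2, 2),
  PGM = c(4, 2, 2), TKT = c(1, 1, 1), UNC = c(5, 5, 6)
)

# Hypothetical helper: return the ST only when the observed (non-missing)
# alleles are consistent with exactly one profile, otherwise NA.
impute_st <- function(alleles, profiles, loci) {
  observed <- loci[!is.na(alleles[loci])]
  keep <- rep(TRUE, nrow(profiles))
  for (locus in observed) {
    keep <- keep & profiles[[locus]] == alleles[[locus]]
  }
  if (sum(keep) == 1) profiles$ST[keep] else NA
}

# Isolate a is missing GLY; isolate b is missing UNC.
a <- c(ASP = 2, GLN = 1, GLT = 54, GLY = NA, PGM = 4, TKT = 1, UNC = 5)
b <- c(ASP = 2, GLN = 4, GLT = 1, GLY = 2, PGM = 2, TKT = 1, UNC = NA)

st_a <- impute_st(a, profiles, loci)  # only one profile fits the observed alleles
st_b <- impute_st(b, profiles, loci)  # two profiles fit, so no imputation
print(c(A = st_a, B = st_b))
```

In this toy table, isolate a is resolved because only one profile agrees with its six observed alleles, whereas isolate b stays untyped because two profiles differ only at the missing locus.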


pubmlst is not currently available from CRAN, but you can install it from GitHub with:

# install.packages("devtools")
devtools::install_github("jmarshallnz/pubmlst")


# assemble some data
df <- data.frame(id=c("A", "B"), ASP=c(2,2), GLN=c(1,4), GLT=c(54,1), 
                 GLY=c(NA, 2), PGM=c(4,2), TKT=c(1,1), UNC=c(5,NA))

# impute the MLST sequence type

jmarshallnz/pubmlst documentation built on Nov. 16, 2020, 2:44 a.m.