The dataset of nifH sequences used for: 1) constructing the reference alignment and tree, 2) evaluating PPIT
accuracy, and 3) taxonomic inferencing. Sequences were curated from GenBank using ARBitrator (Heller et al., 2014).
Data frame containing 8876 rows and 26 columns.
Domain of source organism
Phylum of source organism
Class of source organism
Order of source organism
Family of source organism
Genus of source organism
Species of source organism
Strain of source organism
Type/reference strains marked with "X"
Sequences used in percent identity calculation marked with "X"
Sequences used during taxonomic inferencing marked with "X"
Database version in which sequence was added
Source organism of sequence
Tip label on nifH_reference_tree_v2
Nucleotide accession of source scaffold, genome, etc.
Length of source scaffold, genome, etc. (bp)
Date when sequence was deposited into GenBank
Nucleotide sequence of reference nifH
Coding start position in nucleotide accession
Coding stop position in nucleotide accession
Length of nifH reference sequence (bp)
Location of nifH (i.e., chromosome, plasmid, undetermined)
Protein accession number
Sequences used for initial ARBitrator search marked with "X"
Sequences used in MAFFT-DASH seed alignment
Suspected NifH homologs marked with "X"
...
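The "X" marker columns described above lend themselves to simple logical subsetting. The sketch below is only illustrative: it builds a tiny stand-in data frame rather than loading the real dataset, and the column names (genus, type_strain, suspected_homolog, seq_length_bp) and all values are hypothetical placeholders, since the actual column names are not shown here.

```r
# Toy stand-in for the reference data frame.
# Column names and values are hypothetical placeholders, not the real ones.
refs <- data.frame(
  genus             = c("GenusA", "GenusB", "GenusC"),
  type_strain       = c("X", "", "X"),   # "X" marks type/reference strains
  suspected_homolog = c("", "", "X"),    # "X" marks suspected NifH homologs
  seq_length_bp     = c(870L, 880L, 820L),
  stringsAsFactors  = FALSE
)

# Keep type/reference strains that are not flagged as suspected homologs.
keep <- refs$type_strain == "X" & refs$suspected_homolog != "X"
type_refs <- refs[keep, ]

print(type_refs$genus)
```

With the real data frame, the same pattern would apply once the placeholder names are swapped for the actual marker columns.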
BJ Kapili and AE Dekas. PPIT: an R package for inferring microbial taxonomy from nifH sequences. In prep.