View source: R/make.gene.matrix.R
make.gene.matrix | R Documentation |
Creates a matrix that links gene sequences from multiple loci to their associated specimens. Specimens are determined by assigning a voucher to each one, using the parsed metadata for each sequence.
make.gene.matrix(metadata, locusCol = "cleanedGeneRegion", vouchersCol = "newLabels", ncbiCol = "NCBI_accession", orgsCol = "organism", logerrors = TRUE, verbose = FALSE)
metadata |
The output of |
locusCol |
An optional string, the name of the column in the metadata which contains the name of the gene region. |
vouchersCol |
An optional string, the name of the column in the metadata which contains the voucher. |
ncbiCol |
An optional string, the name of the column in the metadata which contains the NCBI accession number. |
orgsCol |
An optional string, the name of the column in the metadata which contains the name of the taxon. |
logerrors |
An optional logical value indicating whether the function should export a CSV file listing the sequences that did not have a voucher. These sequences are automatically excluded from the output matrix. |
verbose |
An optional logical value indicating whether the function should print every row of the metadata it successfully incorporates into the matrix. |
locusCol, vouchersCol, ncbiCol, and orgsCol all have default values that
correspond to the default names of those columns from other functions in the
morton package. They are cleanedGeneRegion, newLabels, NCBI_accession, and
organism, respectively.
The default value for logerrors is TRUE. For verbose, it is FALSE.
verbose can be a useful tool when troubleshooting to pinpoint where the
function has stopped.
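The logerrors behaviour described above can be pictured with a short base R sketch. This is not the package's code: it assumes that "no voucher" means an NA or empty string in the voucher column, the toy accession numbers and voucher labels are invented, and the names no.voucher, log.file and kept are hypothetical.

```r
# Toy metadata using the default column names; all values are invented.
metadata <- data.frame(
  NCBI_accession    = c("AB000001", "AB000002", "AB000003"),
  cleanedGeneRegion = c("ITS", "matK", "ITS"),
  newLabels         = c("voucher_A", NA, ""),
  stringsAsFactors  = FALSE
)

# Assumption: a sequence has no voucher if the voucher column is NA or empty.
no.voucher <- is.na(metadata$newLabels) | metadata$newLabels == ""

# Export the excluded rows to a csv (a temporary file here, so nothing is
# left behind in the working directory).
log.file <- tempfile(fileext = ".csv")
write.csv(metadata[no.voucher, ], log.file, row.names = FALSE)

# Only the rows with a voucher would go on to populate the matrix.
kept <- metadata[!no.voucher, ]
```

The file name and location that make.gene.matrix actually uses for its log are not stated in this page, so the temporary file is only a stand-in.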
This function traverses the metadata data table from parse.INSDSeq and
generates a matrix where unique vouchers are the rows and gene loci are the
columns. Each cell represents a sequence, and its row and column position in
the matrix indicate which voucher and gene locus it belongs to. Cells therefore
contain the NCBI accession number of the sequence they are associated with. If
there are multiple sequences for a single voucher and gene locus, all of their
NCBI accession numbers are entered into the cell, delimited by a pipe (|).
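The layout described above can be illustrated with a minimal base R sketch. It is not the function's implementation, only one way to build a voucher-by-locus matrix with pipe-delimited cells. The toy metadata is invented, and filling empty cells with NA is an assumption, since this page does not say what an empty cell holds.

```r
# Toy metadata using the default column names; all values are invented.
metadata <- data.frame(
  NCBI_accession    = c("AB000001", "AB000002", "AB000003", "AB000004"),
  cleanedGeneRegion = c("ITS", "ITS", "matK", "ITS"),
  newLabels         = c("voucher_A", "voucher_A", "voucher_A", "voucher_B"),
  stringsAsFactors  = FALSE
)

vouchers <- unique(metadata$newLabels)
loci     <- unique(metadata$cleanedGeneRegion)

# One row per unique voucher, one column per gene locus.
# Assumption: cells with no sequence are left as NA.
gene.matrix <- matrix(NA_character_,
                      nrow = length(vouchers), ncol = length(loci),
                      dimnames = list(vouchers, loci))

for (i in seq_len(nrow(metadata))) {
  v   <- metadata$newLabels[i]
  l   <- metadata$cleanedGeneRegion[i]
  acc <- metadata$NCBI_accession[i]
  # If the cell already holds an accession, append the new one after a pipe.
  gene.matrix[v, l] <- if (is.na(gene.matrix[v, l])) {
    acc
  } else {
    paste(gene.matrix[v, l], acc, sep = "|")
  }
}

print(gene.matrix)
```

In this toy case voucher_A has two ITS sequences, so that cell holds both accession numbers joined by a pipe, while voucher_B has no matK sequence and its cell stays empty.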
A matrix which connects NCBI gene sequences to their associated loci and vouchers.
Andrew Hipp and Kasey Pham
parse.INSDSeq, make.unique.vouchers, make.fasta.files, make.shared.gene.matrix, cbind
manip
methods