find_taxoname: Locate and extract taxonomic names from given input files
In qingyuexu/bioparser: Parser and Crawler for Biodiversity Checklists.


find_taxoname locates and extracts taxonomic names from txt, docx, pdf or html files and reorders each name into a standard order: genus, species, subspecies, author and year, distribution. The function can also write the result to a txt file in which each row is one taxonomic name entry. That txt file can be processed further with the function parse_taxolist, which produces a csv table containing more detailed information.
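To make the standard order concrete, here is a minimal sketch in base R of how one already-cleaned entry could be split into those five fields. It is not bioparser's implementation: `split_taxon_entry` is a hypothetical helper, the entry layout "Genus species [subspecies] Author, Year. Distribution" is an assumed simplification, and "Aus bus" is a placeholder name rather than real checklist data.

```r
# Hypothetical helper, NOT part of bioparser. Assumes a simplified entry layout:
#   "Genus species [subspecies] Author, Year. Distribution"
split_taxon_entry <- function(entry) {
  tokens <- strsplit(trimws(entry), "\\s+")[[1]]
  if (length(tokens) < 2) return(NULL)  # need at least genus and species

  # A third all-lowercase token is treated as the subspecies epithet.
  has_sub <- length(tokens) >= 3 && grepl("^[a-z]+$", tokens[3])
  n_name  <- if (has_sub) 3 else 2
  rest    <- paste(tokens[-seq_len(n_name)], collapse = " ")

  # Everything up to the first four-digit number is taken as author and year;
  # whatever follows is taken as the distribution.
  m <- regexpr("[0-9]{4}", rest)
  if (m == -1) {
    author_year  <- rest
    distribution <- NA_character_
  } else {
    year_end     <- m + attr(m, "match.length") - 1
    author_year  <- substr(rest, 1, year_end)
    remainder    <- substr(rest, year_end + 1, nchar(rest))
    distribution <- trimws(sub("^[.,;:\\s]+", "", remainder, perl = TRUE))
  }

  data.frame(genus = tokens[1],
             species = tokens[2],
             subspecies = if (has_sub) tokens[3] else NA_character_,
             author_year = author_year,
             distribution = distribution,
             stringsAsFactors = FALSE)
}

# Placeholder entries, with and without a subspecies epithet.
print(split_taxon_entry("Aus bus cus Smith, 1900. China, Japan"))
print(split_taxon_entry("Aus bus Smith, 1900. China"))
```

Real checklists are far less regular (lowercase author particles, parenthesised authors, multi-line entries), so this only illustrates the target field order, not the extraction logic.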

find_taxoname(filepath, filename, type, encoding = "unknown", output_name = "FALSE")

`filepath`	Required. The path of the directory that contains the input file. If it is not an absolute path, it is resolved relative to the current working directory.
`filename`	Required. The name of the input file.
`type`	Required. The format of the input file. Currently accepts 'txt', 'docx', and 'pdf'.
`encoding`	Optional. The character encoding of the input file. The default value is 'unknown'.
`output_name`	Required. The path and name of the output file. If it is not an absolute path, it is resolved relative to the current working directory. Note that the usage signature nevertheless lists a default of "FALSE".
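As a quick illustration of how the `filepath`, `filename` and `type` arguments relate, the sketch below uses only base R and the bundled tools package. Joining the two path pieces with `file.path` is an assumption about how the function locates its input, not something the documentation states, and the variable names are illustrative.

```r
# Illustrative only: how the path arguments could fit together.
filepath <- "./Examples/input_data"
filename <- "taxo01.txt"

# Assumed combination of directory and file name.
input_file <- file.path(filepath, filename)

# A relative path is resolved against the current working directory.
resolved <- normalizePath(input_file, mustWork = FALSE)

# The file extension gives a reasonable value for the `type` argument.
file_type <- tools::file_ext(filename)
supported <- c("txt", "docx", "pdf")
if (!file_type %in% supported) {
  stop("Unsupported input type: ", file_type)
}

print(input_file)
print(file_type)
```

Checking `file.exists(input_file)` before calling the function is a cheap way to catch a wrong working directory early.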

A data frame containing the taxonomic names found in the input file, reorganized into the standard format.

A TXT file written from that data frame, in which each line contains one taxonomic name entry.
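Because the TXT output holds one entry per line, it can be read back with base R. The sketch below writes a small stand-in file to a temporary location so that it runs on its own; the entries are placeholders, and the exact name of the file that find_taxoname writes (for example whether an extension is appended to `output_name`) is not stated here and is not assumed.

```r
# Stand-in for an output file; placeholder entries, not real checklist data.
out_file <- tempfile(fileext = ".txt")
writeLines(c("Aus bus Smith, 1900. China",
             "",
             "Aus cus Jones, 1950. Japan"),
           out_file)

# Read one entry per line and drop any blank lines.
entries <- readLines(out_file, encoding = "UTF-8")
entries <- entries[nzchar(trimws(entries))]

print(length(entries))
print(entries)
```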

df <- find_taxoname(filepath = "./Examples/input_data",
                    filename = "taxo01.txt",
                    type = "txt",
                    output_name = "./Examples/output_data/taxo01_output")