There are thousands of administrative documents about the U.S. Endangered Species Act (ESA) available on the internet, thousands that are not yet publicly available, and thousands more produced each year. We have gathered PDFs of the publicly available documents and added copies of documents we acquired through other means (e.g., Freedom of Information Act [FOIA] requests) to form a base collection. The plain text of each document was either extracted directly or recovered with Optical Character Recognition (OCR). All of the text is loaded into an Elasticsearch database, and we developed a web app to facilitate searching these documents.
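The extract-or-OCR step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: `text_for_pdf`, `extract_embedded`, `run_ocr`, and `min_chars` are hypothetical names, and the two callables stand in for whatever PDF text extractor and OCR engine are actually used.

```python
from typing import Callable, Tuple


def text_for_pdf(
    pdf_path: str,
    extract_embedded: Callable[[str], str],  # hypothetical: reads the PDF's text layer
    run_ocr: Callable[[str], str],           # hypothetical: OCRs the page images
    min_chars: int = 1,                      # assumed threshold for "has usable text"
) -> Tuple[str, str]:
    """Return (text, method) for one PDF, preferring the embedded text layer."""
    text = extract_embedded(pdf_path)
    # Use the embedded text when it contains enough non-whitespace characters.
    if len(text.strip()) >= min_chars:
        return text, "extracted"
    # Otherwise fall back to OCR, e.g. for scanned documents.
    return run_ocr(pdf_path), "ocr"
```

Passing the extractor and OCR engine in as functions keeps the sketch independent of any particular library; recording the method alongside the text would make it easier to flag OCR-derived documents, which are the ones most likely to contain errors.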
Each document in the Elasticsearch database includes the following fields:
index: esadocs
type: type of document, including
    five_year_review
    federal_register
    recovery_plan
    section_7a1
    section_7a2
    section_10a1A
    CCA
    CCAA
    HCP
    SHA
    misc
raw_txt: the raw text of the document, for index and search
txt: the path to the text file
pdf: the path to the pdf
basename: the base name of the pdf and txt files for joining
file_name: the text (<a> tag) from the ECOS link of the document, or another name to ID the document
orig_link: the original URL (href) from ECOS
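To make the field list concrete, here is a minimal Python sketch that assembles one record with these fields and builds a search request body in the Elasticsearch query DSL. It is an illustration under stated assumptions, not the project's code: `make_record` and `match_query` are hypothetical helpers, the example path and URL are invented, and `type` is treated as an ordinary exact-match field, which may differ from how the real index is mapped.

```python
import json
import os

# Document types named in the field list above.
DOC_TYPES = {
    "five_year_review", "federal_register", "recovery_plan",
    "section_7a1", "section_7a2", "section_10a1A",
    "CCA", "CCAA", "HCP", "SHA", "misc",
}


def make_record(pdf_path, txt_path, raw_txt, doc_type, file_name, orig_link):
    """Assemble one document record (hypothetical helper)."""
    if doc_type not in DOC_TYPES:
        raise ValueError(f"unknown document type: {doc_type}")
    # basename is the shared stem of the pdf and txt files, used for joining.
    basename = os.path.splitext(os.path.basename(pdf_path))[0]
    return {
        "index": "esadocs",
        "type": doc_type,
        "raw_txt": raw_txt,
        "txt": txt_path,
        "pdf": pdf_path,
        "basename": basename,
        "file_name": file_name,
        "orig_link": orig_link,
    }


def match_query(terms, doc_type=None, size=10):
    """Build a request body that full-text searches raw_txt (hypothetical helper)."""
    query = {"bool": {"must": [{"match": {"raw_txt": terms}}]}}
    if doc_type is not None:
        # Assumes `type` is indexed as an exact-match (keyword-like) field.
        query["bool"]["filter"] = [{"term": {"type": doc_type}}]
    return {"size": size, "query": query}


# Invented example values, for illustration only.
record = make_record(
    "docs/recovery_plan/example_plan.pdf",
    "docs/recovery_plan/example_plan.txt",
    "Full text of the plan...",
    "recovery_plan",
    "Example Recovery Plan",
    "https://example.org/example_plan.pdf",
)
print(json.dumps(match_query("critical habitat", "recovery_plan"), indent=2))
```

The request body could then be sent to the search endpoint of the `esadocs` index with any HTTP or Elasticsearch client; restricting by `type` is optional and simply narrows the full-text match on `raw_txt`.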
Did you find an error while searching for a document? That's entirely possible, especially for documents whose text was extracted by OCR from a PDF with low-resolution pages. If you have a correct version, whether because you have the original, entered the text manually, or obtained it by other means, please get in touch. In the future, we plan to offer a more automated approach to error correction, e.g., texts in a git repo that can be forked and corrected through pull requests. For now, we will make corrections manually.
Do you have or know of ESA-related documents that could be added to our database? Please get in touch to discuss how we can work together to make as much information as possible publicly available.